Salesforce 101: How To Clean URLs in Salesforce

Home     /     Blog     /     Salesforce 101: How To Clean URLs in Salesforce

Overview

One common complaint we get from customers is how inconsistent websites and URLs are stored in the system.

Often, reps will include www or https or fail to clean the url params when entering data. Messy data makes it more difficult to identify duplicates, makes matching leads to accounts more difficult and generally makes data governance a pain.

One of the first steps we take for all customers is to add a simple domain parsing field in Salesforce on Accounts which helps solve all of the aforementioned problems.

What makes up a URL?

URLs are made of of a Protocol, Domain, Path, Query String/Parameters and Fragment.

The unique element of a URL is the Domain; this is the critical component to pull out of a given URL that represents a unique entity because there is only one kicksaw.com or salesforce.com.

Url_Structure.jpg

Why Parse Domain From a Website in Salesforce?

URLs are often copy/pasted from the web as reps are doing research and three identical accounts will often have three completely different URLs

  • https://www.kicksaw.com
  • https://kicksaw.com
  • https://kicksaw.com/?info=123

If you had three separate Kicksaw accounts with these domains there would be no simple way of identifying them as duplicates. We've seen this problem crop up in dozens of SFDC instances and because of that, we've developed a URL parser that stores the root domain of a given record.

We've found that it makes data cleanup simples and improves data quality.

Why Parse Domain From an Email in Salesforce?

Luckily, emails are structured simply with mailbox name and domain as the two key elements. Every employee at Nike uses nike.com as their email address. This makes it easy to parse the domain from a given email.

Url_Structure_White_BG.jpg

How Do I Parse a Domain From a Website Field?

The formula below looks complex but what it does is relatively simple. We're essentially trying to solve for all of the edge cases in a given URL. The most simple example would be removing http or https from the beginning of the URL. We may need to remove any www as well as any URL parameters or fragments. Salesforce doesn't offer that out of the box so a formula is the simplest way to do it.

SUBSTITUTE(
IF(
FIND("/",
IF(
FIND("www.", Website) > 0,
IF(
FIND("//", Website) > 0,
SUBSTITUTE(
Website,
LEFT(
Website,
FIND("//www.", Website) + 5),
NULL
),
Website
),
IF(
FIND("//", Website) > 0,
SUBSTITUTE(
Website,
LEFT(
Website,
FIND("//", Website) + 1),
NULL),
Website
)
)
) > 0,
LEFT(
IF(
FIND("www.",Website)>0,
IF(
FIND("//",Website)>0,
SUBSTITUTE(
Website,
LEFT(
Website,
FIND("//www.",Website) + 5
),
NULL),
Website
),
IF(
FIND("//",Website)>0,
SUBSTITUTE(
Website,
LEFT(
Website,
FIND("//",Website) + 1
),
NULL
),
Website
)
),
FIND("/",
IF(
FIND("www.",Website)>0,
IF(
FIND("//",Website)>0,
SUBSTITUTE(
Website,
LEFT(
Website,
FIND("//www.", Website) + 5
),
NULL
),
Website),
IF(
FIND("//",Website) > 0,
SUBSTITUTE(
Website,
LEFT(
Website,
FIND("//",Website) + 1
),
NULL
),
Website
)
)
) -1
),
IF(
FIND("www.",Website)>0,
IF(
FIND("//",Website)>0,
SUBSTITUTE(
Website,
LEFT(
Website,
FIND("//www.",Website)+5),
NULL
),
Website
),
IF(
FIND("//",Website)>0,
SUBSTITUTE(
Website,
LEFT(
Website,
FIND(
"//",
Website) + 1
),
NULL
),
Website)
)
),
'www.',
''
)

How Do I Parse a Domain From an Email Field?

Both of the formulas below can be dropped into a formula field in Salesforce to parse the domain from a given email. This is extremely helpful when auto-converting leads to the appropriate account or to match leads against existing contacts.

Option 1:

SUBSTITUTE(
Email,
LEFT(
Email,
FIND(
"@",
Email
)
),
NULL
)

Option 2:

RIGHT(
Email,
LEN(
Email
) -
FIND(
"@",
Email
)
)


Related Articles