
DemandTools Module Detailed Help Single Table Deduplicator
Updated October 25, 2007
OVERVIEW
The Single Table Deduplicator allows extremely flexible viewing of potential duplicates within salesforce.com. This detailed document will walk you through selecting an object, criteria and merge process. Alternatively, CRMfusion is now offering a NO DUPES service that you can take advantage of!
***IMPORTANT NOTES ABOUT MERGING***
General Duping Guidelines:
1. Run dupes in the proper order. Ensures most information retained during merging with the least amount of time and effort expended by the CRM Admin.
Recommended Order: Account; Contact; Lead; Lead to Contact; Lead to Account; Opportunity; Custom Objects
2. Use atleast a 3 pass strategy for identifying dupes
-Start with very rigid criteria
-Loosen criteria with each pass
-STD comes with pre-built scenarios for primary objects - These are just a starting point, NEED REVIEW AND MODIFICATION TO SPECIFIC CUSTOMER NEEDS.
NOTE: Items 3,4, and 5 no longer apply when checking teh "Use Salesforce Merge" for ACCOUNTS, CONTACTS, AND LEADS. This is the new default when creating a scenario from scratch and/or using a pre-built scenario for ACCOUNTS, CONTACTS, AND LEADS. If you are de-duping any other object thes caveats will apply.....Please review carefully!
3. Objects owned by inactive users
In PRE-1.7 versions of DemandTools, Contacts, Opportunities, Leads, and Cases would only be merged if they were owned by an Active user. For example, if an opportunity within an Account that is being merged into a master Account was owned by an Inactive User, the Opportunity would NOT be merged to the master account. You would need to first change the ownership of the object to an active user, and then run the merge.
In this release, we have added the ability to merge objects owned by inactive users, and KEEP THE OWNERSHIP HISTORY. In step#3 Merge Control, you can select the “Keep Ownership” option under “Inactive Ownership Exceptions”. This will re-activate the inactive user just long enough to transfer the object to the master, and then de-activate the user once again. YOU WILL NEED A SPARE SF LICENSE TO USE THIS OPTION. You also have the option to change the ownership of these objects to the Master Owner, and/or the Current User.
Tasks and Events that are owned by inactive users will be moved in the deduplication process.
4. Archived completed activities
Completed Activities will only be moved if they are less than a year old. Salesforce.com will "Archive" activities older than a year and they are not accessible via the API, and hence, the DemandTools.
You can request from salesforce.com that only activities older than 3 years be archived so you will be able to move tasks and events less than 3 years old. You will need to contact salesforce.com technical support to have this done.
UPDATE: All requests for archive date changes are evaluated individually and implementation of the request is not guaranteed. Possible ramifications of changes to the archive period include a possible performance impact on activity related reports and list views for the specific organization. Unarchiving activites will also increase your storage usage.
NEW: Many customers choose to change the archive period temporarily (i.e. 30 days) then run their first round of de-duplication. Going forward running"maintenance" dupes (i.e. weekly, monthly etc.), select the "oldest record" as the master. This assumes that all new duplicates will have been created byrecently added records that would not have any activities greater than one year old.
The process to do this as follows:
- Create a case salesforce.com with a subject of "unarchive activity request"
- In the body of the message specify:
- the justification for this request.
- how many years you'd like your activities to remain active (unarchived.)
- if the previously archived records should be unarchived as well.- The request must be made by a user with the System Administrator profile on the Account.
All requests for archive date changes are evaluated individually and implementation of the request is not guaranteed. Possible ramifications of changes to the archive period include a possible performance impact on activity related reports and list views for the specific organization.
5. In order for objects that have to be cloned vs. reparented to retain the proper CreatedDate and CreatedByID field values, you need to create a case with salesforce.com to allow you to modify system fields.
There is a feature in salesforce.com called "Modifiable System Fields" and is only available when you are creating new records (cannot be used when updating existing records).
The process to do this is as follows:
1. Create a case with Salesforce.com with a subject of "Modifiable System Fields Request"
2. In the body of the message specify the justification for this request. Include as MUCH DETAIL as possible. (i.e. migrating from a legacy system and wanting to keep the integrity of the original data, merging data using DT ).
3. The request must be made by a user with the System Administrator profile on the Account.All requests for are evaluated individually and implementation of the request is not guaranteed.
HOW TO FIND IT?
Access the Single Table Deduplicator (STD) from the DemandTools Today screen using the menu in the left column. Select the Single Table Deduplication option from this menu or from the pull down menus across the top of the interface to be taken to the Single Table Deduplicators first wizard screen.
WHAT DO YOU WANT TO DEDUPLICATE?
There are two options for running the STD:
WHERE TO VIEW THE DUPLICATES AND SELECT MASTER OPTIONS?
The scenarios are listed in the left hand column, descriptions of the scenarios are listed in the middle column and the fields that will be viewable are listed in the right hand column.
Load Scenario |
Loads the scenario and all settings. User must manually select the next screen for each step. |
Run Scenario |
Selection of Run Scenario takes the user to the final Single Table Deduplication screen. Duplicates are displayed, master records are selected. User must execute the merge. |
Delete Scenario |
Deletes the source xml file and removes the file from the interface listing |
Save Scenario |
Used for newly created or modified scenarios. Scenarios are stored in the Demand Tools directory. |
Clear Scenario |
Lets you start over with either selecting another scenario or selecting an object to deduplicate |
COMMON SETTINGS FOR ACCOUNTS, LEADS AND CONTACT OBJECTS:
Cases
Contacts
Leads
Opportunity
Solution
Task
Developing and Executing Dedupe Scenarios
Through the development of a deduplication pass (all the way from the selection of the object, the sub set of objects, match criteria and options, master rule and score delta through the merge options) can now be saved in a new DemandTools object called the deduplication scenario. The dedupe scenario saves all setting except for specific record custom actions such as field edits, manual master vs. slave selections etc.
The scenario is given a name and a self generated description of what the scenario settings are. After the system self creates this file it is end user modifiable to allow for custom notations etc. All fields and value settings are automatically recorded in the description file to save time.
NOTE: If you modify a scenario, to update the text description, you must select REBUILD SCENARIO. This will update all of the DemandTools modifications, you will need to re-enter any notes you had on the old scenario.
Save scenario dialogue box
Scenarios pre built or built by user being displayed in Screen 1Scenarios are saved on screen 1 with the scenario name sorted under the object it belongs too. Scenarios can also be found in the the DemandTools directory in a folder called scenario and under the appropriate object type sub-folder.
Scenarios are xml files that can be sent from one DemandTools user to another as long as the selections of objects and fields are logical for both computers (either the same salesforce.com database or very similar ones).
Please visit our NO DUPES SERVICE that will show you how DemandTools can offer their own custom built scenarios with your company.
Scenarios are controlled by the scenario control buttons on page 1. The scenario load button (or double clicking the scenario name) will load the scenario into the deduplicator. You can also select “Save Scenario” on all of the 3 dupe screens, so the scenario can be saved or updated at any time in the process.
The delete scenario button will delete the source xml file and remove the file from the interface listing.
Section B. Choose the table/object you would like to deduplicate and all associated settings for merging
Screen 1 of Single Table Deduper - Steps #1 - #4

Step #1 - Select the Object (Table) to Deduplicate
| Select the proper salesforce.com database to deduplicate. All tables are available beneath the pull down list and are sorted in alphabetical order. After selecting the object in the pull down menu, press the "Use Object" button to begin. |
Step #2 - Select the fields to show on the Potential Duplicates List
The primary goal of step two is to select the fields to display in the found duplicates grid (Screen Three of the STD). Most users will want to display a combination of fields that would help verify that the records are truly duplicates and fields that display the most important data values. DemandTools shows both the table field name as well as the Field Label from the interface. Select the fields in the order you would like to see them when the potential duplicates list is shown. The Order column will be populated numerically to indicate you selections as you make them.

Select either the default radio button "Use All" (to select all objects to search for potential duplicates) or the radio button "Use Conditions" (develop conditions to narrow the search range).
After each selection of field and value to narrow by the user must press "Add Condition" so that the expression builder will place the expression in the large expression display box.

Developing conditions to narrow the records in the Table that are being searched for potential duplicates.
Due to a salesforce.com API limitation, it is not possible currently to select records in one table based upon another tables conditions. For example you cannot select Contacts associated with Accounts that start with the letter A.
Step #4 - Proceed to Mapping Screen
After selecting the table (object) to deduplicate, the fields to show on the potential duplicates list, and the subset of object to search, select " Screen 2 - Mapping" to proceed.

Step #5 - Select "Add Mapping Condition"
Mapping conditions consist of Fields to Match, Match Category and Mapping Options.
TIP: To remove a mapping condition, highlight the row by clicking on the box to the left of the Match Blank Values Column and hit Delete on your keyboard.
Step #6 - Selecting "Fields to Match "
If you select the MATCH BLANK VALUES checkbox, DemandTools will return records where the values for that particular field are blank in all records that match on the other fields you select for matching.
![]() |
Select the salesforce.com field that will be used to identify duplicates. Multiple selections of course may be necessary in order to properly determine duplicates with confidence. |
Step #7 - Choose a Mapping Technique
Select the 'Match Category" to look through the techniques by category or go directly to the "Match Type" (leaving "Match Category" as All) to see a list of all available match types.
The available list of Match Categories includes the following options:
In this version of the STD the following is a list of all available Matching types (techniques), they can be found under their respective "Match Category".
![]() |
![]() |
Mapping Types Explained in Detail
|
Replace Value Editor |
![]() |
The Nickname list allows the deduplication tool to see Bill, William, Billy etc as potential duplicates of each other. This list is also customizable by the end user for localization or even in theory for non contact substitution on any field by replacing the nickname list with synonyms.
![]() |
With relaxed address match, all of the followingl seen as the deduper in the lowest common denominator of "123 Pavillion":
Apt #4, 123 Pavillion Street ----- 123 Pavillion, Apt 4 ---- 4-123 Pavillion Ave NW
Step #8 - Select Mapping Options
In addition to the already covered mapping types there are also options that can be applied to many of them. The STD provides for three different options to be augmented to the appropriate types of mappings. The available options are:
| Type | Description | Mapping Types |
|---|---|---|
| Phonetics engine capable of analyzing words for how they sound when pronounced. Through a technique of removing vowels and analyzing the remaining consonants the fuzzy engines works very well for matching fields with spelling mistakes. | Cleaned Account Name Exact FirstName |
|
|
The transpositional engine allows for fields to appear to be duplicates even if the have differences in their word order. For example Jones, Smith and Jackson will appear to be a duplicate of Jackson, Smith and Jones. |
Cleaned Account Name Exact FirstName Street |
The alpha cleaner extends some of the capabilities of the account name cleaner to other fields for matching. The alpha cleaner is used when you know you only have ascii (north american) data and you would like to ensure that the only characters that are analyzed are the 26 characters of the english alphabet and the numbers 0-9. Any other character that the field may contain will be ignored and not seen by the deduplication matching algorithms.
|
Cleaned Account Name Exact FirstName Numeric Street Zip 5 and 9 |
Step #9 - Go back to Step 5 and define more Mapping Options until your criteria is complete!
Step #10 - Search for Potential Duplicates
Select the "Screen 3 - Merge Control" Button to search for potential duplicates and move to screen 3 to analyze and eventually process.
![]()

Step #1 - Work in the Grid: View, analyze, edit, process, single record merge




Gray pins
means no record has been selected as master
Green pinsmeans that this record has been selected as the master record
Red pinsrepresent the records that have been selected or determined to be servant records
You can merge duplicates one set at a time by manually right clicking and selecting "Merge Group". The deduplicator will then merge that single duplicate set into a single record.
Or, you can identify all the records to be processed by placing check marks in boxes. Manually pre select all master records and then process all the sets in batch mode. Select the “Merge Checked Objects” button to process the entire loaded set of duplicates, find all checked ones, then determine which have been selected as masters and which as servants.
Step #2 - Choose or design a master rule
The Single Table Deduplicator is packaged at initial installation to have default master rules ready for use for most of the standard salesforce.com objects that are typically chosen for deduplication. These may include rules like oldest or newest record, most recently modified record, owned by, etc. Custom objects and less frequently de-duped objects may require that even the simplest rules such as oldest record and newest record will have to be created by the end user.
When using strict deduplication criteria (looking for records that are nearly for sure duplicates) you may decide that (after a quick scan maybe) that you want to process all the records in the grid view.
In order to process duplicates in this mode you do not need to apply master rules to each duplicate set. Simply select the button "Check All" at the bottom of the grid and all the duplicate sets will be checked. All records are now checked but no master servant selection has been pre-made. Masters will automatically be selected based on the currently selected master rule in the upper right hand of the interface when the end user selects "Merge Checked Objects" and the merge process will be completed.
Currently selected master rule
Any object can have a master rule written for it. When building a master rule you have the option to add unlimited conditions for analyzing either the record (as a whole) or the field values that the record contains.
At the record level, options such as oldest, newest, most recently modified and many other advanced constraints are available. For field values, the STD displays constraints appropriate for the data type of the field being analyzed.
Sample constraints on different field types Rule Dialog Box
After the setup of the initial field and constraint you have the ability to score the positive incident of the constraint with an arbitrary point value. The default rules that ship with the software are based upon a base 100 scoring system but any system can be used. When adding multiple constraints the record is scored in cumulative fashion with each additional positive expression adding to the total score of the records. Additionally negative scores can also be applied to penalize a record.

A complex rule with multiple expressions and a score delta set
The final component to building rules after determining the desired expressions is to decide to assign a score delta. The score delta will be used by the deduper to automatically determine master vs. servant records. The value of the score delta determines how many points away a master record has to score from the next nearest servant record in order to automatically be selected as the master.If the score delta threshold is set and not met:
- The deduper will use the rule as a scoring system only
- It will not be able to select a master
- The scores will be displayed, but the pins will remain grey
If no score delta is set:
- The deduper will score the records
- The record with the highest score will be chosen as the master
- If there is a tie, it will pick the first record in the list.
Methods for Applying Master Rules Against Duplicates (Semi-Automatic Deduplication)
In order to apply rules on a record by record basis the end user can right click in the duplicate key set box. This will display the popup menu with the manual merge options on the left and the available master rules in the right hand column. Simply select the master rule to apply to the duplicate set and the records will be scored according to the rules scoring system.
If a score delta has been set, the master record will be automatically selected (pin turns green) for all the records with the largest cumulative point score that surpass the next nearest servant by at least the delta point value. If no records are determined to be master worthy, then the pins will remain gray (no master selected) but the score of the records will be shown in the interface.
As opposed to right clicking in each duplicate set the interface provides for two more automated processes for apply master rules to larger duplicate sets.
The button "Apply Rule to Checked" will find all records/record sets that have a check mark in the square left box but do not currently have a master vs . servant selection made to them. The records will be scored according to the selected Master Rule and assigned a master vs. servant if the score delta is set and surpassed by any one record. As above, if no records are determined to be master worthy, then the pins will remain gray (no master selected).
The "Apply Rules to All " button which will process all the records in the duplicate grid (regardless of check status) and score and select all records.
Step #3 -Select the merge control options
Update Fields where Master is Empty
Combine Field Values
- Used to select the fields to concatenate
- Available for:
- Multi-select lists
- Text area fields
- Boolean Fields (like Check Boxes -- True always wins)
- Choose Long Text Area Field separator:
- Default is a Carriage Return/Line Feed; you may select a specific character like ; if you like
Salesforce Merge
Merge Objects
Reassign to Master Owner
Choice of which items get ownership changes (ie. Open Activities, Closed Activities, Open Cases, Closed Cases, Open Opportunities, Closed Opportunities, etc.
Options
Non-Master Objects
- Delete - Send the Servant record to the salesforce.com Recycle bin. Can be undeleted with in a 30 day period.
- Prefix - Select which field to prefix and enter the value you would like added
ObjectDefault Prefix Field AccountName ContactLast Name LeadLast Name
Duplicate Campaign Exception
- Use Master
- Use Most Recent
Inactive Ownership Exceptions
- Used to override some limitations of the API in the areas of reassignment of objects owned by inactive users. In some case the API will not allow these types of objects to be reparented. The STD gives you three options.
Reassignment
- Reassign to master object owner and then reparent
- Reassign to current STD user account then reparent
- Keep ownership (requires a spare salesforce.com license) (the deduper will reactivate use, move objects, deactivate user)
Additional log fields
- Logging of additional information in the DemandTools log file
Merging of custom object sub-objects
- Not just referenced ones, ones that truly are beneath the custom object in database hierarchy (we expect very, very rare).
Step #4 - Selecting to process records in batch mode
Once the records you wish to merge are checked, and the master records determined (either by score delta or manual selection) select "Merge Checked Objects" to begin the merge process.

Executing the duplicates batch
Fully Automatic Deduplication
Schedule Automatic Non-Attended Dedupe Scenarios
Instructions for the scheduling of the Single Table Deduplication module are located here: Scheduling Events. The STD is fully capable of working within the Windows scheduler interface and will allow for times execution of scenarios as long as the host cpu has an active internet connection.
Cleaned Account Name: Uses the built in Account Name Cleaning List. To see the Cleaning list tool select the "Edit Account Cleaning List" button. The cleaning list standardizes punctuation, spaces, removes common business prefixes and suffixes. These lists are customizable to your language(s) and/or line of business.
Country Match: The country mapping type is used to standardize field values for the recognized countries of the world. It makes the long name, 2 digit ISO short form, 3 digit ISO short form and the numeric ISO country value all to appear to be matches of each other.
Domain: The domain mapping type is used when mapping web pages and/or email addresses. It allows for the independent analysis of the domain information contained within the URL or the email address. For email addresses it uses any information to the right of the @ sign. For web pages it parses the XXXXX.com portion. This tool allows for easy comparison of web page field vs. web page field or email field vs. email field. It also by nature allows for the comparison of email addresses compared to web pages and vice versa.
Exact : The exact mapping type in the Single Table Deduplication tool is exactly that, a 100% match of every character (assuming no options apply).
FirstName: Uses the built in Nickname List. To see the Nickname tool select the "Edit Nickname List" button at the top of the interface.
The Nickname list allows the deduplication tool to see Bill, William, Billy, etc. as potential duplicates of each other. This list is also customizable by the end user for localization or even in theory for non contact substitution on any field by replacing the nickname list with synonyms.
First XX Letters: Compares only the first XX letters in a field. Text fields are the only applicable field type. The user can select as many letters as they would like to compare.
Numeric: Compares only the numeric values in a field. Other characters that the field contains, such as spaces or punctuation, will be ignored and not seen by the deduper. A field with a value of "Apt # 31" is seen to the deduper as only the numeric characters "31". This is often used with phone number fields, so that (999) 555-1212 will match to 999-555-1212. In this case the deduper will see this as 9995551212.
Relaxed Address Match: Parses the street address to the lowest common denominator. Based on North American standards, it has also proved effective with most country address formats.
With relaxed address match the following addresses are all seen in the lowest common denominator of: 123 Pavillion:
Relaxed NA Phone Match: Removes all non-numeric characters and spaces. If the first is a 1 or 0 removes it. If just 7 digits are left use those seven digits, else just return digits 4 - 10. It will not match the "Phone-word" values and will trim off the "SPOT" in the phone number and only look at the numeric portion.
Salesforce.com ObjectID Match : Used to match any Salesforce.com objects' 15 digit ID to its 18 digit equivalent ID and vice versa.
Street Address Match: The street address match is a slightly more rigid criteria than the relaxed address match tool. It will ignore the differences in street type short forms such as crescent - cres, road - rd, street – st.
Zip 5 and 9 Match: This mapping type will automatically match USPS 5 and 9 digit zip codes together without the need to standardize them first to a common number of digits.
| Type | Description | Mapping Types |
|---|---|---|
| Phonetics engine capable of analyzing words for how they sound when pronounced. Through a technique of removing vowels and analyzing the remaining consonants the fuzzy engines works very well for matching fields with spelling mistakes. | Cleaned Account Name Exact FirstName |
|
|
The transpositional engine allows for fields to appear to be duplicates even if the have differences in their word order. For example Jones, Smith and Jackson will appear to be a duplicate of Jackson, Smith and Jones. |
Cleaned Account Name Exact FirstName Street |
The alpha cleaner extends some of the capabilities of the account name cleaner to other fields for matching. The alpha cleaner is used when you know you only have ascii (North American) data and you would like to ensure that the only characters that are analyzed are the 26 characters of the english alphabet and the numbers 0-9. Any other character that the field may contain will be ignored and not seen by the deduplication matching algorithms.
|
Cleaned Account Name Exact FirstName Numeric Street Zip 5 and 9 |
Field |
Constraint |
Value |
Record |
|
|
Field Name |
|
This will vary based on the field selected. If there is a pick list associated with the field the possible values will be shown in a pop-up menu. |
| Object | Area | Description |
|---|---|---|
| Contacts | SELF SERVICE USERS | When a servant contact is merged into a master contact and the servant contains a self service user the SSU will be inactivated on the servant contact and will be recreated on the master contact. You will lose all login information for the SSU and will have to send the user a new password as this is not moved in the creation of the SSU. Additionally, if the servant was also a "Super User" you will need to manually re-activate this status on the master contact. |
| Contracts | ACTIVE CONTRACTS | The SF.com API does not allow the re-parenting of "Active" contracts. Therefore when duping account records DemandTools CANNOT merge "active" contracts from slave accounts to the selected master account. As a workaround we have added a master rule for "Most Active Contracts", so you can set the master based on active contracts. When you run your Single Table De-dupe on your accounts you will recieve errors if there are still contracts that could not be re-parented. In the log file, it will indicate which slave accounts had active contracts and that it was unable to delete these slave records. You can then select these using SF.com's merge feature and merge them one at a time. |
| Activities | CREATED DATE | In order for tasks and activities older than one year to be moved/merged, the customer must contact salesforce.com to 'open the window' for the API to functionally be able to do this. A 'work' around may be that the customer selects the oldest record as the master. This is only a limitation if you do NOT use the salesforce merge. |
| Contacts | CAMPAIGN MEMBER STATUS DATES | When Deduping contacts and opting to use the "Duplicate Campaign Exception - Use Most Recent, if the most recent resord is on the slave, when it is cloned to the master, the "Last Modified Date" (which is also the status date) will become the date of the merge. If the most recent record is on the master, we do not touch the record, so the original date stays intact. This will happen EVEN If you have "Modify System Fields" turned on by salesforce.com. The only workaround is to create a custom object to track campaign history. There are some 3rd party vendors that have developed software that does this for you. Search the SF AppExchange for a current list. |
| Use Conditions | OWNER ROLE | Using the "Owner Role" in your "Use Conditions" for selecting which objects to check for duplicates will physically convert the Owner Role to the individual user id's. If you update the members of a Role, you should re-do the "use Conditions" (clear and re-add) so the correct user will be inserted. |
: