Salesforce Data Cloud Identity Resolution — Complete Guide 2026 | Module 06
Master the most important and most tested topic in Data Cloud — how match rules, reconciliation rules and rulesets create one Unified Customer Profile from fragmented data
- What Is Identity Resolution and Why It Matters
- How Identity Resolution Works — Step by Step
- Deterministic Matching — Deep Dive
- Probabilistic Matching — Deep Dive
- Match Rules — Configuration and Best Practices
- Reconciliation Rules — Who Wins When Sources Conflict
- Rulesets — Combining Rules in Priority Order
- The Unified Individual — What It Looks Like After IR
- Setting Up Identity Resolution Step by Step
- Real-World Identity Resolution Scenarios
- Troubleshooting Identity Resolution Problems
- Common Identity Resolution Mistakes
- Quick Quiz
- Interview Questions for This Module
Identity Resolution is the process of recognizing that records from different source systems belong to the same real-world customer and merging them into a single Unified Customer Profile. It is arguably the most critical and most technically complex feature in Data Cloud, and it is among the most frequently tested topics in Data Cloud interviews and certification exams.
Without Identity Resolution, a customer who exists in your CRM, Marketing Cloud, e-commerce platform and support system is four separate people inside Data Cloud. Their purchase history, email engagement and support cases are all siloed. Calculated Insights like LTV are computed on fragments. Segments target them multiple times. Agentforce sees incomplete history. The entire promise of Data Cloud — the 360-degree customer view — is impossible without Identity Resolution working correctly.
Identity Resolution does not guess. It applies configurable rules that you define — telling Data Cloud exactly which fields to compare, how strict the matching should be, and which value to keep when two sources disagree. The quality of your Identity Resolution output is entirely determined by the quality of your rules and the quality of your data going in.
Identity Resolution Is Like a Detective Solving an Identity Case
Imagine a detective who receives four separate police reports all describing the same suspect but written by different officers. One report says John Smith, male, 35. Another says J. Smith, brown hair, New York. A third has john.smith@gmail.com and a New York address. The fourth has a phone number and an employer name.
A good detective looks across all four reports and says — the same email, overlapping name and same city. These are all the same person. The detective merges the four reports into one comprehensive profile of the suspect.
That is exactly what Identity Resolution does. It looks across all your DLOs — each one like a different officer's report — finds overlapping signals and merges matching records into one Unified Customer Profile with the complete picture.
What Is Deterministic Matching?
Deterministic matching is an exact field comparison rule that says — if two records have exactly the same value in a specified field, they belong to the same customer. There is no ambiguity, no scoring, no probability threshold. If the values match exactly — they are the same person. If they do not match exactly — they are different people.
Deterministic matching is the gold standard for Identity Resolution. It produces zero false positives — you will never incorrectly merge two different customers using deterministic matching as long as your match fields are truly unique per person. This is why email and phone are the most common deterministic match fields — in theory, each belongs to only one person.
| Common Deterministic Match Fields | Why They Work | Caveat |
|---|---|---|
| Email Address | Globally unique — one person per email | Shared family or company emails cause false merges |
| Phone Number | Unique per person after normalization | Shared landlines, work numbers can cause issues |
| CRM Record ID | System-generated unique identifier | Only works across systems that share the same ID |
| Loyalty Card Number | Issued uniquely per customer | Cards can be transferred between family members |
| National ID / Tax ID | Government-issued unique identifier | Sensitive — must comply with data regulations |
| Passport Number | Unique within the issuing country | Changes when passport is renewed |
How Deterministic Matching Works
CRM Contact record has email: john.smith@gmail.com
Marketing Cloud subscriber has email: john.smith@gmail.com
Identity Resolution compares the normalized email field from both records. The values are identical. The deterministic match rule fires. Both records are flagged as belonging to the same customer and merged into one Unified Individual.
This is why normalizing email to lowercase with Data Transforms before Identity Resolution runs is so critical. If the CRM sends John.Smith@Gmail.com and Marketing Cloud sends john.smith@gmail.com, the deterministic match fails without a LOWER transform, even though these are clearly the same email.
Always run Data Transforms to normalize match fields before Identity Resolution. Email must be lowercased. Phone must have all non-numeric characters stripped. Names should be trimmed of whitespace. Deterministic matching is only as reliable as the consistency of the values being compared — and that consistency comes from transforms, not from Identity Resolution itself.
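The effect of normalization on an exact comparison can be illustrated outside Data Cloud. The following is a minimal Python sketch, not Data Cloud transform syntax; the function names are hypothetical and simply mirror what LOWER, REGEXP_REPLACE and TRIM transforms are described as doing above.

```python
import re


def normalize_email(value: str) -> str:
    """Trim whitespace and lowercase, mirroring a LOWER transform."""
    return value.strip().lower()


def normalize_phone(value: str) -> str:
    """Keep digits only, mirroring a REGEXP_REPLACE transform."""
    return re.sub(r"\D", "", value)


def normalize_name(value: str) -> str:
    """Trim the ends and collapse repeated internal whitespace."""
    return " ".join(value.split())


crm_email = "John.Smith@Gmail.com"
mc_email = "john.smith@gmail.com"

# Exact comparison on raw values fails because of letter case.
raw_match = crm_email == mc_email

# The same comparison succeeds once both sides are normalized.
normalized_match = normalize_email(crm_email) == normalize_email(mc_email)
```

Here `raw_match` is false and `normalized_match` is true, which is the difference between a missed merge and a successful deterministic match.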
What Is Probabilistic Matching?
Probabilistic matching is a statistical approach that computes a match score across multiple fields and merges records if the score exceeds a configured threshold. Instead of requiring an exact field match, it says — these records share enough overlapping signals that they are probably the same person.
Probabilistic matching is essential for anonymous or partially identified customers. A website visitor who has never logged in has no email or phone on their web event records. But they might have a name, a city and a rough geographic location that, combined, suggest they are probably the same person as a known CRM contact. Probabilistic matching catches these cases that deterministic matching misses entirely.
The tradeoff is accuracy. Probabilistic matching can produce false positives — merging records that happen to share signals but are actually different people. Two people named John Smith in New York who both shop online might get incorrectly merged. The threshold setting controls the tradeoff between match rate and false positive rate.
How Probabilistic Matching Works
Each field in the probabilistic rule contributes a weighted score. Exact first name match contributes 20 points. Exact last name match contributes 25 points. Same city contributes 15 points. Same device fingerprint contributes 30 points. If the total score exceeds the configured threshold — say 70 points — the records are merged.
A higher threshold means fewer merges but higher confidence. A lower threshold means more merges but higher false positive risk. The optimal threshold depends on your data quality and business risk tolerance — a financial services company would set a very high threshold to avoid incorrect profile merges that could have compliance implications.
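The weighted scoring described above can be sketched in a few lines of Python. This is an illustration of the scoring model as this guide describes it, using the example weights and the 70-point threshold from the text; it is not Data Cloud's internal implementation, and the field names and sample records are hypothetical.

```python
# Example weights and threshold taken from the paragraph above.
WEIGHTS = {
    "first_name": 20,
    "last_name": 25,
    "city": 15,
    "device_fingerprint": 30,
}
THRESHOLD = 70


def match_score(a: dict, b: dict, weights: dict = WEIGHTS) -> int:
    """Sum the points of every field that is present and equal in both records."""
    score = 0
    for field, points in weights.items():
        left, right = a.get(field), b.get(field)
        if left and right and left == right:
            score += points
    return score


def is_match(a: dict, b: dict, threshold: int = THRESHOLD) -> bool:
    """Merge only when the score exceeds the threshold."""
    return match_score(a, b) > threshold


web_visitor = {"first_name": "john", "last_name": "smith",
               "city": "new york", "device_fingerprint": "abc123"}
crm_no_device = {"first_name": "john", "last_name": "smith",
                 "city": "new york", "device_fingerprint": None}
crm_same_device = {"first_name": "john", "last_name": "smith",
                   "city": "new york", "device_fingerprint": "abc123"}
```

With these weights, name plus city alone scores 60 and does not merge, while adding the shared device fingerprint raises the score to 90 and does. Raising `THRESHOLD` makes the rule more conservative in exactly the way the paragraph above describes.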
Rule 1: Email Exact Match (Deterministic)
Configuration: Field = Contact Point Email DMO → Email field. Match Type = Exact. Normalize = Yes (lowercase automatically applied).
Logic: If two Individual records share exactly the same normalized email address they are merged into one Unified Individual.
When to use: Always — this should be the first rule in every implementation. Email is the most reliable unique identifier for consumer matching.
Caveat: Add a transform to exclude known shared emails (family@, info@, admin@, noreply@) before this rule runs.
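The shared-email exclusion in the caveat above could look roughly like the following Python sketch. The set of local parts comes from the examples in the caveat; the function name and the sample addresses are hypothetical, and a real implementation would live in a Data Transform rather than in Python.

```python
# Local parts that typically indicate a shared or system mailbox
# (taken from the examples in the caveat above).
SHARED_LOCAL_PARTS = {"family", "info", "admin", "noreply"}


def is_shared_mailbox(email: str) -> bool:
    """Return True when the part before '@' is a known shared mailbox name."""
    local_part = email.strip().lower().split("@", 1)[0]
    return local_part in SHARED_LOCAL_PARTS


emails = [
    "john.smith@gmail.com",
    "Info@acme.example",
    "noreply@shop.example",
    "administrator@acme.example",
]

# Only addresses that are not shared mailboxes remain eligible as match keys.
match_candidates = [e for e in emails if not is_shared_mailbox(e)]
```

The comparison is on the whole local part, so `administrator@acme.example` is kept even though it starts with "admin". Whether that is the right call depends on your data, so review the exclusion list against real addresses.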
Rule 2: Phone Exact Match (Deterministic)
Configuration: Field = Contact Point Phone DMO → Phone Number field. Match Type = Exact. Normalize = Yes.
Logic: If two Individual records share exactly the same normalized phone number (digits only) they are merged.
When to use: As second deterministic rule alongside email. Catches customers who have different emails in different systems but same phone.
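Caveat: Requires a REGEXP_REPLACE transform to strip all non-numeric characters before matching. Stripping alone is not enough when country codes are used inconsistently: +91-9876543210 becomes 919876543210 while 9876543210 stays unchanged, so the transform must also remove (or consistently add) the country code for the two to become the same value.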
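To make the country-code issue concrete, here is a small Python sketch of phone normalization. It assumes a single default country code ("91") and a fixed 10-digit national number, both of which are simplifying assumptions for illustration; multi-country data would need a proper phone-parsing approach.

```python
import re


def normalize_phone(raw: str, country_code: str = "91",
                    national_length: int = 10) -> str:
    """Strip non-digits, then drop the assumed country code prefix if present."""
    digits = re.sub(r"\D", "", raw)
    has_prefix = (
        len(digits) == national_length + len(country_code)
        and digits.startswith(country_code)
    )
    if has_prefix:
        digits = digits[len(country_code):]
    return digits


with_code = normalize_phone("+91-9876543210")
without_code = normalize_phone("9876543210")
```

Both calls return `9876543210`, so an exact match on the normalized value succeeds. The length check prevents a 10-digit number that merely starts with 91 from being truncated.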
Rule 3: Name + City + Postal Code (Probabilistic)
Configuration: Multiple fields with weights. First Name (exact) = 25 points. Last Name (exact) = 30 points. City (exact) = 20 points. Postal Code (exact) = 25 points. Threshold = 70 points to merge.
Logic: Records that score 70 or above across name and location signals are probabilistically merged. This catches customers who have no shared email or phone but are clearly the same person from other signals.
When to use: As third rule after both deterministic rules. Only applies to records not already matched by email or phone.
Caveat: Common names in common cities (John Smith in New York) risk false merges. Set high thresholds and add additional signals (postal code, date of birth) to reduce false positive rate.
After Identity Resolution identifies that two records belong to the same customer, it must decide which field values go on the Unified Profile. CRM says the customer's name is John. The app registration says Johnny. Marketing Cloud says Jonathan. All three are the same person. Which name appears on the Unified Profile?
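This is the job of Reconciliation Rules. They define the logic for resolving conflicts between source field values. You configure a reconciliation rule for each field that might have conflicting values across sources. This guide refers to the three options as Most Recent, Most Frequent and Source Priority; in the Data Cloud interface they appear as Last Updated, Most Frequent and Source Sequence.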
| Field | Recommended Reconciliation Rule | Why |
|---|---|---|
| Email Address | Most Recent | Email addresses change — newest is most likely current |
| Phone Number | Most Recent | Phone numbers change — newest is most likely active |
| First Name | Source Priority (CRM first) | CRM is typically the master system for name data |
| Last Name | Source Priority (CRM first) | CRM is typically the master system for name data |
| Mailing Address | Most Recent | Address changes — most recent update is most accurate |
| Marketing Consent | Most Recent | Consent decisions must reflect the latest customer choice |
| Customer Segment / Tier | Source Priority (CRM first) | CRM holds the official tier classification |
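The three reconciliation strategies can be sketched in Python as follows. This is an illustration of the logic only, not how Data Cloud implements it; the record layout, the update dates and the phone values are hypothetical, while the conflicting first names are the ones from the example above.

```python
from collections import Counter
from datetime import date

# Three merged source records with conflicting values (dates are made up).
records = [
    {"source": "CRM", "first_name": "John", "phone": "5550101",
     "updated": date(2025, 1, 10)},
    {"source": "App", "first_name": "Johnny", "phone": "5550199",
     "updated": date(2025, 6, 2)},
    {"source": "MarketingCloud", "first_name": "Jonathan", "phone": "5550101",
     "updated": date(2025, 3, 15)},
]


def most_recent(records, field):
    """Value from the most recently updated record that has the field."""
    candidates = [r for r in records if r.get(field)]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r["updated"])[field]


def most_frequent(records, field):
    """Value that appears in the largest number of records."""
    values = [r[field] for r in records if r.get(field)]
    if not values:
        return None
    return Counter(values).most_common(1)[0][0]


def source_priority(records, field, ranking):
    """Value from the highest-ranked source that has the field."""
    for source in ranking:
        for r in records:
            if r["source"] == source and r.get(field):
                return r[field]
    return None
```

For these records, `most_recent(records, "first_name")` picks Johnny, `source_priority(records, "first_name", ["CRM", "MarketingCloud", "App"])` picks John, and `most_frequent(records, "phone")` picks 5550101. The same conflict resolves differently depending on the rule, which is why the choice per field matters.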
What Is a Ruleset?
A Ruleset is a named collection of Match Rules and Reconciliation Rules that are applied together during one Identity Resolution run. You can have multiple Rulesets in a Data Cloud org — typically one per customer type or data domain.
Within a Ruleset, Match Rules are applied in priority order. If Rule 1 (email match) finds a match, the records are merged and Rule 2 is not applied to them. Rule 2 only applies to records that Rule 1 did not match. This cascading logic allows you to start with the highest confidence rules and fall through to less certain rules only for records that remain unmatched.
Example Ruleset — Consumer Retail Company
- Rule 1 — Email Exact Match (Deterministic): Highest confidence. Matches 65% of records. Zero false positives.
- Rule 2 — Phone Exact Match (Deterministic): Applied only to records not matched by Rule 1. Matches additional 15%. Very low false positives after normalization.
- Rule 3 — Name + City + Postal Code (Probabilistic, threshold 75): Applied to remaining 20% of unmatched records. Catches cross-device anonymous visitors. Some false positive risk for common names.
This layered approach maximizes match rate while controlling false positive risk. The most accurate rules run first on all records. Less accurate rules only attempt to match the remaining unmatched population.
Rules within a Ruleset are applied in cascading order — not all at once. A record matched by Rule 1 is not evaluated again by Rule 2. This is critical for preventing over-matching. If you accidentally set probabilistic Rule 3 as the first rule, it will attempt to match ALL records probabilistically including many that would have been matched more accurately by the deterministic email rule.
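The cascade described in this section can be approximated with a short Python sketch in which each pair of records is tested against the rules in priority order and the first rule that fires is recorded. This is a pairwise simplification of the behavior this guide describes, not Data Cloud's engine; the rule functions and sample records are hypothetical.

```python
from itertools import combinations


def email_rule(a: dict, b: dict) -> bool:
    """Deterministic: both records carry the same non-empty email."""
    return bool(a.get("email")) and a.get("email") == b.get("email")


def phone_rule(a: dict, b: dict) -> bool:
    """Deterministic: both records carry the same non-empty phone."""
    return bool(a.get("phone")) and a.get("phone") == b.get("phone")


# Priority order: the highest-confidence rule comes first.
RULESET = [("email_exact", email_rule), ("phone_exact", phone_rule)]


def first_matching_rule(a: dict, b: dict, ruleset=RULESET):
    """Return the name of the first rule that matches, or None."""
    for name, rule in ruleset:
        if rule(a, b):
            return name  # later rules are not evaluated for this pair
    return None


records = {
    "crm": {"email": "john.smith@gmail.com", "phone": "5550101"},
    "mc": {"email": "john.smith@gmail.com", "phone": None},
    "support": {"email": None, "phone": "5550101"},
    "other": {"email": "jane@example.com", "phone": "5550199"},
}

matches = {}
for (id_a, a), (id_b, b) in combinations(records.items(), 2):
    rule_name = first_matching_rule(a, b)
    if rule_name:
        matches[(id_a, id_b)] = rule_name
```

The CRM and Marketing Cloud records are linked by the email rule, and the phone rule is only reached for the CRM and support pair, where no email is shared.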
Unified Individual DMO
After Identity Resolution runs, each customer cluster produces one Unified Individual DMO record. This is the master customer profile that all downstream Data Cloud features use. It does not replace the source Individual records — those still exist in their original DMOs. The Unified Individual is an additional layer that represents the merged view.
| Aspect | Detail |
|---|---|
| Created By | Identity Resolution automatically after matching and reconciliation |
| Count | One Unified Individual per cluster of matched source records |
| Field Values | Determined by Reconciliation Rules — best value from all sources |
| Source Records | All matching Individual records linked back via relationship |
| Used For | Segments, Calculated Insights, Activation, Agentforce context |
| Updated When | Identity Resolution reruns — typically after each DMO refresh |
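| Unified Individual Count | Should be lower than the raw Individual record count, because duplicates are merged |
| Match Rate KPI | Monitor Unified Individual count vs Individual count. A higher merge rate means more records were consolidated, but spot-check merges to confirm the gain is not driven by false positives |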
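The idea of "one Unified Individual per cluster of matched source records" can be illustrated with a union-find (disjoint set) structure, a standard way to group items connected by pairwise links. This is a conceptual Python sketch and an assumption about how to model clustering, not a description of Data Cloud internals; the record IDs and matched pairs are hypothetical.

```python
def find(parent: dict, x: str) -> str:
    """Follow parent links to the cluster root, compressing the path."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent: dict, a: str, b: str) -> None:
    """Merge the clusters that contain a and b."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a != root_b:
        parent[root_b] = root_a


def cluster(record_ids, matched_pairs):
    """Group record IDs into clusters based on pairwise matches."""
    parent = {rid: rid for rid in record_ids}
    for a, b in matched_pairs:
        union(parent, a, b)
    clusters = {}
    for rid in record_ids:
        clusters.setdefault(find(parent, rid), []).append(rid)
    return list(clusters.values())


record_ids = ["crm", "mc", "app", "support", "other"]
# crm-mc and mc-app matched on email, crm-support matched on phone.
matched_pairs = [("crm", "mc"), ("mc", "app"), ("crm", "support")]
clusters = cluster(record_ids, matched_pairs)
```

Matches are transitive here: the support record shares no email with the app record, yet both end up in the same cluster through the CRM record. Five source records collapse into two clusters, which would correspond to two Unified Individuals.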
📝 Before Identity Resolution — 4 Separate Records
These are four records in four DLOs, with no connections between them:
- CRM Contact: John Smith, john.smith@gmail.com, +1-555-0101, New York.
- Marketing Cloud Subscriber: J Smith, john.smith@gmail.com, marketing opt-in.
- App User: johnsmith_user, john.smith@gmail.com, iOS device.
- Support System: John Smith, 5550101, 3 open cases.
✅ After Identity Resolution — One Unified Individual
- Unified Individual ID: UI-00001
- Name: John Smith (Source Priority, CRM)
- Email: john.smith@gmail.com (matched across the 3 sources that carry an email: CRM, Marketing Cloud and the app)
- Phone: 5550101 (CRM and Support agree once +1-555-0101 is normalized to digits and the country code is removed; Most Frequent)
- Location: New York (from CRM, Source Priority)
- Marketing Consent: Opt-in (from Marketing Cloud, Most Recent)
- Linked source records: 4
- Open Cases: 3
- Purchase History: linked via Sales Order DMO
Verify Data Transforms are complete
Before touching Identity Resolution, confirm that all Data Transforms are configured and running. Email must be lowercased. Phone must be normalized. Test records must be filtered. Identity Resolution quality is directly determined by data quality — this prerequisite step is non-negotiable.
Navigate to Identity Resolution in Data Cloud Setup
From the Data Cloud app, go to Setup → Identity Resolution → New Ruleset. Give the Ruleset a descriptive name — Consumer_B2C_Ruleset or Account_B2B_Ruleset depending on your context.
Add Match Rules in priority order
Add your first Match Rule — Email Exact Match. Set the field as Contact Point Email DMO → normalized email. Set match type as Exact. Save. Add Rule 2 — Phone Exact Match. Repeat for any probabilistic rules needed. The order you add them determines the priority cascade during execution.
Configure Reconciliation Rules for each field
For each field that might have conflicting values, configure a Reconciliation Rule. Set email to Most Recent. Set name fields to Source Priority with CRM ranked highest. Set address to Most Recent. These rules determine what appears on the Unified Individual when sources disagree.
Run Identity Resolution on a sample first
Before running on all data, test on a sample — 10% of your Individual records. Check the output. How many Unified Individuals were created? What is the merge ratio? Are there any obviously incorrect merges? Fix rule issues before running at full scale. Incorrect merges on production data are difficult to reverse.
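One way to pick a repeatable sample outside Data Cloud is to hash a normalized match key and keep a fixed share of the hash buckets. The Python sketch below is an assumption about how you might prepare such a sample, not a Data Cloud feature; hashing on the email (rather than on a record ID) is a deliberate choice so that records sharing an email land in the sample together and can still be merged.

```python
import hashlib


def in_sample(match_key: str, percent: int = 10) -> bool:
    """Deterministically assign a key to the sample based on its hash bucket."""
    normalized = match_key.strip().lower()
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # bucket in the range 0..99
    return bucket < percent


# Hypothetical match keys for illustration.
emails = [f"customer{n}@example.com" for n in range(10_000)]
sample = [e for e in emails if in_sample(e)]
```

Because the assignment depends only on the key, rerunning the selection yields the same sample, and case variants of the same email always fall on the same side. The sample size is approximately, not exactly, 10 percent.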
Run full Identity Resolution
Once the sample validates correctly, run Identity Resolution on the full dataset. Depending on data volume this may take minutes to hours. Monitor the job status in Data Cloud Setup. When complete check the Unified Individual count compared to raw Individual count — the ratio tells you your match rate.
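The ratio mentioned above can be expressed as a consolidation rate: the share of source records that were absorbed into another profile. The Python sketch below shows one common way to compute it; the counts are hypothetical.

```python
def consolidation_rate(source_count: int, unified_count: int) -> float:
    """Share of source records eliminated by merging: 1 - unified / source."""
    if source_count <= 0:
        raise ValueError("source_count must be positive")
    return 1 - unified_count / source_count


# Hypothetical counts: 1,000,000 Individuals resolved into 650,000 profiles.
rate = consolidation_rate(source_count=1_000_000, unified_count=650_000)
```

With these numbers the rate is about 0.35, meaning roughly 35 percent of source records were merged into an existing profile. A rate of 0 means nothing merged; a rate that jumps unexpectedly between runs is worth investigating for false merges.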
Schedule ongoing Identity Resolution runs
Configure Identity Resolution to run automatically after each DMO refresh cycle. New data arriving must be evaluated for matches against existing Unified Profiles. An incremental run evaluates only new or changed records — much faster than a full re-run. Configure this schedule to keep Unified Profiles current.
🛒 Retail — Cross-Channel Customer Recognition
A global retailer had customers who shopped in-store with a loyalty card, online with an email account and via the mobile app with a social login: three different systems with three different identifiers. The IR strategy used loyalty card number as Rule 1 (deterministic, highest confidence because the retailer issued the cards), email as Rule 2 (deterministic, catches online and app users) and name plus postal code as Rule 3 (probabilistic at threshold 80, catches in-store cash buyers who declined a loyalty card). Result: the share of transactions linked to a known customer profile improved from 45% to 87%.
🏢 B2B Company — Contact-to-Account Matching
A B2B SaaS company had the same contact person appearing in CRM as the procurement contact, in Marketing Cloud as a newsletter subscriber and in the support system under a different name variation. The challenge was that B2B contacts often use their work email for marketing but a personal email for support. IR was configured with work email as Rule 1, then email domain plus company name as Rule 2 (deterministic — same email domain and same company = probably same org contact), then name plus employer as Rule 3 (probabilistic). This linked contact records that shared corporate affiliation even when personal details varied.
🏦 Financial Services — High Accuracy Requirement
A bank could not risk false positive merges — incorrectly linking two different customers could have regulatory and legal consequences. Their IR strategy was deliberately conservative. Rule 1: National ID number exact match (deterministic — 100% confidence). Rule 2: Account number exact match (deterministic). No probabilistic rules were configured at all. Match rate was lower — 72% — but accuracy was near perfect. Unmatched records were flagged for manual review by a data quality team rather than using probabilistic matching that could cause false merges.
| Problem | Likely Cause | How to Fix |
|---|---|---|
| Match rate too low — most profiles not merging | Data quality issue — email case mismatch, phone format variation | Check Data Transform normalization — verify LOWER and REGEXP_REPLACE are running correctly |
| False merges — different customers merged incorrectly | Shared email/phone used as match key — or probabilistic threshold too low | Add shared contact exclusion filter in transforms, raise probabilistic threshold, add more distinguishing signals |
| Unified Individual count higher than expected | Rules are too strict — same customer creating multiple profiles | Review rule priority, check if phone normalization is working, add additional match fields |
| IR job failing or timing out | Too much data volume for one run, or DLO not fully refreshed | Run incremental instead of full run, verify DMO refresh completed before IR starts |
| Correct email match not firing | Contact Point Email DMO not mapped or Individual ID not set | Check DMO field mapping — email field must be mapped and Individual ID must be set |
| Name field on Unified Profile showing wrong value | Reconciliation Rule not configured for name field | Add Reconciliation Rule for first name and last name with Source Priority — CRM ranked first |
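For the first row of the table, a quick check on exported email values can confirm whether case or whitespace differences are behind a low match rate. The following Python sketch is a hypothetical diagnostic run on extracted data, not a Data Cloud query; the source names and sample values are illustrative.

```python
from collections import defaultdict


def case_only_mismatches(emails_by_source: dict) -> dict:
    """Find emails whose raw spellings differ but normalize to the same value."""
    variants = defaultdict(set)
    for emails in emails_by_source.values():
        for email in emails:
            variants[email.strip().lower()].add(email)
    # Keep only normalized keys that appear in more than one raw spelling.
    return {key: sorted(raw) for key, raw in variants.items() if len(raw) > 1}


emails_by_source = {
    "CRM": ["John.Smith@Gmail.com", "jane@example.com"],
    "MarketingCloud": ["john.smith@gmail.com", "jane@example.com"],
}
report = case_only_mismatches(emails_by_source)
```

Any entry in `report` is a pair of records that an exact match would miss today and would merge once the LOWER transform is in place. An empty report suggests the low match rate has another cause, such as a missing field mapping.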
Mistake 1: Running Identity Resolution before Data Transforms are in place
The most damaging mistake. Running IR on unnormalized data produces incorrect matches that are difficult to reverse. Email case mismatches cause missed merges. Shared emails cause false merges. Once Unified Profiles are created from bad data, cleaning them requires resetting Identity Resolution and rerunning — which is time-consuming and may require downstream segment and insight rebuilding. Always complete and validate Data Transforms before touching Identity Resolution configuration.
Mistake 2: Using probabilistic rules before deterministic rules
Configuring probabilistic name-plus-city matching as the first rule — before email and phone deterministic rules. This causes probabilistic matching to attempt to merge ALL records including those that have perfectly matchable emails. Records that should be confidently merged by exact email match instead get evaluated probabilistically — with risk of wrong merges and missed correct merges. Always put deterministic rules first in priority order, probabilistic rules last.
Mistake 3: Not configuring Reconciliation Rules for important fields
Running Identity Resolution without setting Reconciliation Rules for key profile fields. When sources conflict and no reconciliation rule is configured, Data Cloud applies a default rule that may not match business requirements. CRM name may lose to an app-entered nickname because no Source Priority rule told Data Cloud that CRM is the master. Always configure explicit Reconciliation Rules for every field that appears in multiple sources.
Mistake 4: Not testing on a sample before full run
Running Identity Resolution on the full dataset on the first attempt without any sample validation. Incorrect rules discovered after a full run on 10 million records require resetting all Unified Individuals and rerunning — a process that takes hours and disrupts active segments and activations. Always run on a 5-10% sample first, review the output manually, validate merge decisions are correct, then run at full scale.
Mistake 5: Ignoring the Individual ID consistency requirement across sources
Identity Resolution links records via the Individual ID field in each DMO. If CRM uses Salesforce Contact IDs as Individual IDs and Marketing Cloud uses Subscriber Keys as Individual IDs — these are completely different values. Identity Resolution cannot link them without a match rule that bridges the gap. Either use a common customer identifier across all sources, or configure match rules that explicitly bridge different identifier systems using overlapping contact point data.
Q1: B — Email normalization missing | Q2: C — Probabilistic threshold too low | Q3: C — Source Priority with CRM first | Q4: C — Cascade — no other rules apply | Q5: B — Deterministic only on high-confidence fields
Identity Resolution is the process of recognizing that records from different source systems belong to the same real-world customer and merging them into a single Unified Customer Profile. It is the most important feature because without it every other Data Cloud capability is compromised. A customer who exists in CRM, Marketing Cloud, the website and the support system is four separate people inside Data Cloud without Identity Resolution. Their purchase history is split across profiles. Their LTV calculation is wrong. Segments target them multiple times. Agentforce sees incomplete context. The entire promise of Data Cloud — a complete 360 customer view — is only possible after Identity Resolution correctly merges fragmented records into one complete Unified Individual.
Deterministic matching is exact comparison — if two records share exactly the same value in a specified field like email or phone, they are the same person. It produces zero false positives and should always be configured first. Probabilistic matching computes a score across multiple fields — name, city, postal code, device — and merges records exceeding a threshold. It catches customers who have no shared unique identifiers but are clearly the same person from other signals. I always use deterministic rules first in priority order because they are the most accurate. Probabilistic rules are added as secondary fallback layers that only apply to records the deterministic rules could not match. The threshold for probabilistic matching depends on risk tolerance — financial services companies use very high thresholds to prevent false merges that could have compliance implications.
Reconciliation Rules define which field value goes on the Unified Customer Profile when merged source records disagree on a field value. After Identity Resolution merges a CRM contact and an app registration into one Unified Individual, the two source records might have different values for the customer's first name: John in CRM and Johnny in the app. Without a Reconciliation Rule, Data Cloud applies default logic that may not match your business requirements. With a Source Priority reconciliation rule that ranks CRM first, the Unified Profile shows John from CRM. The three reconciliation options are Most Recent, which uses the value from the most recently updated source; Most Frequent, which uses the value appearing in the most source records; and Source Priority, which always uses the value from the highest-ranked source system. Most Recent is best for contact information that changes over time. Source Priority is best for fields where one system is the defined master.
This is almost always a data normalization problem. My diagnosis sequence would be as follows. First I query the Contact Point Email DMO directly and look at the actual email values for records from both sources. If CRM shows John@Gmail.com with capital J and Marketing Cloud shows john@gmail.com in lowercase — that is the root cause. Exact match is case-sensitive, so these two values are not equal. Second I check the Data Transform configuration for both Data Streams — is LOWER applied to email before it reaches the DMO? If no transform exists or LOWER is missing, I add it and rerun the transform. Third I verify the Contact Point Email DMO has Individual ID mapped for both sources — without Individual ID, email records in the DMO are orphaned and Identity Resolution cannot evaluate them. Fourth I rerun Identity Resolution after fixing the transform and check whether the match rate improves. If the issue was only normalization, match rate should increase significantly on the next run.
I would design a layered Identity Resolution strategy with three rules in priority order. Before any IR configuration I would ensure Data Transforms normalize all match fields across all four source DLOs: LOWER for email, REGEXP_REPLACE for phone and TRIM for names. Rule 1 would be deterministic email matching using the Contact Point Email DMO. Email is the highest-confidence unique identifier, and I would expect this rule to match 60 to 70 percent of the 20 million profiles since most customers use the same email across channels. I would add a transform to exclude known shared emails, such as family or corporate distribution addresses, before this rule runs. Rule 2 would be deterministic phone matching using the Contact Point Phone DMO after normalization. This catches customers who registered with different email addresses in different systems but the same phone, perhaps a work email in CRM and a personal email in the app. Rule 3 would be probabilistic matching using first name, last name and postal code with a threshold of 75 points. This catches anonymous app users and website visitors who have never provided an email but whose name and location match a known CRM contact. I would test this strategy on a 10 percent sample first, review the output for false positives (particularly from Rule 3), adjust the probabilistic threshold if needed, then run at full scale. Finally, I would schedule daily incremental runs after DMO refreshes to keep Unified Profiles current.