Salesforce Data Cloud Identity Resolution — Complete Guide 2026 | Module 06

Master the most important and most tested topic in Data Cloud — how match rules, reconciliation rules and rulesets create one Unified Customer Profile from fragmented data

📅 Updated May 2026 ⏲ 22 min read 🎓 Beginner to Advanced 🆕 Module 6 of 15

📚 What You Will Learn in This Module

What Is Identity Resolution and Why It Matters
How Identity Resolution Works — Step by Step
Deterministic Matching — Deep Dive
Probabilistic Matching — Deep Dive
Match Rules — Configuration and Best Practices
Reconciliation Rules — Who Wins When Sources Conflict
Rulesets — Combining Rules in Priority Order
The Unified Individual — What It Looks Like After IR
Setting Up Identity Resolution Step by Step
Real-World Identity Resolution Scenarios
Troubleshooting Identity Resolution Problems
Common Identity Resolution Mistakes
Quick Quiz
Interview Questions for This Module

📍 What Is Identity Resolution and Why It Matters

The process that turns fragmented data into a true customer 360

Identity Resolution is the process of recognizing that records from different source systems belong to the same real-world customer and merging them into a single Unified Customer Profile. It is the most critical and most technically complex feature in Data Cloud — and it is the most tested topic in every Data Cloud interview and certification.

Without Identity Resolution, a customer who exists in your CRM, Marketing Cloud, e-commerce platform and support system is four separate people inside Data Cloud. Their purchase history, email engagement and support cases are all siloed. Calculated Insights like LTV are computed on fragments. Segments target them multiple times. Agentforce sees incomplete history. The entire promise of Data Cloud — the 360-degree customer view — is impossible without Identity Resolution working correctly.

Identity Resolution does not guess. It applies configurable rules that you define — telling Data Cloud exactly which fields to compare, how strict the matching should be, and which value to keep when two sources disagree. The quality of your Identity Resolution output is entirely determined by the quality of your rules and the quality of your data going in.

💡 Real World Analogy

Identity Resolution Is Like a Detective Solving an Identity Case

Imagine a detective who receives four separate police reports all describing the same suspect but written by different officers. One report says John Smith, male, 35. Another says J. Smith, brown hair, New York. A third has john.smith@gmail.com and a New York address. The fourth has a phone number and an employer name.

A good detective looks across all four reports and says — the same email, overlapping name and same city. These are all the same person. The detective merges the four reports into one comprehensive profile of the suspect.

That is exactly what Identity Resolution does. It looks across the source records mapped into your DMOs, each source like a different officer's report, finds overlapping signals and merges matching records into one Unified Customer Profile with the complete picture.

📍 How Identity Resolution Works — Step by Step

The complete process from raw DMO records to Unified Individual

👥 Identity Resolution — Complete Process Flow

READ Individual DMO Records

Data Cloud reads all Individual records across every mapped DMO — CRM contacts, Marketing Cloud subscribers, app users, website visitors. These are the raw inputs to Identity Resolution.

↓

APPLY Match Rules

For each pair of records, apply configured Match Rules. Deterministic rules check for exact field matches — same email = same person. Probabilistic rules compute a match score across multiple fields and merge if score exceeds the threshold.

↓

GROUP Matching Records

All records that matched each other, directly or transitively, are grouped together. If a CRM Contact matches a Marketing Cloud subscriber, who in turn matches an App User, all three are grouped as one customer cluster.

↓

APPLY Reconciliation Rules

When sources disagree on a field value — CRM says John, app says Johnny — Reconciliation Rules decide which value goes on the Unified Profile. Most Recent, Most Frequent or Source Priority determines the winner.

↓

CREATE Unified Individual

One Unified Individual DMO record is created per customer cluster — combining the best attribute values from all matching source records. This is the Unified Customer Profile used by segments, insights and Agentforce.

↓

LINK Source Records to Unified Profile

All source Individual records that were merged maintain their original data but are linked to the Unified Individual via a relationship. Source data is preserved — only the Unified Profile is used for downstream features.
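The GROUP step above is the least intuitive one, because records can end up in the same cluster without ever matching each other directly. A minimal sketch of transitive grouping, using a union-find structure, is shown here. The record IDs and the group_matches helper are hypothetical, and this illustrates the idea only; it is not how Data Cloud implements grouping internally.

```python
from collections import defaultdict


def group_matches(record_ids, matched_pairs):
    """Group records into clusters, following matches transitively (union-find)."""
    parent = {rid: rid for rid in record_ids}

    def find(rid):
        # Walk up to the cluster root, shortening the path as we go.
        while parent[rid] != rid:
            parent[rid] = parent[parent[rid]]
            rid = parent[rid]
        return rid

    for a, b in matched_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(list)
    for rid in record_ids:
        clusters[find(rid)].append(rid)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical IDs: crm:1 never matched app:42 directly, only through mc:7.
records = ["crm:1", "mc:7", "app:42", "crm:2"]
pairs = [("crm:1", "mc:7"), ("mc:7", "app:42")]
clusters = group_matches(records, pairs)
print(clusters)
```

The three linked records land in one cluster and crm:2 stays on its own, which mirrors the CRM Contact, Marketing Cloud subscriber and App User example in the GROUP step.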

📍 Deterministic Matching — Deep Dive

Exact matching — the most accurate and most commonly used approach

What Is Deterministic Matching?

Deterministic matching is an exact field comparison rule that says — if two records have exactly the same value in a specified field, they belong to the same customer. There is no ambiguity, no scoring, no probability threshold. If the values match exactly — they are the same person. If they do not match exactly — they are different people.

Deterministic matching is the gold standard for Identity Resolution. As long as your match fields are truly unique per person, it produces no false positives: two different customers are never merged. This is why email and phone are the most common deterministic match fields. In theory each belongs to only one person, although the caveats in the table below show where that assumption breaks down.

Common Deterministic Match Fields	Why They Work	Caveat
Email Address	Globally unique — one person per email	Shared family or company emails cause false merges
Phone Number	Unique per person after normalization	Shared landlines, work numbers can cause issues
CRM Record ID	System-generated unique identifier	Only works across systems that share the same ID
Loyalty Card Number	Issued uniquely per customer	Cards can be transferred between family members
National ID / Tax ID	Government-issued unique identifier	Sensitive — must comply with data regulations
Passport Number	Globally unique identifier	Changes when passport is renewed

How Deterministic Matching Works

CRM Contact record has email: john.smith@gmail.com

Marketing Cloud subscriber has email: john.smith@gmail.com

Identity Resolution compares the normalized email field from both records. The values are identical. The deterministic match rule fires. Both records are flagged as belonging to the same customer and merged into one Unified Individual.

This is why using Data Transforms to normalize email to lowercase before Identity Resolution runs is so critical. If the CRM sends John.Smith@Gmail.com and Marketing Cloud sends john.smith@gmail.com, the deterministic match fails without a LOWER transform, even though these are clearly the same email.

📍 Critical Best Practice

Always run Data Transforms to normalize match fields before Identity Resolution. Email must be lowercased. Phone must have all non-numeric characters stripped. Names should be trimmed of whitespace. Deterministic matching is only as reliable as the consistency of the values being compared — and that consistency comes from transforms, not from Identity Resolution itself.
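To make the effect of normalization concrete, here is a small Python sketch of the three clean-up steps named above. It is an illustration only: in Data Cloud this logic lives in Data Transforms, and the function names here are hypothetical. Collapsing repeated spaces inside a name goes slightly beyond plain trimming and is an added assumption.

```python
import re


def normalize_email(value):
    """Trim and lowercase, mirroring the LOWER step described above."""
    return value.strip().lower()


def normalize_phone(value):
    """Keep digits only, mirroring the strip-non-numeric step described above."""
    return re.sub(r"[^0-9]", "", value)


def normalize_name(value):
    """Trim and collapse internal runs of whitespace (assumption: extra step)."""
    return " ".join(value.split())


crm_email = "John.Smith@Gmail.com"
mc_email = "john.smith@gmail.com"

# Raw comparison fails; normalized comparison succeeds.
print(crm_email == mc_email)
print(normalize_email(crm_email) == normalize_email(mc_email))
print(normalize_phone("(555) 010-1234"))
print(normalize_name("  John   Smith "))
```

The first comparison is the exact-match failure described in this section, and the second shows the same two values matching once both sides go through the same normalization.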

📍 Probabilistic Matching — Deep Dive

Statistical matching for when exact fields are not available

What Is Probabilistic Matching?

Probabilistic matching is a statistical approach that computes a match score across multiple fields and merges records if the score exceeds a configured threshold. Instead of requiring an exact field match, it says — these records share enough overlapping signals that they are probably the same person.

Probabilistic matching is essential for anonymous or partially identified customers. A website visitor who has never logged in has no email or phone on their web event records. But they might have a name, a city and a rough geographic location that, combined, suggest they are probably the same person as a known CRM contact. Probabilistic matching catches these cases that deterministic matching misses entirely.

The tradeoff is accuracy. Probabilistic matching can produce false positives — merging records that happen to share signals but are actually different people. Two people named John Smith in New York who both shop online might get incorrectly merged. The threshold setting controls the tradeoff between match rate and false positive rate.

✅ Deterministic Matching

Exact field value comparison

100% confidence when correct fields used

Zero false positives from the match itself

Misses customers without shared identifiers

Best for known, logged-in customers

Works on: email, phone, CRM ID, loyalty number

Use this FIRST in every implementation

📊 Probabilistic Matching

Statistical score across multiple fields

Confidence depends on threshold setting

Risk of false positives at lower thresholds

Catches customers without shared identifiers

Best for anonymous or cross-device matching

Works on: name + city + device + behavior

Use this SECOND, as a fallback layer

How Probabilistic Matching Works

Each field in the probabilistic rule contributes a weighted score. Exact first name match contributes 20 points. Exact last name match contributes 25 points. Same city contributes 15 points. Same device fingerprint contributes 30 points. If the total score exceeds the configured threshold — say 70 points — the records are merged.

A higher threshold means fewer merges but higher confidence. A lower threshold means more merges but higher false positive risk. The optimal threshold depends on your data quality and business risk tolerance — a financial services company would set a very high threshold to avoid incorrect profile merges that could have compliance implications.
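The weighted scoring described above can be sketched in a few lines of Python. The weights and the threshold of 70 come from the example in this section; the field names, sample records and helper functions are hypothetical, and this is a toy model of the scoring idea rather than Data Cloud's actual matching engine.

```python
# Weights and threshold taken from the worked example in the text.
WEIGHTS = {"first_name": 20, "last_name": 25, "city": 15, "device_id": 30}
THRESHOLD = 70


def match_score(a, b, weights=WEIGHTS):
    """Sum the weight of every field that is present and equal in both records."""
    return sum(
        points
        for field, points in weights.items()
        if a.get(field) is not None and a.get(field) == b.get(field)
    )


def is_match(a, b, threshold=THRESHOLD):
    # The text says the score must exceed the threshold, so this uses ">".
    return match_score(a, b) > threshold


visitor = {"first_name": "john", "last_name": "smith", "city": "new york", "device_id": "d-91"}
contact = {"first_name": "john", "last_name": "smith", "city": "new york", "device_id": None}
same_device = dict(contact, device_id="d-91")

print(match_score(visitor, contact), is_match(visitor, contact))
print(match_score(visitor, same_device), is_match(visitor, same_device))
```

With these weights, name and city alone total 60 points and stay below the threshold, so two John Smiths in New York are not merged unless a further signal such as the device also agrees. Lowering the threshold to 55 would flip that outcome, which is the tradeoff the paragraph above describes.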

📍 Match Rules — Configuration and Best Practices

Exactly how to configure rules that produce accurate results

📧

Match Rule 1 — Email Address (Deterministic)

Highest accuracy — configure first

Configuration: Field = Contact Point Email DMO → Email field. Match Type = Exact. Normalize = Yes (lowercase automatically applied).

Logic: If two Individual records share exactly the same normalized email address they are merged into one Unified Individual.

When to use: Always — this should be the first rule in every implementation. Email is the most reliable unique identifier for consumer matching.

Caveat: Add a transform to exclude known shared emails (family@, info@, admin@, noreply@) before this rule runs.
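As a rough illustration of the shared-email exclusion in the caveat above, the sketch below filters out addresses whose local part is a known shared mailbox. The list comes from the examples in the caveat; the is_shared_mailbox helper and the sample addresses are hypothetical, and in practice this filter would be expressed in a Data Transform rather than in Python.

```python
# Local parts listed in the caveat above; extend for your own data.
SHARED_LOCAL_PARTS = {"family", "info", "admin", "noreply"}


def is_shared_mailbox(email):
    """True when the part before '@' is a known shared or role address."""
    local_part = email.strip().lower().partition("@")[0]
    return local_part in SHARED_LOCAL_PARTS


emails = ["info@example.com", "john.smith@example.com", "NoReply@example.com"]
eligible = [e for e in emails if not is_shared_mailbox(e)]
print(eligible)
```

Only the personal address remains eligible for the email match rule. The check lowercases first, so NoReply@ is caught as well.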

📞

Match Rule 2 — Phone Number (Deterministic)

High accuracy after normalization

Configuration: Field = Contact Point Phone DMO → Phone Number field. Match Type = Exact. Normalize = Yes.

Logic: If two Individual records share exactly the same normalized phone number (digits only) they are merged.

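When to use: As the second deterministic rule alongside email. Catches customers who have different emails in different systems but the same phone.

Caveat: Requires a REGEXP_REPLACE transform to strip all non-numeric characters before matching, plus consistent handling of the country code. Stripping alone turns +91-9876543210 into 919876543210, which still does not equal 9876543210, so both values must also be brought to the same format (always with the country code or always without it).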
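One detail worth seeing in code: removing non-numeric characters leaves a country code in place, so +91-9876543210 and 9876543210 only become equal if the country code is handled too. The sketch below assumes a single default country code and a fixed national number length, both of which are simplifying assumptions; data spanning several countries would need a proper phone-parsing approach.

```python
import re


def normalize_phone(raw, country_code="91", national_length=10):
    """Digits only, with a leading country code dropped when one is present.

    Assumption: one default country code and a fixed national number length.
    """
    digits = re.sub(r"[^0-9]", "", raw)
    has_country_code = (
        digits.startswith(country_code)
        and len(digits) == len(country_code) + national_length
    )
    return digits[len(country_code):] if has_country_code else digits


print(normalize_phone("+91-9876543210"))
print(normalize_phone("9876543210"))
```

The length check keeps a ten-digit national number that happens to start with 91 from being truncated by mistake.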

👥

Match Rule 3 — Name + City (Probabilistic)

Secondary fallback for anonymous matching

Configuration: Multiple fields with weights. First Name (exact) = 25 points. Last Name (exact) = 30 points. City (exact) = 20 points. Postal Code (exact) = 25 points. Threshold = 70 points to merge.

Logic: Records that score 70 or above across name and location signals are probabilistically merged. This catches customers who have no shared email or phone but are clearly the same person from other signals.

When to use: As the third rule, after both deterministic rules. Only applies to records not already matched by email or phone.

Caveat: Common names in common cities (John Smith in New York) risk false merges. Set high thresholds and add additional signals (postal code, date of birth) to reduce false positive rate.

📍 Reconciliation Rules — Who Wins When Sources Conflict

When two sources disagree on a field value — which one goes on the Unified Profile?

After Identity Resolution identifies that two records belong to the same customer, it must decide which field values go on the Unified Profile. CRM says the customer's name is John. The app registration says Johnny. Marketing Cloud says Jonathan. All three are the same person. Which name appears on the Unified Profile?

This is the job of Reconciliation Rules. They define the logic for resolving conflicts between source field values. You configure a reconciliation rule for each field that might have conflicting values across sources.

🕑

Most Recent

Use the value from the most recently updated source record. Assumes newer data is more accurate.

Best for: Email, phone, address — fields that change over time

📊

Most Frequent

Use the value that appears most often across all matching source records. Assumes the majority is correct.

Best for: Name, city — fields that should be consistent

🏆

Source Priority

Always use the value from the highest-ranked source system. You define the priority order of sources.

Best for: Critical fields where one system is the master — always use CRM name over app name

Field	Recommended Reconciliation Rule	Why
Email Address	Most Recent	Email addresses change — newest is most likely current
Phone Number	Most Recent	Phone numbers change — newest is most likely active
First Name	Source Priority (CRM first)	CRM is typically the master system for name data
Last Name	Source Priority (CRM first)	CRM is typically the master system for name data
Mailing Address	Most Recent	Address changes — most recent update is most accurate
Marketing Consent	Most Recent	Consent decisions must reflect the latest customer choice
Customer Segment / Tier	Source Priority (CRM first)	CRM holds the official tier classification
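The three reconciliation strategies can be expressed as small selection functions. The sketch below uses the John, Johnny and Jonathan example from this section; the source names, update dates and function names are hypothetical, and it models the selection logic only, not the product's configuration screens.

```python
from collections import Counter
from datetime import date


def most_recent(values):
    """values: list of (value, source, updated_on). Pick the latest update."""
    return max(values, key=lambda v: v[2])[0]


def most_frequent(values):
    """Pick the value seen in the most source records (first seen wins ties)."""
    return Counter(v[0] for v in values).most_common(1)[0][0]


def source_priority(values, ranking):
    """Pick the value from the highest-ranked source that supplied one.

    Raises ValueError if a source is missing from the ranking.
    """
    return min(values, key=lambda v: ranking.index(v[1]))[0]


# Hypothetical update dates for the three conflicting first names.
first_names = [
    ("John", "crm", date(2025, 1, 10)),
    ("Johnny", "app", date(2025, 6, 2)),
    ("Jonathan", "marketing_cloud", date(2024, 11, 30)),
]
phones = [
    ("5550101", "crm", date(2025, 1, 10)),
    ("5550101", "support", date(2025, 3, 4)),
    ("5550199", "app", date(2025, 6, 2)),
]

print(most_recent(first_names))
print(source_priority(first_names, ranking=["crm", "marketing_cloud", "app"]))
print(most_frequent(phones))
```

With these dates, Most Recent would put the app nickname on the profile while Source Priority keeps the CRM value, which is why the table recommends Source Priority for name fields.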

📍 Rulesets — Combining Rules in Priority Order

How to organize multiple match rules into a complete identity strategy

What Is a Ruleset?

A Ruleset is a named collection of Match Rules and Reconciliation Rules that are applied together during one Identity Resolution run. You can have multiple Rulesets in a Data Cloud org — typically one per customer type or data domain.

Within a Ruleset, Match Rules are applied in priority order. If Rule 1 (email match) finds a match, the records are merged and Rule 2 is not applied to them. Rule 2 only applies to records that Rule 1 did not match. This cascading logic allows you to start with the highest confidence rules and fall through to less certain rules only for records that remain unmatched.

Example Ruleset — Consumer Retail Company

Rule 1 — Email Exact Match (Deterministic): Highest confidence. Matches 65% of records. Zero false positives.
Rule 2 — Phone Exact Match (Deterministic): Applied only to records not matched by Rule 1. Matches additional 15%. Very low false positives after normalization.
Rule 3 — Name + City + Postal Code (Probabilistic, threshold 75): Applied to remaining 20% of unmatched records. Catches cross-device anonymous visitors. Some false positive risk for common names.

This layered approach maximizes match rate while controlling false positive risk. The most accurate rules run first on all records. Less accurate rules only attempt to match the remaining unmatched population.

⚠️ Important Interview Point

Rules within a Ruleset are applied in cascading order — not all at once. A record matched by Rule 1 is not evaluated again by Rule 2. This is critical for preventing over-matching. If you accidentally set probabilistic Rule 3 as the first rule, it will attempt to match ALL records probabilistically including many that would have been matched more accurately by the deterministic email rule.
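The cascade described above can be modeled as a loop over rules in priority order that skips any record an earlier rule already matched. This is a toy model of the behavior as this module describes it, not a reproduction of Data Cloud's engine. The record layout and run_cascade helper are hypothetical, and letting an unmatched record pair with a record that an earlier rule already matched is an assumption made so that a phone-only record can still join an email-matched cluster.

```python
def run_cascade(records, rules):
    """Return {record_id: rule_name} for the first rule that matched each record.

    rules: ordered list of (name, key_fn); key_fn returns a match key or None.
    """
    matched_by = {}
    for name, key_fn in rules:
        # Index every record by this rule's key (assumption: unmatched records
        # may pair with records that an earlier rule already matched).
        by_key = {}
        for rec in records:
            key = key_fn(rec)
            if key:
                by_key.setdefault(key, []).append(rec["id"])
        for rec in records:
            if rec["id"] in matched_by:
                continue  # an earlier, higher-priority rule already matched it
            key = key_fn(rec)
            if key and len(by_key[key]) > 1:
                matched_by[rec["id"]] = name
    return matched_by


records = [
    {"id": "crm", "email": "john.smith@gmail.com", "phone": "5550101"},
    {"id": "mc", "email": "john.smith@gmail.com", "phone": None},
    {"id": "support", "email": None, "phone": "5550101"},
    {"id": "web", "email": None, "phone": None},
]
rules = [("email", lambda r: r["email"]), ("phone", lambda r: r["phone"])]

result = run_cascade(records, rules)
print(result)
```

The CRM record is credited to the email rule even though its phone also matches, the support record falls through to the phone rule, and the web record stays unmatched.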

📍 The Unified Individual — What It Looks Like After IR

Understanding the output of Identity Resolution

Unified Individual DMO

After Identity Resolution runs, each customer cluster produces one Unified Individual DMO record. This is the master customer profile that all downstream Data Cloud features use. It does not replace the source Individual records — those still exist in their original DMOs. The Unified Individual is an additional layer that represents the merged view.

Aspect	Detail
Created By	Identity Resolution automatically after matching and reconciliation
Count	One Unified Individual per cluster of matched source records
Field Values	Determined by Reconciliation Rules — best value from all sources
Source Records	All matching Individual records linked back via relationship
Used For	Segments, Calculated Insights, Activation, Agentforce context
Updated When	Identity Resolution reruns — typically after each DMO refresh
Individual Count	Should be fewer than raw Individual records — duplicates eliminated
Match Rate KPI	Monitor Unified Individual count vs Individual count. A higher merge rate generally means better resolution, provided spot checks show no false merges
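One simple way to turn the two counts in the table into a single number is a consolidation rate: the share of source Individual records that were absorbed into an existing profile. The formula and function name below are assumptions for illustration (the module only says to compare the counts), and the figures are made up.

```python
def consolidation_rate(source_count, unified_count):
    """Share of source records absorbed into another profile (0.0 to 1.0)."""
    if source_count <= 0:
        raise ValueError("source_count must be positive")
    if not 0 < unified_count <= source_count:
        raise ValueError("unified_count must be between 1 and source_count")
    return 1 - unified_count / source_count


# Hypothetical counts: 1,000,000 source records resolved to 400,000 profiles.
rate = consolidation_rate(source_count=1_000_000, unified_count=400_000)
print(f"{rate:.1%}")
```

A rate of zero means nothing merged at all. A very high rate is not automatically good news either, since over-matching also pushes it up, so it is worth reading this number alongside a manual review of sample clusters.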

🌎 Example Output

Before and After Identity Resolution — A Real Example

📝 Before Identity Resolution — 4 Separate Records

CRM Contact: John Smith, john.smith@gmail.com, +1-555-0101, New York. Marketing Cloud Subscriber: J Smith, john.smith@gmail.com, marketing opt-in. App User: johnsmith_user, john.smith@gmail.com, iOS device. Support System: John Smith, 5550101, 3 open cases. These are four records in four DLOs. No connections between them.

✅ After Identity Resolution — One Unified Individual

Unified Individual ID: UI-00001. Name: John Smith (Source Priority: CRM). Email: john.smith@gmail.com (shared by the CRM, Marketing Cloud and App records). Phone: 5550101 (from CRM and Support, once normalization removes the formatting and country code from +1-555-0101; Most Frequent). Location: New York (from CRM, Source Priority). Marketing Consent: Opt-in (from Marketing Cloud, Most Recent). Linked source records: 4. Open Cases: 3. Purchase History: linked via Sales Order DMO.

📍 Setting Up Identity Resolution Step by Step

The exact process for configuring Identity Resolution in Data Cloud

Verify Data Transforms are complete

Before touching Identity Resolution, confirm that all Data Transforms are configured and running. Email must be lowercased. Phone must be normalized. Test records must be filtered. Identity Resolution quality is directly determined by data quality — this prerequisite step is non-negotiable.

Navigate to Identity Resolution in Data Cloud Setup

From the Data Cloud app, go to Setup → Identity Resolution → New Ruleset. Give the Ruleset a descriptive name — Consumer_B2C_Ruleset or Account_B2B_Ruleset depending on your context.

Add Match Rules in priority order

Add your first Match Rule — Email Exact Match. Set the field as Contact Point Email DMO → normalized email. Set match type as Exact. Save. Add Rule 2 — Phone Exact Match. Repeat for any probabilistic rules needed. The order you add them determines the priority cascade during execution.

Configure Reconciliation Rules for each field

For each field that might have conflicting values, configure a Reconciliation Rule. Set email to Most Recent. Set name fields to Source Priority with CRM ranked highest. Set address to Most Recent. These rules determine what appears on the Unified Individual when sources disagree.

Run Identity Resolution on a sample first

Before running on all data, test on a sample — 10% of your Individual records. Check the output. How many Unified Individuals were created? What is the merge ratio? Are there any obviously incorrect merges? Fix rule issues before running at full scale. Incorrect merges on production data are difficult to reverse.

Run full Identity Resolution

Once the sample validates correctly, run Identity Resolution on the full dataset. Depending on data volume this may take minutes to hours. Monitor the job status in Data Cloud Setup. When complete check the Unified Individual count compared to raw Individual count — the ratio tells you your match rate.

Schedule ongoing Identity Resolution runs

Configure Identity Resolution to run automatically after each DMO refresh cycle. New data arriving must be evaluated for matches against existing Unified Profiles. An incremental run evaluates only new or changed records — much faster than a full re-run. Configure this schedule to keep Unified Profiles current.
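For the sample run in the steps above, one practical question is how to pick the 10%. Sampling records independently tends to separate the very records that should match each other, so the sketch below samples on a normalized match key instead, keeping every record that shares an email together. This is a suggestion under stated assumptions, not a Data Cloud feature: the in_sample helper and the record IDs are hypothetical, and how the sample is then loaded for testing is outside this sketch.

```python
import hashlib


def in_sample(match_key, percent=10):
    """Stable sample: the same key always gets the same answer."""
    digest = hashlib.sha256(match_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100 < percent


# Hypothetical keys; in practice this would be the normalized email.
keys = [f"user{n}@example.com" for n in range(10_000)]
sample = [k for k in keys if in_sample(k)]
print(len(sample))
```

Because the decision depends only on the key, a CRM record and a Marketing Cloud record with the same normalized email are either both in the sample or both out of it, and rerunning the selection yields the same sample.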

📍 Real-World Identity Resolution Scenarios

How different industries configure Identity Resolution for their specific challenges

🌎 Real-World IR Scenarios

Identity Resolution Challenges Across Industries

🛒 Retail — Cross-Channel Customer Recognition

A global retailer had customers who shopped in-store with a loyalty card, online with an email account and via the mobile app with a social login. Three different systems — three different identifiers. The IR strategy used loyalty card number as Rule 1 (deterministic — highest confidence since they issued the cards). Email as Rule 2 (deterministic — catches online and app users). Name plus postal code as Rule 3 (probabilistic at threshold 80 — catches in-store cash buyers who refused loyalty card). Result: match rate improved from 45% to 87% of transactions linked to a known customer profile.

🏢 B2B Company — Contact-to-Account Matching

A B2B SaaS company had the same contact person appearing in CRM as the procurement contact, in Marketing Cloud as a newsletter subscriber and in the support system under a different name variation. The challenge was that B2B contacts often use their work email for marketing but a personal email for support. IR was configured with work email as Rule 1, then email domain plus company name as Rule 2 (deterministic — same email domain and same company = probably same org contact), then name plus employer as Rule 3 (probabilistic). This linked contact records that shared corporate affiliation even when personal details varied.

🏥 Financial Services — High Accuracy Requirement

A bank could not risk false positive merges — incorrectly linking two different customers could have regulatory and legal consequences. Their IR strategy was deliberately conservative. Rule 1: National ID number exact match (deterministic — 100% confidence). Rule 2: Account number exact match (deterministic). No probabilistic rules were configured at all. Match rate was lower — 72% — but accuracy was near perfect. Unmatched records were flagged for manual review by a data quality team rather than using probabilistic matching that could cause false merges.

📍 Troubleshooting Identity Resolution Problems

Diagnosing and fixing the most common IR failures

Problem	Likely Cause	How to Fix
Match rate too low — most profiles not merging	Data quality issue — email case mismatch, phone format variation	Check Data Transform normalization — verify LOWER and REGEXP_REPLACE are running correctly
False merges — different customers merged incorrectly	Shared email/phone used as match key — or probabilistic threshold too low	Add shared contact exclusion filter in transforms, raise probabilistic threshold, add more distinguishing signals
Unified Individual count higher than expected	Rules are too strict — same customer creating multiple profiles	Review rule priority, check if phone normalization is working, add additional match fields
IR job failing or timing out	Too much data volume for one run, or DLO not fully refreshed	Run incremental instead of full run, verify DMO refresh completed before IR starts
Correct email match not firing	Contact Point Email DMO not mapped or Individual ID not set	Check DMO field mapping — email field must be mapped and Individual ID must be set
Name field on Unified Profile showing wrong value	Reconciliation Rule not configured for name field	Add Reconciliation Rule for first name and last name with Source Priority — CRM ranked first
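For the first row of the table, a quick way to confirm a normalization problem is to look for values that are equal only after lowercasing and trimming. The sketch below does that for two lists of emails exported from two sources. The function name and sample data are hypothetical, and it assumes you can pull the raw values out for inspection.

```python
def case_only_mismatches(source_a, source_b):
    """Pairs of emails that differ between two sources only by case or padding."""
    def canonical(value):
        return value.strip().lower()

    b_by_canonical = {}
    for value in source_b:
        b_by_canonical.setdefault(canonical(value), []).append(value)

    mismatches = []
    for value in source_a:
        for other in b_by_canonical.get(canonical(value), []):
            if value != other:
                mismatches.append((value, other))
    return mismatches


crm = ["John.Smith@Gmail.com", "amy@example.com"]
marketing_cloud = ["john.smith@gmail.com", "amy@example.com", "lee@example.com"]
print(case_only_mismatches(crm, marketing_cloud))
```

Any pair this returns is a merge that an exact match rule would miss today and would pick up once both sources pass through the same LOWER and trim transform.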

📍 Common Identity Resolution Mistakes

What goes wrong in real implementations — and how to avoid it

❌

Mistake 1: Running Identity Resolution before Data Transforms are in place

The most damaging mistake. Running IR on unnormalized data produces incorrect matches that are difficult to reverse. Email case mismatches cause missed merges. Shared emails cause false merges. Once Unified Profiles are created from bad data, cleaning them requires resetting Identity Resolution and rerunning — which is time-consuming and may require downstream segment and insight rebuilding. Always complete and validate Data Transforms before touching Identity Resolution configuration.

❌

Mistake 2: Using probabilistic rules before deterministic rules

Configuring probabilistic name-plus-city matching as the first rule — before email and phone deterministic rules. This causes probabilistic matching to attempt to merge ALL records including those that have perfectly matchable emails. Records that should be confidently merged by exact email match instead get evaluated probabilistically — with risk of wrong merges and missed correct merges. Always put deterministic rules first in priority order, probabilistic rules last.

❌

Mistake 3: Not configuring Reconciliation Rules for important fields

Running Identity Resolution without setting Reconciliation Rules for key profile fields. When sources conflict and no reconciliation rule is configured, Data Cloud applies a default rule that may not match business requirements. CRM name may lose to an app-entered nickname because no Source Priority rule told Data Cloud that CRM is the master. Always configure explicit Reconciliation Rules for every field that appears in multiple sources.

❌

Mistake 4: Not testing on a sample before full run

Running Identity Resolution on the full dataset on the first attempt without any sample validation. Incorrect rules discovered after a full run on 10 million records require resetting all Unified Individuals and rerunning — a process that takes hours and disrupts active segments and activations. Always run on a 5-10% sample first, review the output manually, validate merge decisions are correct, then run at full scale.

❌

Mistake 5: Ignoring the Individual ID consistency requirement across sources

Identity Resolution links records via the Individual ID field in each DMO. If CRM uses Salesforce Contact IDs as Individual IDs and Marketing Cloud uses Subscriber Keys as Individual IDs — these are completely different values. Identity Resolution cannot link them without a match rule that bridges the gap. Either use a common customer identifier across all sources, or configure match rules that explicitly bridge different identifier systems using overlapping contact point data.

🧠 Quick Knowledge Check

Test your understanding of Module 06 — answers are in the content above!

Question 01

Identity Resolution match rate is very low — only 20% of records are being merged even though many customers exist in both CRM and Marketing Cloud. What is the most likely cause?

A. Probabilistic threshold is set too high

B. Data Transforms are not normalizing email case — John@Gmail.com vs john@gmail.com not matching

C. Reconciliation Rules are not configured

D. Identity Resolution Ruleset has no rules configured

Question 02

Two customers named John Smith both live in New York and are being incorrectly merged into one Unified Profile. Which type of matching is causing this?

A. Deterministic email matching

B. Deterministic phone matching

C. Probabilistic name and city matching with too low a threshold

D. Source Priority reconciliation conflict

Question 03

CRM stores the customer name as John but the app registration used Johnny. The Unified Profile shows Johnny instead of John. Which Reconciliation Rule would fix this to always use the CRM value?

A. Most Recent — because CRM is updated more recently

B. Most Frequent — because John appears more often

C. Source Priority — with CRM ranked as the highest priority source

D. No rule needed — Data Cloud always uses CRM by default

Question 04

In a Ruleset with three Match Rules — Email (Rule 1), Phone (Rule 2), Name+City (Rule 3) — a customer is matched by the Email rule. Which other rules are applied to that customer?

A. All three rules are applied to every customer

B. Rule 2 and Rule 3 are still applied as additional verification

C. No other rules — once matched by Rule 1, Rules 2 and 3 do not apply to that record

D. Rule 3 is always applied regardless of earlier matches

Question 05

A financial services company wants the highest possible accuracy in Identity Resolution and cannot risk false merges. What is the best approach?

A. Use only probabilistic matching with a very high threshold of 95

B. Use only deterministic matching on high-confidence unique identifiers like National ID — accept lower match rate for higher accuracy

C. Use both deterministic and probabilistic with probabilistic first in priority

D. Run Identity Resolution without any rules and let Data Cloud decide automatically

✅ Answers

Q1: B — Email normalization missing | Q2: C — Probabilistic threshold too low | Q3: C — Source Priority with CRM first | Q4: C — Cascade — no other rules apply | Q5: B — Deterministic only on high-confidence fields

🎤 Interview Questions for This Module

Identity Resolution questions that come up in every Data Cloud interview

What is Identity Resolution in Salesforce Data Cloud and why is it the most important feature?

Identity Resolution is the process of recognizing that records from different source systems belong to the same real-world customer and merging them into a single Unified Customer Profile. It is the most important feature because without it every other Data Cloud capability is compromised. A customer who exists in CRM, Marketing Cloud, the website and the support system is four separate people inside Data Cloud without Identity Resolution. Their purchase history is split across profiles. Their LTV calculation is wrong. Segments target them multiple times. Agentforce sees incomplete context. The entire promise of Data Cloud — a complete 360 customer view — is only possible after Identity Resolution correctly merges fragmented records into one complete Unified Individual.

One-Liner: "Identity Resolution merges records from all sources that belong to the same customer into one Unified Profile. Without it every downstream feature — LTV, segments, Agentforce context — is based on fragments rather than the complete customer picture."

What is the difference between deterministic and probabilistic Identity Resolution? When would you use each?

Deterministic matching is exact comparison — if two records share exactly the same value in a specified field like email or phone, they are the same person. It produces zero false positives and should always be configured first. Probabilistic matching computes a score across multiple fields — name, city, postal code, device — and merges records exceeding a threshold. It catches customers who have no shared unique identifiers but are clearly the same person from other signals. I always use deterministic rules first in priority order because they are the most accurate. Probabilistic rules are added as secondary fallback layers that only apply to records the deterministic rules could not match. The threshold for probabilistic matching depends on risk tolerance — financial services companies use very high thresholds to prevent false merges that could have compliance implications.

One-Liner: "Deterministic is exact and zero false positive — same email means same person. Probabilistic is statistical — similar signals exceed a threshold. Always configure deterministic first, probabilistic as fallback for unmatched records."

What are Reconciliation Rules and why do you need them?

Reconciliation Rules define which field value goes on the Unified Customer Profile when multiple source records that have been merged disagree on a field value. After Identity Resolution merges a CRM contact and a Marketing Cloud subscriber into one Unified Individual, both source records might have different values for the customer's first name — John in CRM and Johnny in the app. Without a Reconciliation Rule Data Cloud applies a default logic that may not match your business requirements. With a Source Priority reconciliation rule that ranks CRM first, the Unified Profile shows John from CRM. The three reconciliation options are Most Recent which uses the value from the most recently updated source, Most Frequent which uses the value appearing in the most source records, and Source Priority which always uses the value from the highest-ranked source system. Most Recent is best for contact information that changes over time. Source Priority is best for fields where one system is the defined master.

One-Liner: "Reconciliation Rules decide which value wins when merged sources disagree — Most Recent for contact info that changes, Source Priority for fields with a master system, Most Frequent for stable attributes like name. Without them Data Cloud uses defaults that may not match your business rules."
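
To make the three options concrete, here is a minimal Python sketch of each one applied to the same set of conflicting values. The `SourceValue` type, the source names and the dates are assumptions for illustration; Data Cloud's own tie-breaking behavior is not modeled.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class SourceValue:
    """One source system's value for a field (hypothetical structure)."""
    source: str
    value: str
    last_updated: date


def most_recent(values: List[SourceValue]) -> str:
    """Pick the value from the most recently updated source record."""
    return max(values, key=lambda v: v.last_updated).value


def most_frequent(values: List[SourceValue]) -> str:
    """Pick the value that appears in the most source records."""
    return Counter(v.value for v in values).most_common(1)[0][0]


def source_priority(values: List[SourceValue], ranking: List[str]) -> str:
    """Pick the value from the highest-ranked source; unranked sources go last."""
    rank = {source: position for position, source in enumerate(ranking)}
    return min(values, key=lambda v: rank.get(v.source, len(rank))).value


first_name = [
    SourceValue("crm", "John", date(2025, 1, 10)),
    SourceValue("marketing_cloud", "Johnny", date(2025, 6, 1)),
    SourceValue("ecommerce", "John", date(2024, 11, 20)),
]
print(most_recent(first_name))
print(most_frequent(first_name))
print(source_priority(first_name, ["crm", "marketing_cloud", "ecommerce"]))
```

The same input yields different winners depending on the rule, which is why the choice should be made per field rather than once for the whole profile.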

Customers who exist in both CRM and Marketing Cloud are not being merged by Identity Resolution even though they have the same email address. Walk me through how you diagnose this.

This is almost always a data normalization problem. My diagnosis sequence would be as follows. First, I query the Contact Point Email DMO directly and look at the actual email values for records from both sources. If CRM shows John@Gmail.com with capital letters and Marketing Cloud shows john@gmail.com in lowercase, that is the root cause: exact match is case-sensitive, so these two values are not equal. Second, I check the Data Transform configuration for both Data Streams and confirm that LOWER is applied to email before it reaches the DMO. If no transform exists or LOWER is missing, I add it and rerun the transform. Third, I verify that the Contact Point Email DMO has Individual ID mapped for both sources. Without Individual ID, email records in the DMO are orphaned and Identity Resolution cannot evaluate them. Fourth, I rerun Identity Resolution after fixing the transform and check whether the match rate improves. If the issue was only normalization, the match rate should increase significantly on the next run.

One-Liner: "Missed email match — first check: are email values actually identical in the DMO? Case mismatch is the most common cause. Check Transform has LOWER, verify Individual ID is mapped on Contact Point Email, then rerun IR."
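
The first diagnostic step can be rehearsed offline. The sketch below assumes the email values from both sources have already been exported from the Contact Point Email DMO into plain Python lists; the function names are hypothetical, and stripping whitespace is an extra assumption beyond the LOWER transform discussed above.

```python
from typing import Dict, List, Tuple


def normalize_email(raw: str) -> str:
    """Mimic a LOWER transform; the strip() is an added assumption."""
    return raw.strip().lower()


def find_normalization_misses(
    crm_emails: List[str], mc_emails: List[str]
) -> List[Tuple[str, str]]:
    """Return (crm_value, mc_value) pairs that differ as stored
    but become equal once both sides are normalized."""
    mc_by_normalized: Dict[str, List[str]] = {}
    for email in mc_emails:
        mc_by_normalized.setdefault(normalize_email(email), []).append(email)

    misses = []
    for email in crm_emails:
        for candidate in mc_by_normalized.get(normalize_email(email), []):
            if candidate != email:
                misses.append((email, candidate))
    return misses


crm = ["John@Gmail.com", "amy@example.com"]
marketing_cloud = ["john@gmail.com", "amy@example.com"]
for crm_value, mc_value in find_normalization_misses(crm, marketing_cloud):
    print(f"stored differently: {crm_value!r} vs {mc_value!r}")
```

Any pair this reports is a match that a case-sensitive exact rule would miss, so a non-empty result points at the transform rather than at the match rule itself.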

How would you design an Identity Resolution strategy for a company with 20 million customers across CRM, Marketing Cloud, e-commerce and a mobile app?

I would design a layered Identity Resolution strategy with three rules in priority order. Before any IR configuration, I would ensure Data Transforms normalize all match fields across all four source DLOs: LOWER for email, REGEXP_REPLACE for phone and TRIM for names. Rule 1 would be deterministic email matching using the Contact Point Email DMO. Email is the highest-confidence unique identifier, and I would expect it to match 60 to 70 percent of the 20 million profiles, since most customers use the same email across channels. I would add a transform to exclude known shared emails, such as family or corporate distribution addresses, before this rule runs. Rule 2 would be deterministic phone matching using the Contact Point Phone DMO after normalization. This catches customers who registered with different email addresses in different systems but the same phone, perhaps a work email in CRM and a personal email in the app. Rule 3 would be probabilistic matching on first name, last name and postal code with a threshold of 75 points. This catches app users and website visitors who have never provided an email but whose name and location match a known CRM contact. I would test this strategy on a 10 percent sample first, review the output for false positives (particularly from Rule 3), adjust the probabilistic threshold if needed, and then run at full scale. Finally, I would schedule daily incremental runs after the DMO refreshes to keep Unified Profiles current.

One-Liner: "20M customer IR strategy: normalize all fields in transforms first, then Rule 1 email deterministic, Rule 2 phone deterministic, Rule 3 name plus postal code probabilistic at threshold 75. Test on 10% sample before full run — tune probabilistic threshold based on false positive review."
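
The first two layers of this strategy can be prototyped on a sample before anything is configured. The sketch below is a simplified stand-in, not Data Cloud's engine: it normalizes email and phone, applies the two deterministic rules in order, and groups records with a small union-find. The record layout, the `resolve` function and the digits-only phone normalization are assumptions; country-code differences and the probabilistic Rule 3 are deliberately left out.

```python
import re
from typing import Callable, Dict, Iterable, List, Optional, Tuple


def normalize_email(value: Optional[str]) -> Optional[str]:
    return value.strip().lower() if value else None


def normalize_phone(value: Optional[str]) -> Optional[str]:
    # Assumption: keep digits only; country-code handling is out of scope.
    digits = re.sub(r"\D", "", value or "")
    return digits or None


class Clusters:
    """Minimal union-find over record ids."""

    def __init__(self, ids: Iterable[int]) -> None:
        self.parent = {i: i for i in ids}

    def find(self, i: int) -> int:
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i

    def union(self, a: int, b: int) -> None:
        self.parent[self.find(b)] = self.find(a)


def apply_exact_rule(records: List[dict], clusters: Clusters,
                     key: Callable[[dict], Optional[str]]) -> int:
    """Merge records sharing a non-empty key; return how many merges happened."""
    first_seen: Dict[str, int] = {}
    merges = 0
    for rec in records:
        value = key(rec)
        if value is None:
            continue
        if value not in first_seen:
            first_seen[value] = rec["id"]
        elif clusters.find(first_seen[value]) != clusters.find(rec["id"]):
            clusters.union(first_seen[value], rec["id"])
            merges += 1
    return merges


def resolve(records: List[dict]) -> Tuple[List[List[int]], Dict[str, int]]:
    """Run Rule 1 (email) then Rule 2 (phone); return clusters and merge counts."""
    clusters = Clusters(r["id"] for r in records)
    report = {
        "email": apply_exact_rule(records, clusters,
                                  lambda r: normalize_email(r.get("email"))),
        "phone": apply_exact_rule(records, clusters,
                                  lambda r: normalize_phone(r.get("phone"))),
    }
    groups: Dict[int, List[int]] = {}
    for r in records:
        groups.setdefault(clusters.find(r["id"]), []).append(r["id"])
    return sorted(groups.values()), report
```

Running `resolve` on a small hand-labeled sample gives a per-rule merge count, which is a rough way to see how much each layer contributes before reviewing false positives.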
