The Complete Guide to Data Reconciliation
Why business data across systems never matches, and a practical framework for detecting, diagnosing, and resolving discrepancies.
What is data reconciliation?
Data reconciliation is the process of comparing data across two or more systems to identify and resolve discrepancies. In business contexts, this typically means verifying that customer counts, revenue figures, usage metrics, and other key data points match across CRM, billing, product, and analytics systems.
Reconciliation isn't just about finding errors. It's about building trust in your data by understanding why numbers differ across systems and establishing processes to keep them aligned.
Every growing company eventually faces reconciliation challenges. The question isn't whether your data will drift—it's whether you'll catch it before it causes problems.
Why data discrepancies are inevitable
Data discrepancies between systems aren't bugs to be eliminated—they're natural consequences of how modern businesses operate. Understanding the root causes helps you design better reconciliation processes.
1. Different definitions
The same concept often has different meanings across systems. "Customer" in your CRM might mean a company with a signed contract. In billing, it's anyone with an active payment method. In product analytics, it's any workspace with logged activity.
These aren't wrong—they're different perspectives on the same underlying reality. But they produce different counts.
2. Different timing
Systems update at different rates. Your product database might update in real-time. Your data warehouse might refresh hourly. Your CRM might batch-process changes overnight. Comparing them at the same timestamp means comparing different moments in time.
3. Different boundaries
Systems have different scopes. Billing includes all paying customers. Product might exclude beta users. CRM might include prospects who haven't converted yet. The populations being counted aren't identical.
4. Transformation errors
When data flows between systems—through ETL pipelines, sync tools, or manual exports—transformation logic can introduce errors. Currency conversions, timezone handling, field mapping, and type coercion are common culprits.
5. Human factors
Manual data entry creates inconsistencies. Sales updates the CRM when they remember. Support tags tickets differently. Finance categorizes revenue based on judgment calls. These human decisions accumulate into systemic drift.
Types of reconciliation
Different reconciliation scenarios require different approaches.
Count reconciliation
Verifying that entity counts match across systems. "We have 1,247 customers" should be consistent whether you check the CRM, billing system, or product database. Count discrepancies often indicate definitional differences or data quality issues.
Value reconciliation
Verifying that numerical values match. Revenue in billing should match revenue in the database. MRR calculated from subscriptions should match MRR reported to the board. Value discrepancies often indicate calculation differences or missing data.
Identity reconciliation
Verifying that the same entity is correctly linked across systems. Customer #1234 in billing should map to Company ABC in the CRM should map to Workspace xyz in the product. Identity errors cause attribution problems and duplicate counting.
Temporal reconciliation
Verifying that point-in-time snapshots are consistent. The customer count on January 1st should be the same whether you calculate it today or calculated it on January 2nd. Temporal discrepancies indicate retroactive data changes or snapshot inconsistencies.
A practical reconciliation framework
Effective reconciliation is a continuous process, not a one-time project. This framework provides a structured approach.
Step 1: Define canonical sources
For each critical data point, designate one system as the source of truth. Revenue: Stripe. Customer ownership: CRM. Usage: Product database. This doesn't mean other systems are wrong—it means you know which one wins when they disagree.
Document these decisions explicitly. "Revenue is defined as the sum of successful charges in Stripe, excluding refunds, converted to USD at daily rates." Ambiguity is the enemy of reconciliation.
Step 2: Establish tolerances
Perfect consistency is often impossible and always expensive to maintain. Define acceptable variance levels for each reconciliation point.
- Customer counts: Exact match or within 1%
- Revenue figures: Within 0.5% or $500, whichever is greater
- Usage metrics: Within 5% due to logging delays
Tolerances should reflect business impact. Revenue discrepancies matter more than usage discrepancies. Board-reported metrics need tighter tolerances than internal dashboards.
Step 3: Build reconciliation checks
Create automated checks that compare data across systems on a schedule. Each check should:
- Query both systems using consistent logic
- Compare results against defined tolerances
- Alert when tolerances are exceeded
- Produce evidence for investigation
Checks should run frequently enough to catch drift before it accumulates. Daily for revenue. Weekly for customer counts. Real-time for critical transactions.
Step 4: Investigate discrepancies
When checks fail, follow a structured investigation process:
- Quantify the gap: How big is the discrepancy? Is it growing or stable?
- Identify the contributors: Which specific records don't match?
- Trace the root cause: Is it definitional? Timing? Data quality?
- Classify the outcome: Is this an error to fix, or an expected variance to document?
Step 5: Maintain exception rules
Some discrepancies are expected and don't need alerts every time. A specific customer might be counted differently due to a known edge case. A particular revenue stream might have approved variance.
Document these exceptions explicitly with rationale and review dates. "Customer X excluded from count reconciliation because they're on a legacy contract with different terms. Review quarterly."
Step 6: Report and review
Reconciliation health should be visible to stakeholders. Regular reporting should include:
- Check pass/fail rates over time
- Trend in discrepancy magnitude
- Open exceptions and their review status
- Recent root causes and remediations
Common reconciliation scenarios
Revenue reconciliation: Billing vs. Database vs. CRM
Revenue figures often differ across systems because each serves a different purpose.
- Billing (Stripe): What was actually charged, including prorations, refunds, and currency conversions
- Database: Contracted revenue based on subscription records, regardless of invoicing
- CRM: Deal values that might include pipeline, one-time fees, or multi-year contracts counted upfront
Reconciliation approach: Define MRR calculation explicitly for each system. Compare monthly, with tolerance for timing differences. Investigate discrepancies larger than 1% immediately.
Customer count reconciliation
Customer counts differ because "customer" means different things.
- Billing: Anyone with an active subscription or payment method
- CRM: Companies with closed-won deals, regardless of payment status
- Product: Workspaces with any activity in the last 90 days
Reconciliation approach: Choose one definition as canonical. Map customers across systems using email or ID linking. Reconcile counts weekly, investigating any drift > 2%.
Churn reconciliation
Churn rates differ based on how churn is defined and measured.
- Logo churn: Customers who cancelled entirely
- Revenue churn: MRR lost to cancellations and downgrades
- Net revenue retention: Churn offset by expansion
Reconciliation approach: Track all three metrics. Reconcile cancelled customers in billing against churned accounts in CRM. Verify that revenue churn matches subscription changes.
Building a reconciliation culture
Tools and processes only work if the organization values data consistency.
Ownership
Someone needs to own reconciliation. This is typically RevOps, Finance, or a data team. They don't fix every discrepancy—they ensure discrepancies get noticed and routed to the right people.
Escalation paths
Define what happens when checks fail. Who gets alerted? What's the SLA for investigation? When does a discrepancy become a P1 incident? Clear escalation prevents reconciliation failures from being ignored.
Root cause tracking
Every discrepancy should be classified and tracked. Over time, patterns emerge. If 80% of revenue discrepancies trace to currency conversion, that's where to invest in improvement. Without tracking, you're fixing symptoms instead of causes.
Prevention investment
Reconciliation catches problems after they occur. Prevention stops them from happening. This means better integration design, clearer data definitions, and automated validation at data entry points.
Automation approaches
Manual reconciliation
Export data from each system, compare in spreadsheets, investigate manually. This works for occasional checks but doesn't scale. Most organizations start here and outgrow it within a year.
Custom scripts
Write Python or SQL scripts that query systems, compare results, and alert on discrepancies. More scalable than spreadsheets, but requires engineering time to build and maintain. Scripts tend to accumulate tech debt.
dbt tests
If you have a data warehouse, dbt's testing framework can validate data quality and consistency. This works well for warehouse-to-warehouse reconciliation but doesn't cover live SaaS systems directly.
Specialized tools
Platforms designed for data reconciliation automate the full workflow: connect to systems, define checks, run comparisons, generate alerts, and track exceptions. This is the direction most data-mature organizations move toward.
Key metrics for reconciliation health
- Check pass rate: Percentage of reconciliation checks passing. Target: >95%
- Mean time to detection: How long discrepancies exist before being caught. Target: <24 hours for critical data
- Mean time to resolution: How long to investigate and resolve. Target: <1 week for non-critical issues
- Exception count: Number of documented exceptions. Should be stable or decreasing
- Root cause distribution: Which causes drive the most discrepancies? Focus improvement there
Summary
Data reconciliation is essential for any organization that relies on data from multiple systems. The key principles:
- Accept that discrepancies are normal. Design processes to detect and manage them, not eliminate them entirely.
- Define sources of truth explicitly. When systems disagree, you need a documented tiebreaker.
- Automate detection. Manual reconciliation doesn't scale. Build checks that run continuously.
- Set tolerances based on business impact. Not all discrepancies matter equally.
- Track root causes. Patterns reveal where to invest in prevention.
- Make it visible. Reconciliation health should be a standing metric that leadership sees.
Companies that master reconciliation don't eliminate data problems—they catch them fast and fix them faster.
Related resources
Automate your reconciliation
Eru runs continuous checks across all your systems. Discrepancies surface in Slack before they become board-level problems.