Dirty data | CRM hygiene | April 26, 2026 | Clint Research Team

9 Dirty Data Problems Hiding in Every Contractor CRM

44% of businesses lose more than 10% of annual revenue to inaccurate CRM data. These are the 9 dirty data problems hiding in your Jobber, Housecall Pro, or ServiceTitan database right now.

9 min read

Key takeaways

  • 44% of businesses lose more than 10% of annual revenue to inaccurate CRM data per multiple 2025 industry surveys
  • Companies lose 16 sales deals per quarter to poor-quality data per Validity's 2025 State of CRM Data Management report
  • B2B contact data decays 22.5-70.3% annually depending on the source, with 25-30% being the most cited industry range
Contents
  1. Duplicate Customers
  2. Missing Phone Numbers
  3. Bounced or Invalid Emails
  4. Inconsistent Address Formats
  5. Stale "Last Activity" Dates
  6. Unreconciled Names and Spouses
  7. Tag Sprawl
  8. Phantom Open Jobs
  9. Lead Source That Says "Other" or "Unknown"
  10. What Clean Data Makes Possible
  11. Sources
  12. Frequently Asked Questions

44% of businesses lose more than 10% of annual revenue to inaccurate CRM data, according to research summarized in Plauti's analysis of the hidden costs of poor CRM data.

For a $4M home service contractor, 10% is a $400K annual leak hiding in fields nobody looks at. Most of it traces back to nine specific problems that show up in every contractor CRM eventually.

Here is what to look for.

1. Duplicate Customers

Duplicates are the highest-impact dirty data problem in any contractor CRM, and they are everywhere. The Validity 2025 report found 76% of organizations say less than half of their CRM data is accurate, with duplicates being the most cited cause.

Duplicates happen when a customer calls in, a CSR cannot find them by phone, and creates a new record. They happen when QuickBooks syncs a billing record that does not match the Jobber name. They happen when a tech adds a customer in the field while the office adds the same customer from the phone.

The damage is not the wasted record. It is the split job history, which makes lifetime value reports lie and reactivation campaigns target half a customer.

Housecall Pro has a native duplicate manager that auto-groups by name, phone, and email. ServiceTitan has a merge flow under the gear menu. Jobber does not let you merge clients, which is why one Jobber operator on the community forum reported spending 30 minutes per pair to copy data manually before deletion.
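If your CRM has no native duplicate manager, the same check can run on an exported customer list. The sketch below is a minimal illustration, not any CRM's actual matching logic: the field names (id, phone, address) and the helper names are assumptions, and it only catches exact matches after light normalization.

```python
import re
from collections import defaultdict

def normalize_phone(raw):
    """Keep digits only, drop a leading US country code, require 10 digits."""
    digits = re.sub(r"\D", "", raw or "")
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits if len(digits) == 10 else ""

def normalize_address(raw):
    """Lowercase, strip punctuation, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", (raw or "").lower())
    return " ".join(cleaned.split())

def duplicate_groups(customers):
    """Group customer ids that share a normalized phone or a normalized address."""
    by_key = defaultdict(set)
    for c in customers:
        phone = normalize_phone(c.get("phone"))
        address = normalize_address(c.get("address"))
        if phone:
            by_key[("phone", phone)].add(c["id"])
        if address:
            by_key[("address", address)].add(c["id"])
    # Only keys shared by two or more records are duplicate candidates.
    return {key: sorted(ids) for key, ids in by_key.items() if len(ids) > 1}

# Hypothetical sample rows from a CSV export.
customers = [
    {"id": 1, "phone": "(555) 201-3344", "address": "12 Oak Ln."},
    {"id": 2, "phone": "555-201-3344", "address": "98 Pine Rd"},
    {"id": 3, "phone": "", "address": "12 oak ln"},
]
groups = duplicate_groups(customers)
```

Here records 1 and 2 group on phone, and records 1 and 3 group on address. Treat every group as a candidate for human review rather than an automatic merge, since a shared address or phone can be legitimate.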

Text Clint: "find duplicate customers where the phone number or address matches"

2. Missing Phone Numbers

A surprising number of customer records have no phone number at all. Usually it is from web-form leads that never got converted to a real customer record, or from CSRs who skipped the field on a rushed intake.

Sort your customer list by phone. Empty cells, numbers under 10 digits, and entries like 555-555-5555 are all dead. Customers with completed jobs and no phone are the worst category, because you have already taken their money and you cannot reach them again.

Bad numbers also include the office mainline, the CSR's cell, and the boss's personal cell, all entered as placeholders. A pattern to spot is the same number appearing on 6+ different customers.

Industry guidance recommends E.164 standardization: +1 followed by all 10 US digits, with special characters stripped and no placeholders. Get there once and every SMS campaign downstream stops failing on bad numbers.
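As a rough sketch of that cleanup, assuming US-only numbers: the placeholder list and both function names are hypothetical, and the code does not validate area codes, so a carrier lookup is still needed to confirm a number is live.

```python
import re
from collections import Counter

# Assumed placeholder values; add whatever your CSRs actually type.
PLACEHOLDERS = {"5555555555", "0000000000", "1234567890"}

def to_e164_us(raw):
    """Return +1XXXXXXXXXX for a 10-digit US number, or None if unusable."""
    digits = re.sub(r"\D", "", raw or "")
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10 or digits in PLACEHOLDERS:
        return None
    return "+1" + digits

def shared_numbers(raw_phones, threshold=6):
    """Numbers that appear on `threshold` or more records (likely office or staff lines)."""
    counts = Counter(p for p in map(to_e164_us, raw_phones) if p)
    return {number for number, n in counts.items() if n >= threshold}
```

For example, to_e164_us("(732) 555-0142") gives "+17325550142", while "555-555-5555" and a 7-digit entry both come back as None. Anything returned by shared_numbers is worth checking against your own office and staff lines before you text it.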

Text Clint: "list customers with completed jobs in the last 12 months that have no phone number on file"

3. Bounced or Invalid Emails

B2B contact data decays at 25-30% per year per Verum's analysis of multiple 2025 sources, and consumer data sits in a similar band. That means a year-old database can already be 25-30% wrong on email alone.

The dirty patterns to look for: typos at the domain (gmial.com, hotmial.com), customers who switched providers (the AOL accounts from 2008), and records where the email is actually a name in the email field.

Run the email column through NeverBounce, ZeroBounce, or any modern validator. Hard bounces become a dead-email tag and move out of the active marketing pool. Soft bounces stay but get flagged for verbal verification next time the customer calls.
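Before paying for a validation pass, a cheap pre-filter can catch the patterns listed above. This is a minimal sketch and not a replacement for a validator: the typo map and the classify_email helper are hypothetical, and the shape check is deliberately loose, so it says nothing about whether a mailbox exists.

```python
import re

# Hypothetical typo map; extend it from your own bounce logs.
DOMAIN_TYPOS = {"gmial.com": "gmail.com", "hotmial.com": "hotmail.com"}

# Loose shape check: something@something.something, no spaces.
EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def classify_email(raw):
    """Return (label, suggested_value) for one email field."""
    value = (raw or "").strip().lower()
    if not value:
        return ("missing", None)
    if not EMAIL_SHAPE.match(value):
        # Covers names or notes typed into the email field.
        return ("not_an_email", None)
    local, domain = value.rsplit("@", 1)
    if domain in DOMAIN_TYPOS:
        return ("likely_typo", f"{local}@{DOMAIN_TYPOS[domain]}")
    return ("ok_shape", value)
```

A record like "Bob.Smith@gmial.com" comes back as a likely typo with a suggested fix, and "Bob Smith" comes back as not an email at all. Suggested fixes should be confirmed with the customer rather than written back automatically.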

Left unchecked, email decay is one of the ways inaccurate CRM data turns into the 10%-plus revenue loss cited above: your reactivation list quietly stops reaching people.

Text Clint: "list customers without an email address but with a job in the last 24 months"

4. Inconsistent Address Formats

To your CRM, a house at 123 Main St is not the same record as 123 Main Street, 123 main st, or 123 Main St., Apt 2. The address column treats all four as separate entities, even though the first three are one house and the fourth is a tenant change.

A contractor on the Jobber community forum said the most common duplicate trigger he sees is customers typing "NJ" while autofill writes "New Jersey." That single inconsistency creates two records every time.

Run the address column through USPS standardization or SmartyStreets. Output is one canonical format per household. Apartments belong in the unit field, not appended to the street.
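To see why the variants collapse once they share one format, here is a toy normalizer. It is only an illustration under stated assumptions: the suffix table covers a handful of cases, the unit pattern is simplistic, and normalize_street is a hypothetical helper, not what USPS standardization or any commercial service actually does.

```python
import re

# Assumed abbreviation table; real postal standards cover far more suffixes.
SUFFIXES = {"street": "st", "avenue": "ave", "road": "rd", "drive": "dr", "lane": "ln"}

# A trailing unit designator followed by a number (2, 2b) or a single letter.
UNIT = re.compile(r"\b(?:apt|unit|suite|ste)\s+(\d+\w*|[a-z])$")

def normalize_street(raw):
    """Return (street, unit) in one lowercase form, with the unit split out."""
    text = re.sub(r"[.,#]", " ", (raw or "").lower())
    text = " ".join(text.split())
    unit = None
    match = UNIT.search(text)
    if match:
        unit = match.group(1)
        text = text[:match.start()].strip()
    words = [SUFFIXES.get(word, word) for word in text.split()]
    return " ".join(words), unit
```

With this, "123 Main St", "123 Main Street", and "123 main st" all become ("123 main st", None), while "123 Main St., Apt 2" becomes ("123 main st", "2"), which keeps the apartment in its own field.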

A property manager with 30 addresses tied to one billing contact is not a duplicate. Leave those.

Text Clint: "show me customers with multiple addresses where the billing email matches"

5. Stale "Last Activity" Dates

Most contractor CRMs treat any new note as activity, which means a CSR's status update can keep a customer whose last job was in 2022 showing as active in 2026.

Real activity is a completed job, a paid invoice, an answered call, or a confirmed reply. Notes are not activity. Tags are not activity. Email opens are not activity if the customer never replied.

Pull the customer list with last completed job date as the only signal. Anything older than 18 months is dormant, no matter what the system says. Reactivation closes 12-14x higher than cold acquisition, so dormant customers are valuable, but only if you label them honestly.
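On an exported list, that filter might look like the sketch below. The field names (last_completed_job, crm_last_activity) and the 90-day definition of "recent" are assumptions for illustration, and 18 months is approximated as 548 days.

```python
from datetime import date

DORMANT_AFTER_DAYS = 548  # about 18 months
RECENT_DAYS = 90          # assumed meaning of "recent" CRM activity

def mislabeled_active(customers, today):
    """Ids the CRM shows as recently active but whose last completed job is dormant."""
    flagged = []
    for c in customers:
        last_job = c["last_completed_job"]  # a date, or None if no job was ever completed
        job_is_old = last_job is None or (today - last_job).days > DORMANT_AFTER_DAYS
        activity_is_recent = (today - c["crm_last_activity"]).days <= RECENT_DAYS
        if job_is_old and activity_is_recent:
            flagged.append(c["id"])
    return flagged

# Hypothetical sample rows.
customers = [
    {"id": 1, "last_completed_job": date(2023, 3, 10), "crm_last_activity": date(2026, 4, 1)},
    {"id": 2, "last_completed_job": date(2026, 1, 15), "crm_last_activity": date(2026, 4, 20)},
    {"id": 3, "last_completed_job": date(2022, 6, 1), "crm_last_activity": date(2022, 6, 5)},
]
flagged = mislabeled_active(customers, today=date(2026, 4, 26))
```

Only customer 1 is flagged: the CRM saw a note this month, but the last completed job is more than three years old.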

Text Clint: "show me every customer where the CRM 'last activity' is recent but the last completed job is over 18 months ago"

6. Unreconciled Names and Spouses

A residential CRM gets a husband's name on one record, a wife's name on another, and the same address tied to both. Or an adult child schedules the service while mom and dad are on the title.

Name format chaos is its own subspecies: "Bob Smith", "Robert Smith", "Bob & Linda Smith", "Smith Residence", and "Bob Smith Sr" all describing the same household.

The fix is to pick a primary contact per address and store the secondary contacts in a notes field or a related-contact table if the CRM has one. The household, not the name, is the canonical entity. Insycle's CRM data quality checklist calls this out as one of the 15 most common issues across all CRMs.

7. Tag Sprawl

Most contractor CRMs end up with 80+ tags because every CSR adds a new one instead of finding the existing match. You will see "VIP", "VIP Customer", "vip", "VIP - High Value", and "VIP - Top 10%" all on different records, all meaning the same thing.

The downstream pain is filtering. When you try to pull "all VIPs" for a campaign, you miss two-thirds of them because they are tagged inconsistently.

Pull the tag list. Group obvious duplicates and pick one canonical version. Bulk update the rest. Kill any tag used on fewer than 5 customers, since it was a one-time idea that never became policy.
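A first pass over an exported tag list can be sketched as below. The grouping rule (same first word, ignoring case and hyphens) is a crude assumption that only surfaces candidates, and tag_report is a hypothetical helper, so a person still decides which tags really mean the same thing.

```python
from collections import Counter, defaultdict

def tag_report(customer_tags, min_uses=5):
    """customer_tags: list of (customer_id, tag) pairs from an export."""
    counts = Counter(tag for _, tag in customer_tags)

    # Group tags by their lowercase first word to surface likely variants.
    families = defaultdict(set)
    for tag in counts:
        words = tag.lower().replace("-", " ").split()
        if words:
            families[words[0]].add(tag)

    merge_candidates = {k: sorted(v) for k, v in families.items() if len(v) > 1}
    rare = sorted(tag for tag, n in counts.items() if n < min_uses)
    return counts, merge_candidates, rare
```

The three return values map to the three steps above: usage counts per tag, families of tags to review for merging, and tags used on fewer than min_uses customers. Tags like "Warranty - HVAC" and "Warranty - Plumbing" would also land in one family, which is why the output is a review list and not a merge list.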

Housecall Pro's tag overview recommends a published tag taxonomy that the front office references before adding new tags.

Text Clint: "list every tag in my CRM with a count of how many customers use it"

8. Phantom Open Jobs

Jobs marked "pending", "in progress", or "scheduled" that should have been closed months ago are everywhere. They distort dispatch dashboards and mess up revenue forecasting.

The pattern: a tech finishes a job in the field, the customer pays in cash or check, the tech forgets to mark the job complete, and the office never circles back. Now you have a phantom open job sitting on the books.

Job-profitability reports in Housecall Pro often miss these jobs, because the labor and parts get assigned but the close-out never happens.

Pull every job in an active status with a last-activity date older than 30 days. Most of them are done jobs that need to be closed. The rest are real exceptions worth chasing down.
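As a minimal sketch of that pull on an exported job list, assuming hypothetical field names (id, status, last_activity) and status labels that match the three named above:

```python
from datetime import date, timedelta

ACTIVE_STATUSES = {"pending", "in progress", "scheduled"}

def phantom_candidates(jobs, today, max_age_days=30):
    """Ids of jobs still in an active status with no activity in `max_age_days`."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        job["id"]
        for job in jobs
        if job["status"].lower() in ACTIVE_STATUSES and job["last_activity"] < cutoff
    ]

# Hypothetical sample rows.
jobs = [
    {"id": "J-1", "status": "In Progress", "last_activity": date(2026, 1, 5)},
    {"id": "J-2", "status": "scheduled", "last_activity": date(2026, 4, 20)},
    {"id": "J-3", "status": "complete", "last_activity": date(2025, 11, 1)},
]
stale = phantom_candidates(jobs, today=date(2026, 4, 26))
```

Only J-1 comes back. J-2 is active but recent, and J-3 is old but already closed.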

Text Clint: "show me every job tagged 'pending' or 'in progress' that's older than 30 days"

9. Lead Source That Says "Other" or "Unknown"

The lead source field is the foundation of every marketing decision a contractor makes. If 40% of records say "Other" or are blank, you are flying blind on which channels actually work.

The fix is upstream. Train the CSR script to ask "How did you find us?" on every call, and tighten the picklist so there are eight clean options instead of an open text field. Record Google Local Services Ads, Google Ads, Facebook, Yelp, Angi, referrals, repeat customers, and door hangers as separate values.

The KPIs every owner should track all start with reliable lead source data. Without it, cost-per-lead by channel is fiction.

For historical records with "Other" already in place, you cannot retroactively fix every one. But you can flag them and accept a smaller analyzable cohort going forward.
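To size the gap before fixing it, a report like the following can run on exported jobs. The field names (lead_source, revenue) and the set of values treated as unattributed are assumptions; adjust both to match your own export.

```python
from collections import defaultdict

# Values treated as "no usable lead source" (assumed list).
UNATTRIBUTED = {"", "other", "unknown"}

def lead_source_report(jobs):
    """Return (sources ranked by job count, percent of jobs with no usable source)."""
    totals = defaultdict(lambda: {"jobs": 0, "revenue": 0.0})
    missing = 0
    for job in jobs:
        source = (job.get("lead_source") or "").strip()
        if source.lower() in UNATTRIBUTED:
            missing += 1
            continue
        totals[source]["jobs"] += 1
        totals[source]["revenue"] += job["revenue"]
    ranked = sorted(totals.items(), key=lambda kv: kv[1]["jobs"], reverse=True)
    pct_missing = 100 * missing / len(jobs) if jobs else 0.0
    return ranked, pct_missing
```

The percent-missing figure is the one to watch over time. If it does not fall after the picklist and script changes, the intake process is still letting records through without a source.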

Text Clint: "rank lead sources by jobs in the last 12 months, with revenue per source and percent of jobs missing a lead source"

What Clean Data Makes Possible

The argument for cleaning these nine problems is not aesthetic. Validity's 2025 report found companies lose 16 sales deals per quarter due to poor data, and workers spend 13 hours per week hunting for basic information in their CRM.

Multiply 13 hours a week by 52 weeks, your CSR loaded hourly cost, and the number of seats. The answer is usually north of $40K a year in time, before you count the deals you lost because you texted the wrong number or emailed a customer who switched providers two years ago.
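The arithmetic is simple enough to run with your own numbers. In this sketch the $30 loaded hourly cost and the two seats are made-up inputs for illustration, not figures from the Validity report.

```python
def annual_search_cost(hours_per_week, loaded_hourly_cost, seats, weeks_per_year=52):
    """Yearly cost of time spent hunting for information in the CRM."""
    return hours_per_week * weeks_per_year * loaded_hourly_cost * seats

# Hypothetical shop: 2 CSR seats at a $30/hour loaded cost.
cost = annual_search_cost(13, 30.0, 2)  # 13 * 52 * 30 * 2 = 40560.0
```

Two seats at that rate already come to $40,560 a year, which is where the $40K figure becomes plausible for even a small front office.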

Tommy Mello scaled A1 Garage Door to $220M+ on the back of a clean ServiceTitan database that the entire marketing operation depends on. The discipline he describes is not glamorous and it is not optional past a certain revenue level.

Sources

Frequently Asked Questions

6 questions home service owners actually ask about this.

  • 01. Which dirty data problem should a contractor fix first?

    Duplicates. They distort every other report you run, and they are the foundation problem that makes lifetime value, retention, and reactivation lists unusable until they are resolved.

  • 02. How do I find duplicate customers in Jobber when there is no native merge?

    Export your customer list, sort by phone, then by address, then by name. Anything that matches on two of three is a candidate. Then you have to manually copy notes and tags into the keeper before deleting the dupe, which Jobber forum users describe as a 30-minute exercise per pair.

  • 03. What is a healthy duplicate rate for a contractor CRM?

    Insycle benchmarks recommend keeping the duplicate rate under 5%. Most uncleaned contractor CRMs sit between 8% and 20%, with worst cases over 30% after multi-year QuickBooks syncs.

  • 04. Should I delete customers with no contact info?

    No. Tag them as incomplete-record and route them to the next CSR shift to call out. Deleting them removes job history that may still be tied to revenue you have already booked.

  • 05. How fast does contractor CRM data decay?

    The most cited industry range is 25-30% annually for B2B, and consumer data sits in a similar band when you account for moves, number changes, and email switches. At that rate, a residential database left untouched for 18 months will be roughly 35-45% wrong by the time you go to use it.

  • 06. Do duplicates come back after I clean them?

    Yes, especially through QuickBooks sync, online booking forms, and tech-side customer creation. Set a weekly duplicate-scan habit so a fresh round never grows past 50 records before someone deals with it.

See Clint in action

Clint is the pre-built AI for home service shops. Connect your CRM, email, and phone system in minutes and the agents run on your real data.