
The Insurance Industry’s Data Blind Spot—And How to Fix It
Messy, inconsistent data is one of the biggest obstacles to data-powered underwriting, and entity resolution is the issue insurers can’t afford to overlook.

Kalepa
AI, data, and analytics can be extremely powerful for improving underwriting. However, the impact of all three hinges on one often-overlooked challenge: entity resolution. Without it, your models will lead you in the wrong direction, and the results could be cataclysmic. But if you nail it, it can be a massive unlock for improving underwriting decisions and outcomes.
So what is entity resolution?
At a high level, it answers the question “how do I tell if this person/place/thing described here is the same as the person/place/thing over there?”
Sometimes this is easy, like noticing that “123 Main St” is the same as “123 Main Street”.
Other times, a company has changed its name and address, a driver is listed by their middle name instead of their first name, or typos in one data source mean key information doesn’t line up. Sometimes this is a hard problem even for humans to figure out.
In 2025, most insurers are puzzling over their technology roadmap, and many leaders are chomping at the bit to take advantage of the promise of AI.
But they have to remember the classic adage: “garbage in, garbage out.” Failing to resolve entities correctly is the quintessential example of “garbage in.”
So when it comes to complex, scattered, semi-structured data - like so much of what underwriters are dealing with - getting entity resolution wrong is a recipe for total technology disaster.
Let’s talk about how to get it right.

What is the entity resolution problem?
When information comes from many different places, it rarely lines up precisely.
Entity resolution helps solve this, so insurers can automatically combine information from many data sources and formats into one place. Without it, you can’t be sure if you’re looking at the same business, location, person, or vehicle across:
- Submission emails
- Submission documents (ACORD forms, SOVs, loss runs, MVRs, applications, etc.)
- News articles
- Legal filings
- Permits
- Government datasets (FBI crime scores, OSHA violations, FEMA CAT info, etc.)
- Third-party data vendors
- And hundreds of other data sources
Without high-quality entity resolution, you could easily miss information because of DBAs, related corporate entities, name changes, and businesses located on the same premises. This means underwriters are reviewing wrong or incomplete information - and making the wrong decisions on clearing, pricing, policy language, exclusions, and quoting.
Poor entity resolution is extremely costly. If you’ve ever cleared the same submission to multiple brokers, you know what we mean.
On the flip side, high-quality entity resolution can be a massive unlock for accurate, efficient underwriting decisions.
This is the key reason Kalepa can perform automated submission clearing, or pull in thousands of third party data points on a business, or detect inconsistencies across data sources.
Entity resolution has helped us sort through many hidden exposures, including identifying a carwash that was also operating a daycare, a restaurateur who had previously burned down two restaurants, a chameleon carrier that popped up after too many FMCSA violations, and a contractor whose permits revealed they were doing asbestos remediation.

Why is entity resolution hard?
Let’s start with an example. These businesses are all the same:
- Crystl Elec. at 123 E Main St, NYC
- Crystal Electricians at 123 East Main Street, New York, NY
- Crystal Electricians DBA Value Electrical at 123 Main St, NY, NY
- Value Electrical of New York City
Most of this is obvious to humans, but even so, there is some ambiguity. Looking at only the first and last entries in the list, it’s not clear at all that “Crystl Elec.” is the same business as “Value Electrical”, especially without exact addresses.
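To make the example concrete, here is a minimal sketch of normalizing abbreviations and then scoring string similarity. It is an illustration only, not Kalepa's method: the abbreviation table and the `normalize` and `similarity` helpers are assumptions made up for this example.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table for illustration; a real one would be far larger.
ABBREVIATIONS = {"st": "street", "e": "east", "elec": "electric"}

def normalize(text: str) -> str:
    """Lowercase, strip punctuation, and expand known abbreviations."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens)

def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], computed on normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

# The two address spellings collapse to the same normalized form.
print(normalize("123 E Main St"))  # 123 east main street

# The misspelled name scores much closer to "Crystal Electricians"
# than to "Value Electrical".
print(similarity("Crystl Elec.", "Crystal Electricians"))
print(similarity("Crystl Elec.", "Value Electrical"))
```

String similarity handles the typo and the abbreviations, but it cannot connect “Crystl Elec.” to “Value Electrical” on its own. That link only appears through the record that lists the DBA.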
More broadly, some of the specific challenges with entity resolution include:
- Data Variations: Inconsistent data, like misspellings or typos
- Missing Data: Incomplete entries missing key context
- Multiple Identifiers: DBAs or abbreviations
- Ambiguity: Cases where even a human would not know
- Scalability: Comparing all records in a large dataset is computationally expensive and time-consuming
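The scalability point is easy to quantify: comparing every record against every other record takes n(n-1)/2 comparisons. The sketch below counts those pairs and shows one common way to cut them down, grouping records by a cheap key (often called blocking) so that only records in the same group are compared. The records, the ZIP codes, and the choice of ZIP as the key are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def all_pairs_count(n: int) -> int:
    """Number of unordered pairs among n records."""
    return n * (n - 1) // 2

def blocked_pairs(records, key):
    """Yield only the pairs of records that share the same blocking key."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[key(rec)].append(rec)
    for block in blocks.values():
        yield from combinations(block, 2)

# Hypothetical records; the ZIP codes are made up for illustration.
records = [
    {"name": "Crystal Electricians", "zip": "10001"},
    {"name": "Crystl Elec.", "zip": "10001"},
    {"name": "Acme Roofing", "zip": "94105"},
    {"name": "Acme Roofing LLC", "zip": "94105"},
]

print(all_pairs_count(len(records)))                                  # 6
print(len(list(blocked_pairs(records, key=lambda r: r["zip"]))))      # 2
print(all_pairs_count(1_000_000))                                     # 499999500000
```

The trade-off is that a key that is too strict (for example, a mistyped ZIP) hides true matches, so the key has to be chosen with the data's error patterns in mind.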

How does Kalepa approach entity resolution for insurance data?
We’ve thought about this a lot. And based on years of continual refinement, here’s our high-level playbook for insurance data entity resolution.
First, we start with our identifiers: the things we know about an entity. These include names, addresses, coordinates, email addresses, websites, phone numbers, and government/third-party identifiers (FEINs, USDOTs, DUNS, etc.). Automatically extracting these from submission documents and third-party data sources gives us a massive data lake to draw from.
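As a rough illustration of what a bundle of identifiers can look like in code, here is a minimal sketch. The `EntityRecord` schema, its field names, and the placeholder FEIN value are assumptions for this example and cover only a subset of the identifiers listed above; they do not describe Kalepa's actual data model.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class EntityRecord:
    """Identifiers extracted for one entity from one source (hypothetical schema)."""
    source: str
    names: list[str] = field(default_factory=list)
    addresses: list[str] = field(default_factory=list)
    phone: Optional[str] = None
    website: Optional[str] = None
    fein: Optional[str] = None
    usdot: Optional[str] = None
    duns: Optional[str] = None

# Government / third-party identifiers that are strong evidence when they agree.
STRONG_IDS = ("fein", "usdot", "duns")

def shared_strong_ids(a: EntityRecord, b: EntityRecord) -> list[str]:
    """Names of strong identifiers present on both records with equal values."""
    return [
        f for f in STRONG_IDS
        if getattr(a, f) is not None and getattr(a, f) == getattr(b, f)
    ]

# Placeholder FEIN, not a real identifier.
acord = EntityRecord(source="ACORD form", names=["Crystal Electricians"], fein="00-0000000")
vendor = EntityRecord(source="third-party vendor", names=["Value Electrical"], fein="00-0000000")

print(shared_strong_ids(acord, vendor))  # ['fein']
```

Here the two records have different names, but a shared government identifier still ties them together.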
From there, we feed these identifiers into a pairwise, AI-driven comparison algorithm that takes two entities at a time and asks “are these the same?” This lets us build a graph that establishes the relationships between various entities, including entities that appear to be different but are actually the same.
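The idea of turning pairwise answers into a graph can be sketched in a few lines. In the example below, `same_entity` is a deliberately simple stand-in for the real comparison (it matches two records if they share a name), and the records are a simplified version of the Crystal Electricians example plus one hypothetical unrelated business. Matched pairs become edges, and each connected group of records is treated as one entity, here via a union-find structure.

```python
from itertools import combinations

# Record id -> set of normalized names (simplified, partly hypothetical data).
records = {
    "r1": {"crystal electricians"},
    "r2": {"crystal electricians", "value electrical"},  # the DBA record
    "r3": {"value electrical"},
    "r4": {"acme roofing"},  # hypothetical unrelated business
}

def same_entity(a: set[str], b: set[str]) -> bool:
    """Stand-in for the pairwise comparison: match if any name is shared."""
    return bool(a & b)

# Union-find: every record starts as its own group.
parent = {rid: rid for rid in records}

def find(x: str) -> str:
    """Return the representative of x's group, compressing the path as we go."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def union(x: str, y: str) -> None:
    """Merge the groups containing x and y."""
    parent[find(x)] = find(y)

# Each positive pairwise answer adds an edge to the graph.
for a, b in combinations(records, 2):
    if same_entity(records[a], records[b]):
        union(a, b)

clusters: dict[str, set[str]] = {}
for rid in records:
    clusters.setdefault(find(rid), set()).add(rid)

print(sorted(sorted(c) for c in clusters.values()))  # [['r1', 'r2', 'r3'], ['r4']]
```

Records r1 and r3 never match each other directly, yet they end up in the same entity because both match r2.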
Since we help underwriters evaluate so many submissions, businesses, drivers, vehicles, locations, and more, our algorithms have to be extremely efficient. To power this, we create what’s called a vector embedding to represent and summarize all the information for each entity. From there, we use vector search to narrow down the list of potential candidates before we run the pairwise comparison algorithm.
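Here is a toy sketch of the embed-then-search pattern. It assumes a very crude embedding (hashed character trigrams) and a brute-force cosine search, which are stand-ins chosen to keep the example self-contained. A production system would more likely use a learned embedding model and an approximate nearest-neighbor index. The `embed` and `top_k` helpers, the `DIM` size, and the corpus entries other than the Crystal Electricians and Value Electrical names are hypothetical.

```python
import math
import zlib
from collections import Counter

DIM = 1024  # hypothetical embedding size

def embed(text: str) -> list[float]:
    """Toy embedding: hash character trigrams into a fixed-size, unit-length vector."""
    padded = f"  {text.lower()} "
    counts = Counter(
        zlib.crc32(padded[i:i + 3].encode()) % DIM for i in range(len(padded) - 2)
    )
    norm = math.sqrt(sum(c * c for c in counts.values()))
    vec = [0.0] * DIM
    for idx, c in counts.items():
        vec[idx] = c / norm
    return vec

def top_k(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k corpus entries with the highest cosine similarity to the query."""
    q = embed(query)
    scored = [(sum(x * y for x, y in zip(q, embed(doc))), doc) for doc in corpus]
    return [doc for _, doc in sorted(scored, reverse=True)[:k]]

corpus = [
    "Crystal Electricians at 123 East Main Street",
    "Acme Roofing at 9 Harbor Road",
    "Value Electrical of New York City",
    "Blue Sky Daycare at 400 Oak Avenue",
]
query = "Crystl Elec. at 123 E Main St"

# Only the shortlisted candidates would go on to the pairwise comparison.
print(top_k(query, corpus, k=2))
```

The point of the shortlist is cost: the expensive pairwise check runs on a handful of candidates per entity instead of on every record in the dataset.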
Accuracy is extremely important for entity resolution - we saw what mistakes look like earlier - so we continually evaluate the accuracy of our approach with dedicated benchmarks based on real-world, ground truth data. In our most recent entity resolution tests, we achieved an industry-leading 99.6% accuracy, and we continuously look to improve this metric with additional model enhancements.
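The article does not spell out how the accuracy figure is computed, so the following is only a sketch of one common way to score an entity resolution run against ground truth: treat every pair of records as a yes/no decision and compare predicted pairs with true pairs. The `pairs` and `pairwise_scores` helpers and the tiny dataset are hypothetical.

```python
from itertools import combinations

def pairs(clusters):
    """All unordered record pairs that a clustering places in the same entity."""
    return {frozenset(p) for c in clusters for p in combinations(sorted(c), 2)}

def pairwise_scores(predicted, truth):
    """Pairwise accuracy, precision, and recall of predicted clusters vs. ground truth."""
    records = sorted({r for c in truth for r in c})
    total = len(records) * (len(records) - 1) // 2
    pred, gold = pairs(predicted), pairs(truth)
    tp = len(pred & gold)   # pairs correctly merged
    fp = len(pred - gold)   # pairs merged that should be separate
    fn = len(gold - pred)   # pairs kept separate that should be merged
    tn = total - tp - fp - fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 1.0,
        "recall": tp / (tp + fn) if tp + fn else 1.0,
    }

# Hypothetical ground truth and prediction: record "c" was wrongly left on its own.
truth = [{"a", "b", "c"}, {"d"}, {"e", "f"}]
predicted = [{"a", "b"}, {"c"}, {"d"}, {"e", "f"}]

print(pairwise_scores(predicted, truth))
```

In this toy case precision is 1.0 and recall is 0.5, while accuracy is 13/15. Because most record pairs are non-matches, it helps to read accuracy alongside precision and recall.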
Once the entity resolution is in place, the real fun begins. From there, Kalepa’s Copilot can power submission clearance, pull in third party data, and give underwriters an accurate picture of every submission.
Want to learn more about how Kalepa can power better entity resolution and better underwriting? Get in touch at the link below.