The Insurance Industry’s Data Blind Spot

AI, data, and analytics can be extremely powerful for improving underwriting. However, the impact of all three hinges on one often-overlooked challenge: entity resolution. Without it, your models will lead you in the wrong direction, and the results could be cataclysmic. But if you nail it, it can be a massive unlock for improving underwriting decisions and outcomes.

So what is entity resolution?

At a high level, it answers the question “how do I tell if this person/place/thing described here is the same as the person/place/thing over there?”

Sometimes this is easy, like noticing that “123 Main St” is the same as “123 Main Street”.

Other times, a company has changed its name and address, or a driver is listed by their middle name instead of their first name, or there are typos in one data source that mean key information doesn’t line up. Sometimes, this is a hard problem even for humans to figure out.

In 2025, most insurers are puzzling over their technology roadmap, and many leaders are chomping at the bit to take advantage of the promise of AI.

But they have to remember the classic adage of “garbage in, garbage out.” Not resolving entities correctly is the quintessential example of “garbage-in”.

So when it comes to complex, scattered, semi-structured data - like so much of what underwriters are dealing with - getting entity resolution wrong is a recipe for total technology disaster.

Let’s talk about how to get it right.

What is the entity resolution problem?

When information comes from many different places, it rarely lines up precisely.

Entity resolution helps solve this, so insurers can automatically combine information from many data sources and formats into one place. Without it, you can’t be sure if you’re looking at the same business, location, person, or vehicle across:

Submission emails
Submission documents (ACORD forms, SOVs, loss runs, MVRs, applications, etc.)
News articles
Legal filings
Permits
Government datasets (FBI crime scores, OSHA violations, FEMA CAT info, etc.)
Third party data vendors
And hundreds of other data sources

Without high quality entity resolution, you could easily be missing info thanks to DBAs, related corporate entities, name changes, and businesses located on the same premises. This means underwriters are reviewing wrong or incomplete information - and making the wrong decisions on clearing, pricing, policy language, exclusions, and quoting.

Poor entity resolution is extremely costly. If you’ve ever cleared the same submission to multiple brokers, you know what we mean.

On the flip side, high quality entity resolution can be a massive unlock for accurate, efficient underwriting decisions.

This is the key reason Kalepa can perform automated submission clearing, or pull in thousands of third party data points on a business, or detect inconsistencies across data sources.

Entity resolution has helped us sort through many hidden exposures, including identifying a carwash that was also operating a daycare, a restaurateur who had previously burned down two restaurants, a chameleon carrier that popped up after too many FMCSA violations, and a contractor whose permits revealed they were doing asbestos remediation.

Why is entity resolution hard?

Let’s start with an example. These businesses are all the same:

Crystl Elec. at 123 E Main St, NYC
Crystal Electricians at 123 East Main Street, New York, NY
Crystal Electricians DBA Value Electrical at 123 Main St, NY, NY
Value Electrical of New York City

Most of this is obvious to humans, but even so, there is some ambiguity. Looking at only the first and last entries in the list, it’s not at all clear that “Crystl Elec.” is the same business as “Value Electrical”, especially without exact addresses.
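To make the example concrete, here is a minimal Python sketch of the simplest layer of matching: normalize the text, then compare strings. The abbreviation table and the helper names (ABBREVIATIONS, normalize, similarity) are hypothetical and only illustrate the idea; this is not Kalepa's method, and a production system would need far more than string similarity.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table, for illustration only.
# A real table would be much larger and context-aware.
ABBREVIATIONS = {
    "st": "street",
    "e": "east",
    "elec": "electricians",
    "nyc": "new york ny",
}

def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and expand known abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)

def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], computed after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

# The two address spellings collapse to the same normalized string.
same_address = normalize("123 E Main St, NYC") == normalize(
    "123 East Main Street, New York, NY"
)

# A typo and an abbreviation still score high...
close = similarity("Crystl Elec.", "Crystal Electricians")
# ...but the DBA name scores much lower, even though it is the same business.
far = similarity("Crystl Elec.", "Value Electrical of New York City")
```

String similarity handles typos and abbreviations reasonably well, but it cannot connect “Crystl Elec.” to “Value Electrical”. That link only appears once other identifiers and records are brought in.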

More broadly, some of the specific challenges with entity resolution include:

Data Variations: Inconsistent data, like misspellings or typos
Missing Data: Incomplete entries missing key context
Multiple Identifiers: DBAs or abbreviations
Ambiguity: Cases where even a human would not know
Scalability: Comparing all records in a large dataset is computationally expensive and time-consuming
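The scalability point is easy to quantify. Comparing every record against every other record means n(n-1)/2 comparisons, which grows quadratically. A quick sketch (pair_count is a hypothetical helper name):

```python
def pair_count(n: int) -> int:
    """Number of unordered pairs among n records: n * (n - 1) / 2."""
    return n * (n - 1) // 2

# Growth is quadratic: 1,000x more records means roughly 1,000,000x more pairs.
small = pair_count(1_000)       # 499,500 comparisons
large = pair_count(1_000_000)   # 499,999,500,000 comparisons
```

At a million records, brute-force comparison is already around half a trillion pairs, which is why the candidate list has to be narrowed before any expensive comparison runs.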

How does Kalepa approach entity resolution for insurance data?

We’ve thought about this a lot. And based on years of continual refinement, here’s our high-level playbook for insurance data entity resolution.

First, we start with our identifiers: the things we know about an entity. These include names, addresses, coordinates, email addresses, websites, phone numbers, and government or third-party identifiers (FEINs, USDOTs, DUNS, etc.). Automatically extracting these from submission documents and third-party data sources gives us a massive data lake to draw from.

From there, we feed these identifiers into a pairwise, AI-driven comparison algorithm that takes two entities at a time and asks “are these the same?” This lets us build a graph that establishes the relationships between various entities, including entities that appear to be different but are actually the same.
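As a rough illustration of the pairwise-plus-graph idea, the sketch below links records with a union-find structure so that matches chain together. Everything here is an assumption made for the example: the records and identifier values are made up, and same_entity is a simple shared-identifier rule standing in for the AI-driven comparison described above, not Kalepa's actual model.

```python
from itertools import combinations

# Made-up records for illustration; identifier values are placeholders.
records = [
    {"id": 0, "name": "Crystl Elec.", "phone": "555-0100", "fein": None},
    {"id": 1, "name": "Crystal Electricians", "phone": "555-0100", "fein": "12-3456789"},
    {"id": 2, "name": "Value Electrical", "phone": None, "fein": "12-3456789"},
    {"id": 3, "name": "Main St Bakery", "phone": "555-0199", "fein": "98-7654321"},
]

def same_entity(a: dict, b: dict) -> bool:
    """Stand-in for a learned pairwise model: match on any shared identifier."""
    return any(a[key] is not None and a[key] == b[key] for key in ("phone", "fein"))

def resolve(records: list) -> list:
    """Group record ids into clusters using union-find over pairwise matches."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x: int) -> int:
        # Follow parent links to the cluster root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(records, 2):
        if same_entity(a, b):
            parent[find(a["id"])] = find(b["id"])

    clusters = {}
    for r in records:
        clusters.setdefault(find(r["id"]), []).append(r["id"])
    return sorted(clusters.values())

clusters = resolve(records)
```

Records 0 and 2 share no identifier directly, yet they end up in the same cluster because each matches record 1. That transitive link is what the graph provides: “Crystl Elec.” and “Value Electrical” get connected through an intermediate record.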

Since we help underwriters evaluate so many submissions / businesses / drivers / vehicles / locations / etc., our algorithms have to be extremely efficient. To power this, we create what’s called a vector embedding to represent and summarize all the information for each entity. From there, we use vector search to narrow down the list of potential candidates before we run the pairwise comparison algorithm.
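Here is a toy version of that narrowing step. It assumes a very crude embedding (counts of character trigrams) and an exhaustive cosine-similarity ranking. Real systems would typically use learned embeddings and an approximate nearest-neighbor index instead, and the function names (embed, cosine, top_candidates) and the sample corpus are hypothetical.

```python
import math
from collections import Counter

def embed(text: str, n: int = 3) -> Counter:
    """Toy embedding: character n-gram counts, standing in for a learned vector."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[gram] * v[gram] for gram in u.keys() & v.keys())
    norm_u = math.sqrt(sum(count * count for count in u.values()))
    norm_v = math.sqrt(sum(count * count for count in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def top_candidates(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k corpus entries most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(corpus, key=lambda text: cosine(query_vec, embed(text)), reverse=True)
    return ranked[:k]

# Made-up corpus for illustration.
corpus = [
    "Crystal Electricians 123 East Main Street New York",
    "Main Street Bakery 125 East Main Street New York",
    "Crystal Electricians DBA Value Electrical 123 Main St",
    "Harbor Freight Logistics 9 Dock Road Newark",
]

candidates = top_candidates("Crystl Elec 123 E Main St NYC", corpus, k=2)
```

Only the shortlisted candidates then go through the more expensive pairwise comparison, so the cost per new record depends on k rather than on the size of the whole dataset.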

Accuracy is extremely important for entity resolution - we saw what mistakes look like earlier - so we continually evaluate the accuracy of our approach with dedicated benchmarks based on real-world, ground truth data. In our most recent entity resolution tests, we achieved an industry-leading 99.6% accuracy, and we continuously look to improve this metric with additional model enhancements.

Once the entity resolution is in place, the real fun begins. From there, Kalepa’s Copilot can power submission clearance, pull in third party data, and give underwriters an accurate picture of every submission.

Want to learn more about how Kalepa can power better entity resolution and better underwriting? Get in touch at the link below.
