Why Foundational Data Integrity and Provenance Are Critical to Safe, Scalable AI in Insurance

Almost every insurer now has an AI strategy. Many have multiple pilots underway across claims handling, fraud detection, underwriting support, customer service, and report automation. Boards are asking where AI can reduce costs. Regulators are asking whether firms understand the risks. Vendors are promising transformation at extraordinary speed.

However, too much of the AI conversation has focused on model risk: hallucinations, drift, explainability, bias, governance, and whether Generative AI can be trusted to keep data private in this highly regulated market. These are important concerns, but they are not the first problem most insurers building AI systems need to solve.

Why fixing data quality alone is not enough

Insurers often talk about improving data quality as part of AI system success, but the challenge is much broader than cleansing inaccurate records.
The deeper issue is structural coherence. Many insurance data environments were originally designed for deterministic administrative processing rather than probabilistic AI systems. Traditional systems were built to issue policies, record claims, calculate premiums, and support accounting and compliance.

They were not designed to support continuously learning systems operating across fragmented workflows and multiple data sources. This creates a structural mismatch between modern AI ambitions and legacy operational architecture.

As a result, insurers can end up trying to run advanced AI initiatives on top of data ecosystems that lack:

Authoritative customer records
Stable schemas
Event-level process visibility
Consistent metadata
Robust lineage
Governed access controls.

No amount of model sophistication fully compensates for those weaknesses.

Mind the ‘data reality gap’ before moving from the lab into production

The most common reason that AI Proof of Concept studies fail to make it into production is the ‘data reality gap’. In a PoC environment, models are trained on static, sanitised and carefully curated datasets. However, when moved into production, these models are suddenly exposed to much messier, more fragmented and incomplete real-time data streams. The models tend to break because the data infrastructure cannot sustain the quality required.

Consequently, an AI model which performs fantastically ‘in the lab’ starves in production because it simply cannot reliably access clean, unified data. Thus, promised insights and efficiencies often fail to materialise.

Insurers’ structural data problem

This problem is highly pronounced in the insurance sector, where carriers and MGAs alike tend to be heavily burdened by decades of ‘technical debt’. Essentially, core legacy data lives in separate mainframes. There are separate systems for claims, underwriting, and policy administration that rarely talk to each other. We increasingly see insurers discovering that what appeared to be an AI challenge is actually a structural data problem.

Insurance is a highly data-intensive industry, but that does not necessarily mean it is data-coherent. Most insurers evolved over decades through a combination of:

Product-line expansion
Mergers & Acquisitions
Outsourced administration arrangements
Regional operational differences
Legacy policy administration systems
Claims platform layering.

The result is that many insurers today operate with multiple competing versions of the same customer, policy or claim, all spread across disconnected systems. Customer information may exist differently across claims, underwriting, CRM and finance systems. Claims handlers often rely on free-text notes, email chains and PDFs to complete operational workflows. Key process steps may occur outside core systems altogether through spreadsheets, side processes or manual escalation routes.

Building in data structure and context

In these environments, the issue is not necessarily that the AI is ‘wrong’. It is that the AI lacks sufficiently reliable context to operate safely at scale. This can create a dangerous illusion inside organisations: models appear functional in pilot environments but become unstable once exposed to operational complexity.

That instability often manifests as:

High manual override rates
Inconsistent customer outcomes
False positives in fraud detection
Explainability problems during audit or compliance reviews
Reduced trust from claims handlers and underwriters, so that AI rollouts stall at pilot stage.

Yet insurers sit on a goldmine of unstructured data: PDFs, handwritten claim forms, doctors’ notes and email chains. To make these readable by AI systems, insurers need to implement Intelligent Document Processing (IDP) at the very edge of their ingestion pipelines.
Instead of storing a PDF of a claim in a shared drive, they should use OCR (Optical Character Recognition) combined with NLP (Natural Language Processing) to extract data ‘entities’ (e.g. names, dates, policy numbers and loss amounts) at the point of receipt. Furthermore, insurers need to adopt metadata tagging. Every piece of data entering the system must be tagged to give it the all-important context: which system it came from, who uploaded it, event dates, level of confidentiality, and so on. Shifting from unstructured file storage to structured, queryable data is a non-negotiable for AI production readiness.
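
As a minimal sketch of extraction and tagging at the point of receipt, the Python below takes text that has already been through OCR, pulls out a few entities, and attaches context metadata. It is illustrative only: simple regular expressions stand in for a real NLP entity extractor, and the policy-number format (`POL-` plus six digits), the field names and the `ingest` function are all assumptions, not any particular product's interface.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical patterns: real policy-number, date and amount formats vary by insurer.
POLICY_RE = re.compile(r"\bPOL-\d{6}\b")
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
AMOUNT_RE = re.compile(r"£\s?(\d[\d,]*(?:\.\d{2})?)")


@dataclass
class IngestedDocument:
    entities: dict
    metadata: dict


def ingest(text, source_system, uploaded_by, confidentiality):
    """Extract entities from OCR'd text and tag the result with context metadata."""
    entities = {
        "policy_numbers": POLICY_RE.findall(text),
        "dates": DATE_RE.findall(text),
        # Strip thousands separators before converting to a number.
        "loss_amounts": [float(a.replace(",", "")) for a in AMOUNT_RE.findall(text)],
    }
    metadata = {
        "source_system": source_system,
        "uploaded_by": uploaded_by,
        "confidentiality": confidentiality,
        "received_at": datetime.now(timezone.utc).isoformat(),
    }
    return IngestedDocument(entities=entities, metadata=metadata)


doc = ingest(
    "Claim under policy POL-123456, loss on 2024-03-15, estimated at £1,250.00.",
    source_system="claims-inbox",
    uploaded_by="handler-42",
    confidentiality="restricted",
)
print(doc.entities)
print(doc.metadata["source_system"], doc.metadata["confidentiality"])
```

The point of the design is that the structured record and its context are created together, so nothing reaches downstream systems without a known origin.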

Enabling data provenance and lineage

One of the least discussed issues in insurance AI is provenance. Insurers must increasingly be able to answer these questions:

Where did this data originate?
Which systems contributed to this decision?
What transformations occurred along the way?
Which version of the data was used?
Was human judgement involved?
Can the decision path be reconstructed later?

These questions become particularly important in claims, fraud and underwriting environments where decisions may later face challenges from the customer, regulator, or even the courts.

AI frameworks can support insurers by enabling ‘glass-box’ architectures through the automatic embedding of metadata capture within data ingestion pipelines. So, if an AI system makes a recommendation or a chatbot provides an answer to a customer query, the frameworks create an immutable audit trail supporting that response.

Under this approach, insurers can trace outputs back through the system, identifying the specific LLM used, the exact context provided, and the original source document it was pulled from. This ‘glass box’ transparency is what insurers need to ensure regulatory compliance. It also gives human teams the confidence to adopt AI technology.
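
One way to picture such an audit trail is a hash-chained log, sketched below in Python. This is an assumption about how a trail could be built, not a description of any specific framework: the `AuditTrail` class, its field names and the example model identifier are hypothetical. Note that a hash chain makes tampering detectable rather than impossible; genuine immutability would also need write-once storage.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(payload):
    """Stable SHA-256 digest of a JSON-serialisable dict."""
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode("utf-8")).hexdigest()


class AuditTrail:
    """Append-only log in which each entry commits to the one before it."""

    def __init__(self):
        self.entries = []

    def record(self, model_id, context, source_document, output):
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        body = {
            "model_id": model_id,
            "context": context,
            "source_document": source_document,
            "output": output,
            "prev_hash": prev_hash,
        }
        entry = dict(body, hash=_digest(body))
        self.entries.append(entry)
        return entry

    def verify(self):
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev_hash = GENESIS_HASH
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True


trail = AuditTrail()
trail.record("example-llm-v1", "Policy wording, section 4", "policy_123.pdf", "Escape of water is covered")
trail.record("example-llm-v1", "Claim notes", "claim_987.pdf", "Refer to a handler")
intact = trail.verify()

# Simulate someone quietly editing an earlier answer after the fact.
trail.entries[0]["output"] = "Escape of water is not covered"
tamper_detected = not trail.verify()
print(intact, tamper_detected)
```

Each entry stores the model, the context supplied and the source document, which are the three things a reviewer needs to reconstruct the decision path later.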

Without strong lineage and provenance controls, AI outputs become difficult to defend even when they are operationally useful. This is particularly relevant as regulators become more focused on AI governance. The recent FCA review of AI in Financial Services led by Sheldon Mills is examining how AI may affect markets, consumers and regulatory oversight over the coming years. Importantly, this examination extends to data governance, transparency and operational accountability.

Operational observability

To scale safely, observability must be seen as a key element, not an afterthought. There are three core steps to AI observability success:

Implement Data Quality Gates: Before data ever touches a model in production, automated checks must validate its schema, completeness, and formatting. If bad data enters the pipeline, it should trigger an alert rather than silently passing through to generate a bad prediction.

End-to-End Telemetry: We need to track the entire lifecycle of a request. This means logging not just the model’s output, but exactly what data it consumed, what version of the model processed it, and how long it took.

Decouple Infrastructure from Logic: Ensure that the data pipelines and the AI models are modular. If a model starts acting up, you need the observability tools to instantly isolate whether it’s a model degradation issue, or an upstream data pipeline failure.
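
The first of these steps, the data quality gate, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the required fields, their types and the `DataQualityError` exception are hypothetical, and a production gate would typically raise an alert in a monitoring system rather than only raising an exception.

```python
from datetime import date

# Hypothetical schema for an incoming claim record.
REQUIRED_FIELDS = {
    "claim_id": str,
    "policy_number": str,
    "loss_date": str,
    "loss_amount": float,
}


class DataQualityError(Exception):
    """Raised when a record fails the gate, so bad data cannot pass silently."""


def check_record(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    # Schema and completeness: every required field present, with the right type.
    for name, expected_type in REQUIRED_FIELDS.items():
        value = record.get(name)
        if value is None or value == "":
            problems.append(f"missing field: {name}")
        elif not isinstance(value, expected_type):
            problems.append(f"wrong type for {name}: expected {expected_type.__name__}")
    # Formatting: the loss date must parse as an ISO date.
    loss_date = record.get("loss_date")
    if isinstance(loss_date, str) and loss_date:
        try:
            date.fromisoformat(loss_date)
        except ValueError:
            problems.append("loss_date is not an ISO date (YYYY-MM-DD)")
    # Basic sanity check on the amount.
    amount = record.get("loss_amount")
    if isinstance(amount, float) and amount < 0:
        problems.append("loss_amount is negative")
    return problems


def quality_gate(record):
    """Pass the record through unchanged, or stop it before it reaches the model."""
    problems = check_record(record)
    if problems:
        raise DataQualityError("; ".join(problems))
    return record


good = {"claim_id": "C-1", "policy_number": "POL-123456", "loss_date": "2024-03-15", "loss_amount": 1250.0}
bad = {"claim_id": "C-2", "policy_number": "", "loss_date": "15/03/2024", "loss_amount": -10.0}

print(check_record(good))
print(check_record(bad))
```

Keeping the gate as a separate function from the model call also supports the third step: when a prediction looks wrong, the gate's record of failures shows whether the fault lies upstream in the data.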

AI system health monitoring

Before scaling, insurers need to actively monitor three distinct layers of the AI system to ensure it is working well and producing accurate results:

Data Drift and Feature Drift: Is the incoming production data statistically similar to the data the model was trained on? In insurance, macro-economic changes, such as inflation affecting repair costs, can cause data drift which renders a pricing or claims model instantly inaccurate.

Model Confidence and Prediction Distributions: Monitor the confidence scores of the AI outputs. If the model is suddenly producing a high volume of low-confidence predictions, or if the distribution of its outputs skews heavily (e.g., automatically flagging 80 per cent of claims as fraudulent instead of the usual five per cent), an alert must be triggered.

Infrastructure Health: Standard latency, throughput, and error rates. If the AI takes too long to parse a 100-page medical record for a bodily injury claim, it creates a bottleneck that prevents scalability.
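
For the first layer, one commonly used drift measure is the Population Stability Index (PSI), sketched below in Python. The article does not prescribe a particular metric, so PSI is our choice for illustration; the repair-cost figures are synthetic, and the 30 per cent inflation uplift is an invented example. A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but that threshold is a convention rather than a standard and should be tuned per model.

```python
import math


def psi(expected, actual, bins=10):
    """Population Stability Index between a training sample and a production sample."""
    lo, hi = min(expected), max(expected)
    # Equal-width bins over the training range; fall back to 1.0 if all values are equal.
    width = (hi - lo) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the training range are clamped into the end bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids taking the logarithm of zero for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Synthetic repair costs seen in training, and the same costs after a 30% uplift.
training_costs = [1000 + 10 * i for i in range(100)]
inflated_costs = [cost * 1.3 for cost in training_costs]

stable_score = psi(training_costs, training_costs)
drift_score = psi(training_costs, inflated_costs)
print(stable_score, drift_score)
```

The same function can be pointed at the model's output scores instead of its inputs, which gives a simple check on the second layer, the prediction distribution.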

Why insurers need a ‘Data First’ AI Strategy

The insurers seeing the most sustainable value from AI are typically not the ones deploying the most experimental models. They are the ones investing early in data consolidation, governance and operational clarity.
This usually involves:

Establishing trusted ‘gold record’ entities
Consolidating customer and policy views
Reducing duplication across systems
Improving process visibility
Introducing stronger governance around data ownership
Ensuring lineage and provenance are fully traceable
Designing workflows where AI supports decision-making.

Only once these foundations are in place does scalable automation become realistic. This is particularly true in delegated authority and MGA environments, where carriers increasingly expect stronger operational oversight and more consistent data reporting from underwriting partners.
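
To illustrate the ‘gold record’ idea from the list above, the Python sketch below merges several versions of the same customer into one consolidated view, keeping a note of which system supplied each field. It rests on several assumptions: the records have already been matched to the same customer (entity resolution is a separate problem), the source names and their order of precedence are hypothetical, and all customer details are invented.

```python
# Hypothetical order of trust: earlier sources win when systems disagree.
SOURCE_PRIORITY = ["policy_admin", "claims", "crm"]


def build_gold_record(records):
    """Merge per-system records into one view, with field-level lineage."""
    ranked = sorted(records, key=lambda r: SOURCE_PRIORITY.index(r["source"]))
    gold, lineage = {}, {}
    for rec in ranked:
        for key, value in rec.items():
            # Skip the bookkeeping field and any empty values.
            if key == "source" or value in (None, ""):
                continue
            # The first (most trusted) non-empty value for a field is kept.
            if key not in gold:
                gold[key] = value
                lineage[key] = rec["source"]
    return gold, lineage


records = [
    {"source": "crm", "customer_id": "C-1001", "name": "J. Smith",
     "email": "j.smith@example.com", "postcode": ""},
    {"source": "policy_admin", "customer_id": "C-1001", "name": "Jane Smith",
     "email": None, "postcode": "NE1 4ST"},
    {"source": "claims", "customer_id": "C-1001", "name": "Jane Smyth",
     "phone": "0191 498 0123"},
]

gold, lineage = build_gold_record(records)
print(gold)
print(lineage)
```

Recording lineage per field means the consolidated view stays traceable: anyone questioning a value can see which system it came from.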

Walking before running

There is a growing recognition across the insurance market that AI adoption is not purely a technology challenge. It is about operational maturity. The insurers that succeed will not necessarily be those that adopt AI fastest. They are more likely to be the ones that develop the strongest data integrity, governance and operational discipline beneath their AI initiatives. Safe and scalable AI does not begin with managing the model. It begins with ensuring the quality, consistency and provenance of the data your selected model depends on.

About the author

Seb Kirk is the CEO and Co-Founder of GaiaLens. He is a Chartered Financial Analyst with a mission to bring clarity and focus to complex systems with the assistance of AI, turning fragmented data into insight businesses can trust.

Seb worked as a Financial Analyst at a boutique corporate finance firm for five years, where he specialised in facilitating transactions in the sustainable and ethical food production industry. Seb holds an MSc (Distinction) in Data Science from City, University of London, and a BSc (Hons) degree in Natural Sciences from Newcastle University.

A regular commentator on issues surrounding ESG, Data Science, Digital Transformation and AI, Seb remains dedicated to bringing clarity and transparency, not only to sustainability, but also to wider conversations on compliance, document intelligence, and risk, to better align decisions and organisational goals with financial success.