Insurance Fraud Is Winning, Most AI Is Fighting the Wrong War

This article is by Reuben John, CEO, True Aim

UK insurers detected £1.16bn in fraudulent claims in 2024, the latest in a decade-plus run above a billion. The largest single category – at £466m – was not staged crashes or organised rings. It was exaggerated loss: padded household lists, inflated repair invoices, altered medical bills, the dull paperwork-level lying that goes on at scale. According to some estimates, for every pound of fraud the industry catches, two pounds slip past undetected.

That is the actual shape of the problem, but it is not the shape the industry’s AI investment has been built around.

The Fraud Headline Hides A Bigger Number

Deloitte’s 2025 fraud-AI analysis puts soft fraud at 60% of incidents, with current detection rates between 20% and 40%. Hard fraud, the staged and premeditated end, is the other 40%. The volume is in the soft band. The detection gap is in the soft band. So is most of the under-counted pound value.

EY’s claims practice puts total leakage – fraud plus error plus scope creep – at 7% to 14% of carrier claims spend. Aviva alone disclosed that motor damage and credit-hire fraud is up 275% since 2021, driven by exaggerated repair invoices and inflated hire bills. Yet the trade-press conversation on insurance AI for the past two years has been almost entirely about image-AI for staged accidents and graph analytics for organised rings.

Useful but not enough.

What Most Carriers’ AI Is Actually Built To Catch

In my review of thirteen publicly named and sourced carrier AI deployments for claims-fraud detection over the past three years, only four lead with document-content analysis: Aviva’s special investigations unit, Zurich Switzerland’s German-language pilot, Swiss Re Corporate Solutions’ ClaimsGenAI, and NFU Mutual via Synectics’ SynDOC layer. The remaining nine appear to be image-based, network-based, voice-based or hybrid. Image-AI got the press; document-AI got the gap.

The carriers who do publish on document tampering are blunt about what they see. Aviva’s Peter Ward describes “a steady rise in manipulated images and documents supporting opportunistic claims.” Zurich UK’s Scott Clayton, on the same theme: “I never thought it would happen in my career, that a document or an invoice or a photograph in front of you is entirely fictitious.”

Not one major consulting firm seems to have sized fraud leakage by detection modality. Carriers writing cheques to fraud-AI vendors are buying without knowing where the leakage truly sits.

Generative AI Has Joined The Fraudsters’ Side

The arms race shifted in 2023. Allianz UK reported that “cases where apps were used to distort real-life images, videos and documents increased by 300 per cent” between 2022 and 2023. Admiral’s 2025 detected fraud rose 71 per cent on the prior year, partly attributed to AI-generated supporting evidence. BDO’s May 2026 FraudTrack report opens on the same theme: fraud is “becoming more technologically enabled, more adaptive and more resilient.”

The honest read of the defender-side academic literature is that we are losing. On the CVPR 2023 DocTamper benchmark – the field’s reference test for document-tampering detection – detectors reach F1 scores near 0.91 in-domain. The same class of detector drops to an F1 of around 0.04 when tested out-of-distribution on a different receipt dataset. A 2026 study of diffusion-model edits to financial documents, AIForge-Doc, found that the best zero-shot detector scored an AUC of 0.75 against modern generative tampering.

Vendor Numbers Don’t Survive The Cross-Domain Test

This is the bit a carrier procurement team should treat as a red-flag list.

A model trained on Chinese receipts will not generalise to a Munich body-shop invoice. A detector trained on English-language US documents will struggle on a German-language quote variant or an Italian medical bill. There is, at the time of writing, no peer-reviewed benchmark of document-fraud detection on European insurance documents in the languages European carriers actually run on. That is a publication gap. It is also a vendor gap. The two need to be priced separately at procurement.

The defensible position for a 2026 buyer: stop accepting in-distribution accuracy figures as evidence of production capability. Demand cross-domain evaluation in the languages of your book. Demand performance against AI-doctored documents, not just clean-scan templates, and recognise that training a fraudulent invoice generator now costs less than training the detector that catches it.
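For a procurement team, the cross-domain demand can be made concrete with very little code. What follows is a minimal sketch, not any vendor's or benchmark's method: it assumes the buyer holds a labelled test set in which each document is tagged with a slice name (language plus document type), a ground-truth tampering label and the detector's flag. The slice names, the toy records and the 0.7 floor are all invented for illustration.

```python
from collections import defaultdict


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the score is undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_by_slice(records):
    """Compute F1 separately for each slice.

    records: iterable of (slice_name, is_tampered, flagged) tuples.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for slice_name, is_tampered, flagged in records:
        c = counts[slice_name]
        if flagged and is_tampered:
            c["tp"] += 1
        elif flagged and not is_tampered:
            c["fp"] += 1
        elif is_tampered:
            c["fn"] += 1
        # True negatives do not enter the F1 formula.
    return {name: f1_score(**c) for name, c in counts.items()}


def slices_below(scores, minimum):
    """Names of slices whose F1 falls under the buyer's chosen floor."""
    return sorted(name for name, score in scores.items() if score < minimum)


# Hypothetical toy data: (slice, ground truth tampered?, detector flagged?)
records = [
    ("en-receipt", True, True),
    ("en-receipt", True, True),
    ("en-receipt", False, False),
    ("en-receipt", False, True),
    ("de-invoice", True, False),
    ("de-invoice", True, False),
    ("de-invoice", True, True),
    ("de-invoice", False, True),
]

scores = f1_by_slice(records)
weak = slices_below(scores, minimum=0.7)  # illustrative floor, not a standard
print(scores, weak)
```

The point of reporting per slice rather than one pooled figure is that a strong result on the vendor's training-like documents cannot mask a weak result on the languages and document types a carrier actually handles.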

The Regulators Are Already There

Most commentary on this misses two points. First, the EU AI Act carves fraud-detection systems out of its high-risk regime under Annex III point 5(b), as explained in Recital 58. Second, the remaining claims-relevant high-risk obligations, including life and health insurance pricing and risk assessment under point 5(c), are set to be deferred from August 2026 to 2 December 2027 under the Digital Omnibus political agreement reached in May 2026, which is still awaiting formal adoption.

The pressure carriers actually face is more enforceable. FINMA’s Guidance 08/2024, in force since December 2024 in Switzerland, places insurance claims determinations in the highest tier of explainability: the supervisor wants to know which inputs drove a score, how the model was validated, and who reviewed the output. The EIOPA Opinion of August 2025 explicitly covers claims management and fraud detection across the European insurance value chain. The NAIC Model Bulletin is now adopted in twenty-four US states. GDPR Article 22 already gives every European claimant the right to demand human review of a solely automated decision that affects them. The March 2026 Lokken v UnitedHealth discovery order is the canary: courts will compel algorithm documentation, and “the vendor said it was accurate” does not survive subpoena.

The discipline that catches the boring, paperwork-level fraud – flagging, citing the rule, naming the human in the loop, producing a record a regulator can read – is the same discipline the regulators have already started asking for.
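What such a record might look like is easy to sketch. The example below is an illustration under stated assumptions, not a schema prescribed by FINMA, EIOPA or any other supervisor: every field name, the `FraudFlagRecord` class and the sample values are hypothetical. It simply captures the four things named above (the flag, the rule cited, the human in the loop, and a readable record) in one immutable structure.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional, Tuple


@dataclass(frozen=True)
class FraudFlagRecord:
    """One auditable fraud flag. All field names are hypothetical."""

    claim_id: str
    score: float                  # model output that triggered the flag
    rule_cited: str               # the rule or policy clause relied on
    top_inputs: Tuple[str, ...]   # inputs that drove the score
    model_version: str
    validation_ref: str           # pointer to the validation evidence
    reviewer: Optional[str] = None  # named human in the loop, if any
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def is_solely_automated(self) -> bool:
        """True when no human reviewer has been recorded."""
        return self.reviewer is None

    def to_json(self) -> str:
        """Serialise to a stable, human-readable JSON string."""
        return json.dumps(asdict(self), sort_keys=True, indent=2)


# Hypothetical example values, for illustration only.
flag = FraudFlagRecord(
    claim_id="CLM-0001",
    score=0.87,
    rule_cited="Invoice total exceeds itemised sum",
    top_inputs=("invoice_total", "line_item_sum"),
    model_version="demo-0.1",
    validation_ref="validation-report-demo",
)
reviewed = FraudFlagRecord(**{**asdict(flag), "reviewer": "A. Handler"})
print(reviewed.to_json())
```

Keeping the reviewer as an explicit field makes a solely automated flag easy to spot before it turns into a decision, and a frozen record means any later change has to be written as a new record rather than an overwrite.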

Build for the volume. Build for the audit trail. Build for an arms race the defenders have not yet won.

Reuben John is CEO and Co-founder of True Aim, a Swiss insurtech building auditable AI for claims decisioning.