The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.
AI that extracts data from documents, forms, and handwritten materials using OCR and intelligent processing. Includes template-free extraction and handwriting recognition; distinct from multimodal document understanding which handles complex layouts and diagrams requiring vision-language models. Scope covers ML/AI-powered extraction and recognition; traditional template-based OCR and manual data entry are out of scope.
Intelligent document processing has crossed the threshold from promising technology to proven operational capability. ML-powered extraction from documents, forms, and handwritten materials (using OCR, NLP, and increasingly LLM-based reasoning) now runs in production across cloud platforms from Microsoft, Google, and AWS, with GA tooling, competitive pricing, and analyst-validated ROI. Gartner's inaugural Magic Quadrant for IDP (April 2026) names five leaders (ABBYY, Hyperscience, Infrrd, Tungsten Automation, and UiPath), confirming ecosystem maturity. An AIIM survey of 600 enterprises found 78% operational with AI document automation, and the IDP market reached $8B in 2024. The practice question has shifted from "does it work" to "how to roll it out", though that rollout is harder than vendors suggest. Core extraction capabilities are commoditized, and the market is inflecting toward agentic orchestration and decision-acceleration. Yet accuracy degrades sharply on handwriting, non-Latin scripts, and edge-case layouts, and production reliability incidents recur across platforms. The gap between pilot success and scaled deployment remains wide: 20-40% of real-world documents fall outside standard templates, requiring human-in-the-loop (HITL) architecture and enterprise-grade preprocessing. The frontier is now agentic processing and generative AI integration, but the binding constraint remains the same one it has been since 2017: field-level accuracy under real-world conditions, with silent corruption risk and benchmark-to-production gaps.
The market is consolidating around agentic and LLM-powered extraction, displacing the template-based systems that defined the previous generation. Gartner reports that 67% of enterprises are now evaluating agentic document processing, up from 23% two years prior. Platform vendors are shipping production-grade agentic architectures: AWS published a reference architecture for multi-agent orchestration via Bedrock AgentCore (April 2026) with graph-based workflows, dual-path routing (known documents via Textract, complex or handwritten documents via Bedrock), and serverless deployment. Azure Document Intelligence v4.0 GA shipped searchable PDF output and incremental classification training in February 2026. UiPath's acquisition of WorkFusion signals vendor consolidation toward vertical AI platforms rather than horizontal extraction tools. An Everest Group analyst assessment identifies commoditization of core OCR and extraction and a market inflection toward orchestration and decision-acceleration, indicating that the extraction layer has matured.
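The dual-path routing idea described above can be sketched in a few lines. This is a minimal illustration, not the AWS reference architecture: the `Document` type, the `KNOWN_TYPES` set, and the two path labels are all hypothetical names, and the document type is assumed to come from an upstream classifier that is not shown.

```python
from dataclasses import dataclass

# Hypothetical path labels; no real service is called in this sketch.
STANDARD_PATH = "ocr_service"   # stands in for a Textract-style extractor
COMPLEX_PATH = "llm_service"    # stands in for a Bedrock-hosted model


@dataclass
class Document:
    doc_id: str
    doc_type: str         # assumed output of an upstream classifier
    is_handwritten: bool  # assumed output of an upstream detector


# Assumed set of layouts the cheaper path already handles well.
KNOWN_TYPES = {"invoice", "bank_statement", "purchase_order"}


def choose_path(doc: Document) -> str:
    """Send known, printed documents down the cheap path and the rest to the LLM path."""
    if doc.doc_type in KNOWN_TYPES and not doc.is_handwritten:
        return STANDARD_PATH
    return COMPLEX_PATH


if __name__ == "__main__":
    batch = [
        Document("d1", "invoice", False),
        Document("d2", "invoice", True),
        Document("d3", "site_inspection_note", False),
    ]
    for doc in batch:
        print(doc.doc_id, "->", choose_path(doc))
```

The design choice is cost control: the expensive model is reserved for documents the cheaper extractor is not expected to handle, which is the same logic behind the "smart cost routing" deployments cited in this section.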
Named deployments continue to deliver strong ROI at increasing scale. A KumoHQ-documented logistics firm cut processing time 87% (from 40+ hours to 5 hours weekly) at 94% accuracy with LLM-powered extraction and direct ERP integration, achieving a 3.5-month payback. AND Digital's IDP Accelerator achieved $2M in annual savings for a leading FinTech platform through smart cost routing (Textract for standard documents, Bedrock for complex ones). Esprigas processes 27,000 documents monthly with $73,800 in savings; Erewhon processes 20,000 invoices monthly with $45,000 in savings; Disney Trucking eliminated 6 FTEs entirely through automation. EY runs a tax processing pipeline at scale with hundreds of extractors mixing OCR, custom models, and generative augmentation. Google uses Document AI internally for sustainability report processing. Tungsten Automation (formerly Kofax) serves 25,000+ customers across 70+ countries, including 8 of the top 10 global banks and 7 of the top 10 insurers, confirming the breadth of enterprise-scale production deployment. Industry statistics confirm the economics: 60-80% cost reduction per document, 70-90% processing time improvement, and 6-18 month payback across lending, insurance, and BPO segments. However, a critical accuracy threshold emerges: 96-99% accuracy is required for viable ROI, and systems operating at 90-94% accuracy see their ROI undermined by manual correction loops.
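One way to see why a few points of field-level accuracy matter so much is to look at how errors compound across a document. The sketch below is illustrative only: it assumes every field fails independently and that a document has 10 fields, neither of which comes from the sources cited here.

```python
def touchless_rate(field_accuracy: float, fields_per_doc: int) -> float:
    """Share of documents with every field correct, assuming independent field errors."""
    if not 0.0 <= field_accuracy <= 1.0:
        raise ValueError("field_accuracy must be between 0 and 1")
    if fields_per_doc < 1:
        raise ValueError("fields_per_doc must be at least 1")
    return field_accuracy ** fields_per_doc


if __name__ == "__main__":
    FIELDS_PER_DOC = 10  # illustrative assumption, not a figure from the text
    for accuracy in (0.90, 0.94, 0.96, 0.99):
        rate = touchless_rate(accuracy, FIELDS_PER_DOC)
        print(f"{accuracy:.0%} field accuracy -> {rate:.0%} of documents need no correction")
```

Under these assumptions, 94% field accuracy leaves only about 54% of documents untouched by a human, while 99% leaves about 90%. Real error patterns are correlated rather than independent, so the numbers are a rough intuition, not a forecast.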
Progress on handwriting recognition is accelerating with frontier MLLMs: the latest models (Gemini 3.1, GPT-5.4, Claude Sonnet 4.6) achieve ~85% accuracy on structured handwritten medical forms with 90% weighted F1 scores, indicating a viable pathway toward automated handwritten document processing. These successes coexist with a deepening recognition of production barriers. Independent testing reveals that frontier LLMs (GPT-4o, Claude, Gemini) cluster their errors on scanned or low-resolution documents (6-8% failure rates) and multi-currency forms, with dangerously miscalibrated confidence on complex documents, which creates silent corruption risk. Practitioner analysis documents a benchmark-to-production gap: 55+ percentage points of performance variance across document types, and 20-40% of real-world documents falling outside standard templates and requiring HITL architecture. Specific failure modes persist: column-order collapse in multi-column layouts, table flattening with merged headers, semantic errors (the wrong field extracted, but schema-valid), and context truncation across pages. The DOJ and House Oversight Committee released over 3 million PDFs with non-functional OCR, rendering them unsearchable, a reminder that extraction at scale still breaks in ways that matter. Azure Document Intelligence experienced custom classification training jobs stuck at "notStarted" status for weeks. A Vertesia survey of 1,500 IT executives found 96.8% cite ECM vendor roadmaps as significant barriers to AI implementation. Manual correction loops still account for 40% of input management costs, per Parashift's analysis, even as AI reduces error rates from 4% to 0.5%. Enterprise IDP deployments require 3-12 months and $50k-500k+ in annual costs, with steep learning curves and a template maintenance burden, creating accessibility gaps for mid-market and SMB segments.
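Schema-valid but semantically wrong extractions are the kind of failure a validation layer is meant to catch. The following is a minimal sketch of such a layer for an invoice, assuming a hypothetical extraction result shaped as a dict with `line_items`, `total`, and `currency` keys; the field names and the two checks are illustrative, not taken from any vendor's product.

```python
from decimal import Decimal, InvalidOperation


def find_semantic_problems(invoice: dict) -> list[str]:
    """Return cross-field inconsistencies that a schema check alone would miss."""
    problems = []
    try:
        # Decimal avoids binary floating-point rounding on money amounts.
        line_sum = sum(
            (Decimal(item["amount"]) for item in invoice["line_items"]),
            Decimal("0"),
        )
        total = Decimal(invoice["total"])
    except (KeyError, TypeError, InvalidOperation) as exc:
        return [f"missing or unparseable amount: {exc!r}"]

    if line_sum != total:
        problems.append(f"line items sum to {line_sum}, but total is {total}")

    currency = invoice.get("currency", "")
    if not (len(currency) == 3 and currency.isalpha() and currency.isupper()):
        problems.append(f"currency {currency!r} is not a three-letter uppercase code")
    return problems


if __name__ == "__main__":
    extracted = {
        "line_items": [{"amount": "10.00"}, {"amount": "2.50"}],
        "total": "15.20",  # plausible-looking value that does not match the lines
        "currency": "EUR",
    }
    for problem in find_semantic_problems(extracted):
        print("flag for review:", problem)
```

Checks like these do not prove an extraction is right; they only surface records whose fields contradict each other, which is where silent corruption tends to hide.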
The economics work — but only with sophisticated process redesign, OCR-first architecture (not LLM-only), enterprise-grade preprocessing, confidence scoring, HITL routing, and accuracy-matching strategies most organisations have not yet undertaken.
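Confidence scoring and HITL routing can be sketched as a simple per-field threshold. This is an illustration under stated assumptions: the `ExtractedField` type, the 0.95 threshold, and the queue names are hypothetical, and the confidence value is assumed to be whatever score the extractor reports.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by the extractor (assumed)


REVIEW_THRESHOLD = 0.95  # illustrative; tune against a labelled sample


def route(fields: list[ExtractedField],
          threshold: float = REVIEW_THRESHOLD) -> tuple[str, list[str]]:
    """Return the target queue and the names of fields that triggered review."""
    low_confidence = [f.name for f in fields if f.confidence < threshold]
    queue = "human_review" if low_confidence else "auto_approve"
    return queue, low_confidence


if __name__ == "__main__":
    fields = [
        ExtractedField("invoice_number", "INV-0042", 0.99),
        ExtractedField("total", "1,204.50", 0.81),
    ]
    queue, flagged = route(fields)
    print(queue, flagged)
```

A threshold like this is only as good as the scores behind it. Given the miscalibrated confidence reported for complex documents, the threshold would need to be checked against human-verified samples rather than trusted as reported.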
— Lleverage case studies showing a manufacturing company reducing staffing from 4 FTEs to 1, with the error rate cut from 7% to 0.5% (€375k annual savings, 375% ROI), and AI-native automation outperforming traditional OCR.
— ABBYY customer outcomes: 99% KYC compliance, 40% efficiency gain, 92% touchless processing, 140+ hours saved monthly. Backed by Gartner Magic Quadrant, Everest Group PEAK, IDC MarketScape recognition.
— Academic benchmark on 7,093 high-difficulty samples across 5 OCR tracks finds that state-of-the-art LMMs exhibit substantial performance degradation in production, revealing a gap between benchmarks and real-world effectiveness.
— UK government trial of 20,000 civil servants found AI saved ~2 weeks per person annually (26 min/day), with potential £45B public sector savings on 1B citizen transactions, 84% assessed as automatable.
— Everest Group PEAK Matrix 2026 identifies 10 IDP leaders (ABBYY, EdgeVerve, HCL, Hyperscience, Infrrd, Microsoft, Nanonets, Rossum, Tungsten, UiPath) across a 32-vendor ecosystem, confirming market maturity.
— InduOCRBench research proves high OCR accuracy does not guarantee downstream RAG effectiveness on industrial documents; structural and semantic errors cause retrieval failures despite low character/word error rates.
— Koncile guide with five independent deployment case studies: 12k invoices (65% reduction, €40k savings), 4k payslips (70% time cut, 2 FTE freed), 30k claims (halved reimbursement time).
— Independent evaluation of GPT-4o, Claude, and Gemini on 120 real financial documents reveals error clustering on scanned or low-resolution documents (6-8% failure) and multi-currency documents (6-11 errors), along with dangerously miscalibrated confidence, requiring a validation layer for production safety.