Perly Consulting │ Beck Eco

The State of Play

A living index of AI adoption across industries — where established practice meets the bleeding edge
UPDATED DAILY

The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.

The Daily Dispatch

A daily newsletter distilling the past two weeks of movement in a domain or two — delivered to your inbox while the index updates in the background.


BLEEDING EDGE

⌨️ SOFTWARE ENGINEERING
✍️ CONTENT & MARKETING
🔬 RESEARCH & KNOWLEDGE
⚖️ LEGAL, COMPLIANCE & RISK
🎧 CUSTOMER OPERATIONS
🏛️ AI GOVERNANCE & SAFETY
📊 DATA & ANALYTICS
🛡️ IT OPERATIONS & SECURITY
🎯 PRODUCT & DESIGN
💼 SALES & REVENUE
🎬 CREATIVE & GENERATIVE MEDIA
👁️ COMPUTER VISION & SENSING
💹 FINANCE & ACCOUNTING
🔄 OPERATIONS & PROCESS AUTOMATION
🚗 AUTONOMOUS SYSTEMS & VEHICLES
🦾 PHYSICAL AI & ROBOTICS
🎓 EDUCATION & LEARNING
PERSONAL EFFECTIVENESS

LEADING EDGE

⌨️ SOFTWARE ENGINEERING
✍️ CONTENT & MARKETING
🔬 RESEARCH & KNOWLEDGE
⚖️ LEGAL, COMPLIANCE & RISK
🎧 CUSTOMER OPERATIONS
🏛️ AI GOVERNANCE & SAFETY
📊 DATA & ANALYTICS
🛡️ IT OPERATIONS & SECURITY
🎯 PRODUCT & DESIGN
💼 SALES & REVENUE
🎬 CREATIVE & GENERATIVE MEDIA
👁️ COMPUTER VISION & SENSING
💹 FINANCE & ACCOUNTING
🔄 OPERATIONS & PROCESS AUTOMATION
👥 PEOPLE & TALENT
🚗 AUTONOMOUS SYSTEMS & VEHICLES
🦾 PHYSICAL AI & ROBOTICS
🎓 EDUCATION & LEARNING
PERSONAL EFFECTIVENESS

GOOD PRACTICE

⌨️ SOFTWARE ENGINEERING
✍️ CONTENT & MARKETING
🔬 RESEARCH & KNOWLEDGE
⚖️ LEGAL, COMPLIANCE & RISK
🎧 CUSTOMER OPERATIONS
🏛️ AI GOVERNANCE & SAFETY
📊 DATA & ANALYTICS
🛡️ IT OPERATIONS & SECURITY
🎯 PRODUCT & DESIGN
💼 SALES & REVENUE
🎬 CREATIVE & GENERATIVE MEDIA
👁️ COMPUTER VISION & SENSING
💹 FINANCE & ACCOUNTING
🔄 OPERATIONS & PROCESS AUTOMATION
👥 PEOPLE & TALENT
🚗 AUTONOMOUS SYSTEMS & VEHICLES
🦾 PHYSICAL AI & ROBOTICS
🎓 EDUCATION & LEARNING
PERSONAL EFFECTIVENESS

ESTABLISHED

⌨️ SOFTWARE ENGINEERING
✍️ CONTENT & MARKETING
🛡️ IT OPERATIONS & SECURITY
🎯 PRODUCT & DESIGN
💹 FINANCE & ACCOUNTING
👥 PEOPLE & TALENT

💹 Finance & Accounting

AI for financial operations, reporting, planning, and risk management. Over half the practices are good practice: fraud detection, expense management, invoice processing, and financial forecasting have mainstream adoption. Regulatory compliance and audit automation are advancing. The domain is tightly clustered around good-practice with minimal bleeding-edge — finance favours proven, auditable tools over experimental ones.

16 practices: 1 established, 9 good practice, 5 leading edge, 1 bleeding edge

Where AI Stands in Finance & Accounting

Finance is the domain where AI most clearly works and least clearly pays. The technical case is settled across nearly every practice we track: invoice processing runs at 99%-plus extraction accuracy and sub-$3 cost per document against $8-$15 manual; fraud detection at large banks delivers double-digit false-positive reductions and 2-4x detection gains; intercompany reconciliation automation cuts close cycles from ten days to three; credit models demonstrate measurable approval lifts over FICO. Vendors have, in effect, shipped the capability. What they have not shipped -- and cannot ship -- is the organisational readiness, governance scaffolding, and trust that turn a working pilot into a scaled, auditable production system. That gap is the single most important fact about AI in finance right now, and this scan saw it widen rather than close.

The numbers are stark and consistent across the major houses. Roughly 88% of agentic AI pilots never reach production; 95% of deployments produce no measurable profit-and-loss impact; only about 7% of CFOs report that their AI investment has had a strong business effect, against the 60% who have deployed something. Adoption figures are high and rising -- 97% of finance functions claim some AI use, up from 76% a year ago -- but they measure activity, not outcome. The constraint is almost never model capability. Forrester's framing is the cleanest: the gap is "orchestration, control, and trust," not intelligence. Root-cause analyses attribute pilot failure to unclear success criteria, fragmented data, insufficient tool access, and absent governance frameworks -- organisational and infrastructural deficits that a more capable frontier model does nothing to fix. A Cambridge Judge study puts only 23% of financial-services firms at a mature scaling stage; the data-quality, talent, and legacy-architecture barriers it names have been unchanged since 2020. Finance, more than most domains, is structurally inhospitable to probabilistic tools: it demands deterministic, repeatable, auditable outputs at 99% reliability, and large language models are non-deterministic by construction.

That tension produces an unusual shape. This is a tightly clustered, mature domain -- most practices sit at good-practice or established, with real mainstream adoption -- yet it is now almost entirely stalled. The vanguard (audit anomaly detection, credit scoring, intercompany automation, tax) is extracting genuine value, but the mainstream watches and waits, gated not by technology but by the cost of making AI defensible. The labour market has already begun to reprice around the technology faster than the technology has delivered returns: AI-related job postings in accounting have tripled since 2022 to roughly 7% of listings while traditional audit roles sit at 3%, and graduate hiring fell 44% year-on-year. The Big Four and their mid-tier peers are the bellwether for both sides of this story -- collectively spending billions to standardise on governed AI platforms across more than a million professionals (KPMG embedding Claude across 276,000 staff in 138 countries; Grant Thornton UK, outside the Big Four, committing £500M), while simultaneously providing this scan's most vivid cautionary tales of what happens when governance fails.

What's New, 2026-06-04 to 2026-06-18

The defining story of this cycle is consolidation around a hard truth. Two practices that had held a more optimistic posture -- cash flow prediction (previously advancing) and budget variance analysis (previously unrated for trend) -- moved to stalled, leaving the domain almost uniformly stalled, with audit anomaly detection the lone practice still advancing on the strength of hardening regulatory infrastructure: BDO's shift to trigger-based continuous monitoring and its positioning of GenAI anomaly detection as a 2026 audit-committee priority, Grant Thornton UK's £500M AI commitment with mandated audit trails, and the Pentagon's $49M agentic-auditor contract running through 2031. The trend changes are not a loss of capability -- cash forecasting still posts 94-97% short-horizon accuracy at the vanguard -- but a recognition that the bottleneck is organisational readiness. Only 12% of finance teams have machine-learning forecasting in full production and 53% do not use AI for forecasting at all; 79% of CFOs reject fully autonomous AI in finance workflows. Vendor capability has run well ahead of the data quality, governance, and trust required to use it.

Three threads dominated. First, the agentic rollout went category-wide on the vendor side: Oracle announced 600-plus embedded agents across Fusion Cloud ERP with 1,000-plus already in production, while BILL.com (1.2 million invoices automated, AI elevated to its number-one priority), AvidXchange, and Feedzai posted production-scale agentic deployments -- establishing autonomous finance agents as table stakes rather than differentiation. Second, the governance reckoning arrived in public and at the top of the profession: KPMG withdrew an AI-generated research report after UBS, the NHS, Swiss Railways, and Transport for London contradicted its claims (only 5 of 45 citations checked out, roughly half the assertions fabricated); Deloitte Australia refunded $291K of a $440K government report containing fabricated citations and non-existent court judgments; EY Canada retracted a report with 59% hallucinated citations whose contaminated data then leaked into Claude and ChatGPT outputs. Four major consulting firms were caught publishing hallucinated analysis within months. Third, regulators moved from signalling to binding: the EU AI Act's credit-scoring compliance deadline hardened to 2 December 2027 with mandatory conformity assessment and bias testing; the PCAOB formalised that AI touching numbers, estimates, journal entries, reconciliations, or disclosures now falls inside SOX control environments; six US states enacted laws restricting autonomous AI in health-insurance coverage decisions; and the Financial Stability Board issued the most authoritative cross-jurisdictional governance framework to date, with 12 sound practices across the AI lifecycle. The mitigation evidence matured alongside the failures: a study of 480 million AI outputs showed multi-model verification cuts financial-sector hallucination from 9.1% to 3.2% -- real architectural progress, but a residual rate still too high for assurance-grade reporting.
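The multi-model verification result can be made concrete with a small sketch. The study's actual architecture is not described here, so what follows is only one plausible reading: several models answer the same prompt independently, an answer is accepted when enough of them agree, and anything else is escalated to a human reviewer. The names `verify`, `Verdict`, `normalise`, and `min_agree`, along with the three stub models, are hypothetical and stand in for real vendor calls.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# A "model" is any callable mapping a prompt to an answer string.
# Real deployments would wrap vendor APIs; these are hypothetical stand-ins.
Model = Callable[[str], str]


@dataclass(frozen=True)
class Verdict:
    answer: Optional[str]  # agreed answer, or None when escalated
    votes: int             # number of models backing the most common answer
    escalate: bool         # True when a human reviewer should take over


def normalise(text: str) -> str:
    """Collapse case and whitespace so trivially different answers match."""
    return " ".join(text.lower().split())


def verify(prompt: str, models: Sequence[Model], min_agree: int) -> Verdict:
    """Accept an answer only if at least `min_agree` models agree on it."""
    if not models:
        raise ValueError("at least one model is required")
    answers = [normalise(model(prompt)) for model in models]
    top_answer, votes = Counter(answers).most_common(1)[0]
    if votes >= min_agree:
        return Verdict(top_answer, votes, escalate=False)
    return Verdict(None, votes, escalate=True)


# Stub models: two agree (up to formatting), one disagrees.
def model_a(prompt: str) -> str:
    return "EUR 1,200.00"


def model_b(prompt: str) -> str:
    return "eur 1,200.00 "


def model_c(prompt: str) -> str:
    return "EUR 1,250.00"


verdict = verify("Total due on the invoice?", [model_a, model_b, model_c], min_agree=2)
print(verdict)
```

Raising `min_agree` trades throughput for assurance: more answers are escalated to people, which is one way to see why a lower hallucination rate does not by itself remove the cost of review.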

Key Tensions

  • Capability is solved; organisational readiness is the binding constraint. Across credit, close, intercompany, forecasting, and narrative generation, the same finding recurs: vendors ship production-ready tools, but 88-95% of deployments fail to scale or show no P&L impact, with root causes traced to data fragmentation (57% of firms unprepared), unclear success criteria, and absent governance -- not model performance. Only 7% of CFOs report strong impact despite 60% deploying. A better frontier model does not move this number; better data infrastructure and governance discipline do.

  • The non-determinism wall. Finance demands deterministic, repeatable, auditable outputs -- the same invoice or elimination entry must produce the same result twice -- but large language models are probabilistic by construction. This is why journal-entry automation sits below 5% adoption while high-volume accounts-payable runs at 37%, and why intercompany elimination requires deterministic code generation as an intermediate layer rather than naive LLM automation. It is a structural mismatch no amount of scaling resolves, only careful architecture and human oversight. A Princeton study of 14 agent models over two years found accuracy improving steadily while reliability -- stable, predictable behaviour -- improved far more slowly.

  • The governance reckoning is now public and expensive. The Big Four hallucination failures this scan -- KPMG, Deloitte Australia, EY Canada -- converted an abstract risk into named reputational and financial damage in regulated deliverables, including a $291K refund and source-data contamination that propagated into consumer AI tools. Stanford's 2026 index found hallucination rates of 22-94% across 26 models; 45% of AI-generated finance content contains inaccuracies, and 74% of enterprises now name inaccuracy as their top AI risk. The "trust tax" -- the cost of auditing AI output for compliance -- is precisely what traps finance and healthcare pilots short of scale.

  • Regulation is hardening from guidance into binding obligation. The EU AI Act's December 2027 credit-scoring deadline, the PCAOB's inclusion of AI-influenced figures within SOX/ICFR controls, six US states restricting autonomous AI in insurance coverage decisions, a Pennsylvania settlement with GEICO over an AI-driven policy cancellation, and the FSB's lifecycle framework collectively establish a regulatory floor. Governance defensibility -- not model accuracy -- is becoming the basis on which finance leaders select vendors, which is why the Big Four standardise on a single governed platform rather than chasing benchmark winners, and why Stanford's finding that no model "wins" on hallucination matters less than which vendor offers the cleanest audit trail.

  • The vanguard pulls value while the profession restructures around it. Forward-leaning firms run continuous population-wide audit analysis, agentic credit hubs serving 100-plus institutions, and multi-line straight-through claims processing at real scale (Allianz 65% automation, Markel 113% productivity gain), while only about a third of institutions have anomaly detection in production and the majority of multinationals still reconcile intercompany balances by hand. The litigation risk that constrains the leaders is visible in the ongoing Upstart Model 22 securities class actions, where even a market-leading production credit model is alleged to carry undisclosed macro-sensitivity failures -- a reminder that being first carries its own model risk.
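The non-determinism point above, that the same elimination entry must produce the same result twice, can be illustrated with a minimal sketch of what a deterministic intermediate layer might look like. This is an assumption-laden illustration, not any vendor's method: the `Balance` record, the `eliminate` function, the entity codes, and the tolerance are all hypothetical. The idea is that any model involvement stops at proposing rules, while the matching itself is plain code using exact decimal arithmetic and a fixed processing order.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, List, Tuple

ZERO = Decimal("0")
Row = Tuple[str, str, Decimal]


@dataclass(frozen=True)
class Balance:
    entity: str        # reporting entity
    counterparty: str  # entity on the other side of the balance
    amount: Decimal    # positive = receivable, negative = payable


def eliminate(balances: List[Balance], tolerance: Decimal) -> Tuple[List[Row], List[Row]]:
    """Return (eliminated, unmatched) rows for each intercompany pair."""
    # Net every directed (entity, counterparty) position.
    net: Dict[Tuple[str, str], Decimal] = {}
    for b in balances:
        key = (b.entity, b.counterparty)
        net[key] = net.get(key, ZERO) + b.amount

    eliminated: List[Row] = []
    unmatched: List[Row] = []
    # Visit each unordered pair once, in sorted order, so output order is fixed.
    for a, b in sorted({tuple(sorted(key)) for key in net}):
        diff = net.get((a, b), ZERO) + net.get((b, a), ZERO)
        if abs(diff) <= tolerance:
            eliminated.append((a, b, net.get((a, b), ZERO)))
        else:
            unmatched.append((a, b, diff))  # residual left for human review
    return eliminated, unmatched


balances = [
    Balance("UK", "DE", Decimal("1000.00")),
    Balance("DE", "UK", Decimal("-1000.00")),
    Balance("UK", "FR", Decimal("250.00")),
    Balance("FR", "UK", Decimal("-245.00")),
]
tolerance = Decimal("0.01")

first = eliminate(balances, tolerance)
second = eliminate(list(reversed(balances)), tolerance)  # same data, different order
print(first == second)
print(first)
```

Because the function depends only on its inputs and iterates in sorted order, reordering the input leaves the result unchanged, which is the repeatability property an auditor would need to re-perform the step.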

Top 10 Evidence Items

  1. KPMG Pulls AI Report After Hallucinated Claims About Major Organisations (news-coverage) — The defining governance failure of this cycle: only 5 of 45 citations survived fact-checking, UBS, NHS, Swiss Railways, and Transport for London publicly contradicted claims made in their names, and GPTZero confirmed AI origin — converting the abstract hallucination risk into named reputational damage at the top of the profession. https://theaiinsider.tech/2026/06/16/kpmg-pulls-ai-report-after-hallucinated-claims-about-major-organisations/

  2. EY Canada's AI Report Had 59% Fake Citations. Now AI Repeats Them. (news-coverage) — The downstream contamination story the other Big Four failures lack: EY Canada's hallucinated source material leaked into Claude and ChatGPT outputs, illustrating how a single governance failure in regulated publishing can corrupt the training-adjacent corpus that practitioners then query. https://byteiota.com/ey-canadas-ai-report-had-59-fake-citations-now-ai-repeats-them/

  3. Deloitte to Partially Refund Australian Government for Report with Apparent AI-Generated Errors (news-coverage) — The $291K refund on a $440K contract makes the governance reckoning financially concrete: fabricated citations and non-existent court judgments in a 237-page regulated deliverable, producing the clearest unit-cost signal yet for what the "trust tax" actually extracts. https://ground.news/article/deloitte-to-partially-refund-australian-government-for-report-with-apparent-ai-generated-errors_a45143

  4. Financial Stability Board: Sound Practices for Responsible Adoption of AI — Consultation Report (industry-report) — The most authoritative cross-jurisdictional governance framework to date, proposing 12 sound practices across the full AI lifecycle for financial institutions; its publication signals that regulators have shifted from signalling to drafting binding obligations and that governance defensibility is now the selection criterion for enterprise AI vendors. https://www.fsb.org/2026/06/sound-practices-for-responsible-adoption-of-artificial-intelligence-ai-consultation-report/

  5. EU AI Act High-Risk Classification for Credit Scoring — Binding Regulatory Milestone (industry-report) — The December 2, 2027 compliance deadline with nine mandatory obligations spanning risk management, bias testing, transparency, and conformity assessment establishes the regulatory floor the summary describes; the hardness of this date is what converts "governance guidance" into "vendor selection criterion." https://www.regulatoryai.eu/ai-credit-scoring/

  6. Six States Restrict AI Claim Denials as ABA Audits Tighten (adoption-metric) — With 84% of health insurers already deploying AI in utilization management, the six-state legislative wave restricting autonomous coverage decisions is the clearest evidence that regulation is arriving at production scale simultaneously with adoption — the governance-capability race the summary frames as a key tension. https://breakingnewsaba.com/policy/six-states-restrict-ai-claim-denials-as-aba-audits-tighten

  7. Pentagon Brings in Agentic AI to Address Their Audit Problems (adoption-metric) — The $49M Groundswell "Agentic Auditor" contract running through 2031 to achieve a congressionally mandated clean audit is the strongest public-sector signal that audit anomaly detection is advancing while the rest of the domain stalls; it also benchmarks the cost of deploying AI against an explicitly non-deterministic, multi-trillion-dollar problem. https://www.goingconcern.com/friday-footnotes-great-kpmg-got-the-whole-big-4-in-trouble-pentagon-brings-in-agentic-ai-to-address-their-audit-problems-6-12-26/

  8. Pomerantz Class Action Against Upstart Holdings for Model Governance Failures (news-coverage) — Multiple concurrent securities class actions alleging Model 22 overstated accuracy and undisclosed macro-sensitivity failures show that being the market-leading production credit model does not immunise against model-governance litigation; this is the vanguard's undisclosed risk the summary flags. https://www.prnewswire.com/news-releases/pomerantz-law-firm-announces-the-filing-of-a-class-action-against-upstart-holdings-inc-and-certain-officers--upst-302789574.html

  9. CFOs Funded the AI Revolution. Most Didn't Get One. (opinion) — The 84%-deployed versus 7%-strong-impact gap, with FP&A forecasting at only 12% full production and 53% not using AI for forecasting at all, is the clearest single-source quantification of the adoption-outcome disconnect the entire summary is built around. https://www.nexairi.com/article/Finance/finance-ai-adoption-impact-gap/

  10. Enterprise AI Hallucination Rates Drop 61% When Using Multi-Model Verification Architecture (research-paper) — A 480-million-output study demonstrating that architectural mitigation (multi-model verification) cuts financial-sector hallucination from 9.1% to 3.2% is genuine progress — but the residual 3.2% is still too high for assurance-grade reporting, which is precisely why the "trust tax" persists even as the technical baseline improves. https://natlawreview.com/press-releases/enterprise-ai-hallucination-rates-drop-61-when-using-multi-model
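As a rough worked example of why a 3.2% residual rate is still hard to accept for assurance-grade work, the arithmetic below takes the two financial-sector rates quoted above and applies the lower one to a hypothetical deliverable. The figure of 100 checkable claims and the assumption that errors occur independently are illustrative only, and the relative reduction computed from these two rates need not match the 61% in the headline, which the text here does not break down.

```python
# Financial-sector rates quoted in the text.
baseline_rate = 0.091  # hallucination rate without verification
verified_rate = 0.032  # rate with multi-model verification

# Relative reduction implied by those two rates.
relative_drop = (baseline_rate - verified_rate) / baseline_rate

# Hypothetical deliverable: 100 checkable claims, errors assumed independent.
n_claims = 100
expected_errors = n_claims * verified_rate
p_at_least_one = 1 - (1 - verified_rate) ** n_claims

print(f"relative drop: {relative_drop:.1%}")
print(f"expected flawed claims: {expected_errors:.1f}")
print(f"P(at least one flawed claim): {p_at_least_one:.1%}")
```

Under these assumptions a 100-claim report would be expected to contain about three flawed claims and would very probably contain at least one, which is the practical meaning of a residual rate that remains too high for regulated reporting.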