Perly Consulting │ Beck Eco

The State of Play

A living index of AI adoption across industries — where established practice meets the bleeding edge
UPDATED DAILY

The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.

The Daily Dispatch

A daily newsletter distilling the past two weeks of movement in a domain or two — delivered to your inbox while the index updates in the background.

AI Maturity by Domain

[Chart: each dot marks the weighted maturity of practices within a domain, plotted per domain on an axis from Bleeding Edge to Established.]

Contract review — autonomous assessment & scoring

TIER: Leading Edge

TRAJECTORY: Stalled

AI that autonomously scores contract risk, generates assessment reports, and recommends accept/reject/negotiate decisions. Includes automated risk scoring and recommendation generation; distinct from risk flagging, which highlights issues for human assessment rather than making recommendations.

OVERVIEW

Autonomous contract assessment (AI that scores risk, generates reports, and recommends accept/reject/negotiate decisions) has moved from bleeding-edge experiment to leading-edge production standard in high-volume triage, yet the practice hits a hard ceiling beyond routine work. Global adoption has normalized rapidly: 92% of lawyers across 10 countries now use AI daily, 87% of general counsel employ AI, and 52% of in-house teams actively use or evaluate contract review AI (quadrupled since 2024). Deployments deliver measurable returns for high-volume screening: 40-60% efficiency gains, 75%+ time savings, and 300-450% reported ROI.

Regulatory frameworks and deployment realities nonetheless impose binding constraints. The EU AI Act (Annex III) classifies autonomous contract assessment as high-risk, mandating human oversight mechanisms and conformity assessment by December 2027. Article 14 makes oversight a design requirement: it must be architectural (human-in-the-loop, not rubber-stamped approval), which leaves fully autonomous assessment non-compliant in major markets. Production governance maturity lags significantly: Icertis' May 2026 survey of 1,000+ corporate legal practitioners found that 47% would not detect unauthorized AI action for days or weeks, only 26% are confident in AI accuracy for high-stakes decisions, and 40% report fragmented accountability. Hallucination incidents have spiked, with 1,200+ documented cases globally and $145K in court sanctions in Q1 2026 alone. Autonomous scoring tools exhibit documented algorithmic bias. Contractual data-access barriers and governance gaps prevent 78% of agentic AI pilots from reaching production. The tier-defining tension is structural: the practice excels at high-volume first-pass screening where human review sits downstream, but regulatory requirements and accuracy-on-complexity ceilings prevent autonomous decision-making on contested or complex agreements until governance, fairness, and liability exposure are resolved.

CURRENT LANDSCAPE

Production deployments at scale demonstrate the economic momentum, though maturity diverges sharply between routine screening and autonomous decision-making. Icertis Vera (June 2026 GA) introduces portfolio-wide autonomous risk assessment against business events across a customer base that includes one-third of the Fortune 100; Microsoft Cloud Operations cut contract-to-PO cycles from 2 hours to 15 minutes through autonomous Icertis-SAP Ariba integration. LegalMind AI's production deployment automates 70% of the workload across 3,400 contracts monthly through an eight-step autonomous pipeline (including normalization, extraction, template comparison, risk scoring, compliance checking, summary, and queue prioritization), compressing 4.2-hour reviews to 38 minutes and reducing infrastructure costs by 76%. Skopx case studies document a mid-market SaaS deployment achieving an 80% time reduction (2.1 hours to 25 minutes) and a financial services deployment achieving 97.3% regulatory compliance accuracy with a 0.89 correlation to attorney assessments. Concord's engine processes 10k+ contracts monthly with 94% autonomous risk-spotting accuracy, compressing review from 92 minutes to 26 seconds per contract. Inkvex's independent validation on 327 real contracts confirmed a 94% catch rate on high-severity flags, with 99% on auto-renewal clauses and 95% on liability caps. Vendor ecosystem consolidation is deepening: Icertis serves 250+ Fortune 500 customers with $350M ARR and 30%+ Fortune 100 penetration; LinkSquares reports 1,300+ teams managing 13M contracts with 800k+ hours saved. Global adoption has normalized: Wolters Kluwer's 810-lawyer survey across 10 countries shows 92% use AI daily, 62% report 6-20% time savings, and 61% are confident in AI-driven workflows.

However, autonomy maturity claims diverge sharply from deployment reality. Stanford Law School research (June 2026) shows AI outperforming law professors 75% of the time on contract law reasoning, but this benchmark clarity masks persistent autonomy barriers. A production AI adjudication platform processing 23,000 cases reports that 100% human review remains required and user-preferred, even at scale and with mature systems. Axiom's survey of 500+ legal leaders across 8 countries shows only 31% at wide-scale autonomous deployment despite 96% adoption in some form; two-thirds are still piloting, with 43% citing accuracy/reliability as the top barrier. Conga's survey of 250 CLM professionals found that 92% still require human review of AI outputs, with governance and trust the biggest scaling barriers. The Stanford AI Index 2026 benchmarks hallucination rates of 22-94% across 26 leading models, with the best-performing models delivering incorrect answers in roughly 20% of responses, a reliability ceiling that directly constrains the viability of autonomous legal assessment for high-stakes decisions. The autonomy-maturity divergence is not a technology problem but a governance reality: production systems marketed as autonomous depend operationally on human oversight gates, escalation rules, and review thresholds that bind them to hybrid human-AI architectures rather than genuine autonomy.
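
The oversight gates, escalation rules, and review thresholds described above can be illustrated with a minimal Python sketch. It is not drawn from any vendor named here: the `Assessment` fields, the `Route` names, and every threshold value are hypothetical assumptions chosen only to show how a system marketed as autonomous still routes most decisions to a human.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO_TRIAGE = "auto_triage"    # first-pass screening; human review happens downstream
    HUMAN_REVIEW = "human_review"  # an attorney makes the decision
    ESCALATE = "escalate"          # sent to senior counsel


@dataclass(frozen=True)
class Assessment:
    risk_score: float  # hypothetical scale: 0.0 (benign) to 1.0 (severe)
    confidence: float  # model's self-reported confidence, 0.0 to 1.0
    contested: bool    # True for disputed or non-standard agreements


def route(a: Assessment,
          review_threshold: float = 0.4,
          escalate_threshold: float = 0.75,
          min_confidence: float = 0.8) -> Route:
    """Apply escalation rules first, then review thresholds (all values assumed)."""
    # Escalation rule: contested or high-risk agreements never stay automated.
    if a.contested or a.risk_score >= escalate_threshold:
        return Route.ESCALATE
    # Review threshold: moderate risk or low model confidence needs an attorney.
    if a.risk_score >= review_threshold or a.confidence < min_confidence:
        return Route.HUMAN_REVIEW
    # Only low-risk, high-confidence, uncontested contracts are auto-triaged.
    return Route.AUTO_TRIAGE
```

Under these assumed thresholds, `route(Assessment(0.1, 0.95, False))` is auto-triaged, while the same contract marked `contested=True` is escalated regardless of its score. Ordering the escalation check first reflects the point made above that complex or disputed agreements stay outside the autonomous path.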

The barriers to advancement are hardening rather than softening. Brittney Ball's April 2026 research documents 1,200+ AI hallucination incidents in legal proceedings globally (roughly 10 per day by March 2026), with $145K in court sanctions in Q1 2026 alone and an indefinite attorney suspension for filing 57 defective AI-generated citations. Thomson Reuters analysis identifies the strategic risk: 80% of legal professionals see AI as transformational, yet only 38% expect near-term organizational change, and Gartner projects that over 40% of agentic AI projects will be discontinued by 2027. Real-world deployment data shows 17-34% error rates in production despite 95%+ accuracy on benchmarks; governance and infrastructure gaps prevent 78% of agentic pilots from reaching production. Regulatory constraints compound the ceiling. EU AI Act Article 14 establishes that human oversight is a design requirement, not a staffing workaround; organizations with 98%+ approval rates and no meaningful human judgment are non-compliant. Academic research describes the underlying tension: higher autonomy compresses viable agency in regulated contexts, so organizations cannot simultaneously maximize autonomous decision-making and satisfy regulatory human oversight mandates. Independent benchmarking establishes realistic performance ceilings: 50-75% of clause changes resolved autonomously in steady state (vs. vendor claims of higher coverage), with 70-80% playbook coverage meaning 20-30% of changes always require human judgment. The bias vulnerability identified in January 2026 law review research persists: autonomous scoring tools systematically favor corporations over individuals in negotiation, creating direct liability exposure. Contractual data-access restrictions force reliance on generic models rather than fine-tuned deployments. Autonomous decision-making on complex or disputed agreements remains out of scope for all but the most risk-tolerant teams.

TIER HISTORY

Research: Jan-2024 → Jan-2024
Bleeding Edge: Jan-2024 → Jul-2024
Leading Edge: Jul-2024 → present

EVIDENCE (88)

  • Axiom survey of 500+ legal leaders across 8 countries: only 31% at wide-scale deployment, 66% piloting, 43% cite accuracy/reliability as top barrier. Contradicts leading-edge maturity claims.
  • Stanford Law School blind evaluation: AI outperformed 16 law professors 75% of the time on contract law reasoning (~3,000 comparisons). Only 3.53% of AI answers were flagged harmful vs. 12.06% of professor answers, indicating a capability threshold for autonomous assessment.
  • Icertis Vera GA (June 2026): portfolio-wide autonomous risk assessment against business events; Vera Analytics drives strategic decisions and compliance monitoring across a customer base that includes one-third of the Fortune 100.
  • PocketOS incident: AI agent deleted production database and all backups autonomously. Real failure mode exemplifying accountability and governance gaps in autonomous systems.
  • Microsoft Cloud Operations deployed Icertis integrated with SAP Ariba for autonomous assessment: reduced contract-to-PO time from 2 hours to 15 minutes with automated summaries and AI-driven approval routing.
  • Skopx case studies, two deployments: mid-market SaaS 80% time reduction (2.1hr→25min); financial services 62% cost reduction with 97.3% regulatory compliance accuracy and 0.89 correlation with attorney assessments.
  • LegalMind AI deployment: 70% workload automation, 4.2h→38min per contract, 76% infrastructure cost reduction, 3,400 contracts/month. Eight-step autonomous pipeline (including normalization, extraction, template comparison, risk scoring, compliance, summary, prioritization) demonstrates end-to-end autonomous assessment at scale.
  • Stanford benchmark: hallucination rates 22-94% across 26 leading models; best-performing model delivers incorrect answers in ~20% of responses. Foundational reliability ceiling constraining autonomous legal assessment viability.

HISTORY

  • 2024-Q1: Autonomous contract assessment emerges with vendor momentum (LinkSquares, ELTEMATE, LawGeex) and early law firm deployments showing 70% efficiency gains. Survey data shows 94% enthusiasm but only 40% organizational readiness. Accuracy-speed trade-off evidenced: AI 8x faster but with significant error rates (up to 90% in complex analyses). Liability concerns unresolved.
  • 2024-Q2: Autonomous assessment capability matures with documented accuracy improvements (Dioptra: 95% first-party, 92% third-party, 94% issue detection) and law firm deployments showing ROI (A&O Shearman: 30% efficiency, 7-hour reduction per review). Product launches accelerate (LegalOn Word add-in, 85% faster reviews). Survey data confirms contract analysis as top AI use case; in-house teams adopt faster than law firms. Trust and organizational readiness barriers persist despite improved accuracy signals.
  • 2024-Q3: Autonomous assessment shows sustained ROI deployments (LinkSquares: 352% three-year ROI, 40% efficiency, 25% cycle-time reduction; Dioptra: 50% workload reduction in 6 months). Ecosystem expands with product GAs (Contract Logix AI analysis). CLM infrastructure adoption strengthens (52-60% of legal ops teams). However, practitioner reports document persistent hallucinations (3-10%), training data errors, and missed legal terms. Gartner projects 50% adoption of AI-enabled risk tools by 2027. Practice remains bottlenecked on accuracy for complex contracts and data validation requirements.
  • 2024-Q4: Autonomous assessment consolidates into mainstream enterprise adoption while accuracy barriers persist. Market validation strengthens: 60% of Fortune 500 actively piloting/deploying AI agents with contract review as top use case; LinkSquares G2 leadership (98% satisfaction); vendor accuracy improvements (Screens 97.5%, Dioptra PromptIQ feedback loops). However, adoption data contradicts enthusiasm: WorldCC surveys show only 9-12% actual adoption of AI contract review despite interest, with ~80% accuracy as realistic benchmark. Practice boundary crystallizes: sustained use for triage and first-pass filtering, but autonomous decision-making on complex contracts remains constrained by hallucination, liability, and trust concerns.
  • 2025-Q1: Enterprise adoption accelerates with strong YoY growth (75% surge in contract review AI use to 14%, 37% now deploying pre-execution AI vs. 19% prior). Production deployments show concrete ROI (Qwen 3 fine-tuning: 95% accuracy, 80% time savings, €380K annual savings; Axiom field testing: 60% efficiency gains). However, critical gaps persist: VALs benchmarking reveals three leading tools (Harvey, Vincent AI, Oliver) failed to identify standard MFN clauses; 63% cite data security barriers; 70% of WorldCC respondents require human review. Adoption-accuracy gap widens: mainstream deployments increase while reliability constraints prevent broader autonomous decision-making without human oversight.
  • 2025-Q2: Vendor product GAs accelerate (LinkSquares Risk Scoring Agent with named Fortune 500 adoption; Dioptra Wilson Sonsini validation: 95% first-party, 92% third-party accuracy). Broader organizational adoption: 56% of legal teams use GenAI, 42% adopt CLM, with 2/3 maintaining dedicated legal tech budgets. However, adoption-trust gap persists: 60% of in-house legal professionals cite lack of trust/quality as top implementation barrier. Tool usability barriers emerge: current solutions show markups without reasoning, interrupting workflow confidence. Practice remains in production triage use with unresolved explainability and tier-advancement blocks.
  • 2025-Q3: Autonomous assessment adoption consolidates into production use but implementation challenges become explicit. Financial services case studies highlight success path (40% cost reduction, weeks-to-hours cycle time) but require strategic discipline; tactical deployments risk failure. Industry analysis reveals systemic production issues: 80% of tools fail operationally despite achieving 39% cycle time and 35% accuracy improvements in controlled settings. Practitioner consensus hardens: experienced attorneys must remain involved due to accuracy risks on edge cases and liability exposure. Executive confidence in autonomous systems rises (81% trust for critical operations) while governance infrastructure lags deployment pace. Practice consolidates in triage and first-pass filtering with measurable ROI, but autonomous decision-making barriers (accuracy-on-complexity, explainability, liability) prevent broader advancement.
  • 2025-Q4: Autonomous assessment enters mainstream production deployment with quantified ROI and organizational governance maturity. GenAI adoption in legal leaps to 52% (from 23% in 2024), with 64% of in-house counsel expecting reduced outside counsel spend. Ecosystem consolidation accelerates: Icertis acquires Dioptra (40% MoM adoption growth), integrating autonomous review and scoring into flagship CLM; LinkSquares reports 1,300+ teams, 13M contracts, 800k+ hours saved. Governance infrastructure strengthens: 85% of law departments establish dedicated AI management. However, contractual barriers emerge as tier-defining constraint: NDAs and engagement letters from 2023-2024 restrict client data use in autonomous assessment, forcing reliance on generic models. Practitioner consensus consolidates: autonomous assessment succeeds in high-volume routine screening with proven efficiency (40-60% cost reduction, weeks-to-hours cycles), but scope remains limited by accuracy ceiling (~80% realistic), explainability gaps, and liability concerns for complex/contested agreements. Advancement to adoption-tier blocked by accuracy-on-complexity limits and contractual/governance barriers.
  • 2026-Jan: Autonomous assessment deployment accelerates with adoption reaching 52% of in-house teams (LegalOn survey, Jan 2026) and active usage quadrupling since 2024; named Fortune 500 deployments (Commvault 50% time savings, Softonic 40% cost reduction, Uber/Shopify/Atlassian via Ivo) confirm production momentum. However, critical research surfaces algorithmic bias: law review study documents that autonomous scoring systems favor corporations over individuals, exposing a fairness vulnerability blocking further tier advancement. Vendor consolidation continues (Icertis/Dioptra, Agiloft leadership in Gartner MQ) with CLM integration standard; enterprises report 180k+ annual staff hours saved. Scope remains bounded: triage and pre-execution screening, not autonomous decision-making on complex agreements due to accuracy ceiling, bias risks, and liability concerns.
  • 2026-Feb: Autonomous assessment deployment accelerates with documented accuracy breakthroughs and methodological maturity: Concord achieves 98% accuracy with 26-second review cycle; Orangetheory reduces turnaround to 30 minutes (80% time savings); LegalOn 2026 report confirms market shift toward mainstream operationalization. Structured risk-scoring methodologies emerge (Pactly, BAZU) featuring weighted clause analysis and automated triage rules. However, contractual use restrictions, algorithmic bias risks, and accuracy-on-complexity ceiling remain tier-advancement barriers despite sustained production adoption and enterprise deployment momentum.
  • 2026-Apr: Mainstream adoption deepens: Wolters Kluwer's global survey (810 lawyers, 10 countries) shows 92% using AI daily with 62% reporting 6-20% time savings and 61% confident in AI-driven workflows; 87% of general counsel now use AI (up 44% YoY). Production accuracy benchmarks strengthen: Inkvex's independent study of 327 real contracts validated a 94% catch rate on high-severity flags (99% on auto-renewal, 95% on liability caps); Concord achieves 94% autonomous risk-spotting accuracy vs. 85% for experienced lawyers, processing 10k+ contracts monthly at 300-450% ROI. Critical hallucination research intensifies: Brittney Ball documents 1,200+ AI hallucination incidents in legal proceedings globally (roughly 10 per day by March 2026), with $145K in Q1 2026 court sanctions and an indefinite attorney suspension for 57 fabricated citations, hardening the case against fully autonomous assessment without human review. Thomson Reuters analysis cites Gartner's projection that over 40% of agentic AI projects will be discontinued by 2027 and finds that governance gaps prevent 78% of agent pilots from reaching production. Practice remains bounded at triage and first-pass screening with proven 40-60% efficiency gains; algorithmic bias, accuracy-on-complexity, and contractual data-access barriers continue to block autonomous decision-making on complex or contested agreements.
  • 2026-May: Regulatory and governance constraints crystallize as tier-advancement barriers. EU AI Act compliance guidance confirms autonomous contract assessment is classified as high-risk (Annex III), mandating conformity assessment and human oversight mechanisms by December 2, 2027. The Article 14 design requirement establishes that oversight is architectural, not procedural: human-in-the-loop systems are required, and rubber-stamp reviews at 98%+ approval rates are non-compliant. Academic literature identifies an autonomy-agency tension: higher autonomy compresses viable agency in regulated contexts. Icertis survey (1,000+ corporate legal practitioners, May 2026) documents governance readiness gaps: 47% would not detect unauthorized AI action for days or weeks; only 26% are confident in AI accuracy for high-stakes decisions; accountability is fragmented across teams. Vendor partnership expansion (Icertis/Microsoft integrating Vera into 365 Copilot) with named customer outcomes confirms the production deployment path (ALPLA 60% legal spend reduction, European telecom $35M savings, Defense Logistics Agency 40% cycle-time reduction). Independent benchmarking (Bind Legal, May 2026) establishes a realistic autonomy ceiling: 50-75% autonomous clause resolution in steady state and 70-80% playbook coverage, meaning 20-30% always require human judgment. Deloitte/Docusign research (1,100+ respondents) validates the E2E platform advantage: 81% accuracy vs. 66% for point solutions; agentic workflows deliver a ~30% ROI increase, but only 16% use AI for post-signature analysis. A liability exposure case emerges: in January 2026 an autonomous procurement agent committed $4.3M in unauthorized contracts over 72 hours while operating within authorized parameters; the legal outcome is unresolved, exemplifying the accountability gap. Production architecture evidence from Grab's deployment validates the hybrid model: multi-model consensus with human approval gates revealed LLM limitations and confirmed that legal consistency requirements demand AI+rule+human architectures rather than fully autonomous pipelines. The practice consolidates at leading-edge triage, with regulatory constraints preventing advancement to the autonomous decision-making tier until governance infrastructure matures and human oversight is designed in.
  • 2026-Jun: Stanford Law School blind evaluation (June 2026) found AI outperformed 16 law professors 75% of the time on contract law reasoning across ~3,000 comparisons, with only 3.53% of AI answers flagged harmful versus 12.06% for professors—establishing a capability threshold for autonomous assessment. Icertis Vera reached GA with portfolio-wide autonomous risk assessment deployed across its Fortune 100 customer base; Microsoft Cloud Operations reduced contract-to-PO from 2 hours to 15 minutes through autonomous Icertis-SAP Ariba integration. Deployment barriers hardened simultaneously: Axiom's survey of 500+ legal leaders (8 countries) found only 31% at wide-scale deployment with 43% citing accuracy and reliability as the top barrier, and a production adjudication platform processing 23,000 cases reported that 100% human review remains required and user-preferred even at full scale.
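
The 2026-Feb entry mentions structured risk-scoring methodologies built on weighted clause analysis and automated triage rules. A rough Python sketch of that idea follows. The clause names, weights, and the `weighted_risk` and `prioritize` helpers are illustrative assumptions, not the Pactly or BAZU methodologies, whose details are not given here.

```python
# Hypothetical clause weights; a real methodology would calibrate these.
CLAUSE_WEIGHTS = {
    "liability_cap": 0.40,
    "indemnification": 0.35,
    "auto_renewal": 0.25,
}


def weighted_risk(clause_scores, weights=CLAUSE_WEIGHTS):
    """Weighted average of per-clause risk scores (each 0.0 to 1.0).

    Clauses with no configured weight are ignored; a contract with no
    weighted clauses scores 0.0.
    """
    scored = {c: s for c, s in clause_scores.items() if c in weights}
    total_weight = sum(weights[c] for c in scored)
    if total_weight == 0:
        return 0.0
    return sum(weights[c] * s for c, s in scored.items()) / total_weight


def prioritize(contracts):
    """Order contract IDs from highest to lowest weighted risk for the review queue."""
    return sorted(contracts, key=lambda cid: weighted_risk(contracts[cid]), reverse=True)
```

Calling `prioritize` on a mapping of contract IDs to clause scores yields a review queue with the riskiest contracts first, which is one simple way to express a triage rule. Normalizing by the weights actually present keeps scores comparable across contracts that contain different clauses, though that is a design choice of this sketch rather than an established standard.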