The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.
A daily newsletter distilling the past two weeks of movement in a domain or two — delivered to your inbox while the index updates in the background.
Structured assessment and classification of AI system risks and potential impacts on individuals, communities, and operations. Includes risk tiering frameworks and stakeholder impact mapping; distinct from AI incident tracking which responds to actual rather than potential risks.
AI risk assessment and impact evaluation has reached an uncomfortable plateau. The practice (structured identification, classification, and mitigation of potential harms from AI systems) now has mature standards (independent analysis puts NIST AI RMF adoption at 40–60% of the Fortune 500), capable vendor tooling, and binding regulatory mandates across multiple U.S. states. Forward-leaning organisations in financial services, government, and life sciences are running real assessments. Yet most organisations deploying AI have not implemented formal risk frameworks, and project failure rates remain stubbornly high. The defining tension is no longer whether frameworks exist but whether organisations can translate risk evaluations into credible go/no-go deployment decisions. That translation problem, organisational rather than technical, is what keeps this practice at the leading edge rather than good practice, and it shows no clear sign of resolving.
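To make the step from a risk evaluation to a go/no-go decision concrete, here is a minimal sketch of a tiering rule. Everything in it is a hypothetical illustration: the 1-5 likelihood and severity scales, the score thresholds, and the names `RiskItem`, `tier_for`, and `deployment_decision` are assumptions, not taken from NIST AI RMF or any other framework mentioned here.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class RiskItem:
    """One identified risk, scored on hypothetical 1-5 scales."""
    name: str
    likelihood: int  # 1 (rare) .. 5 (almost certain)
    severity: int    # 1 (negligible) .. 5 (critical)
    mitigated: bool  # True once an agreed control is in place

    @property
    def score(self) -> int:
        return self.likelihood * self.severity


def tier_for(item: RiskItem) -> Tier:
    # Thresholds are illustrative assumptions, not drawn from any standard.
    if item.score >= 15:
        return Tier.HIGH
    if item.score >= 6:
        return Tier.MEDIUM
    return Tier.LOW


def deployment_decision(items: list[RiskItem]) -> str:
    """Return 'no-go' while any high-tier risk remains unmitigated."""
    blocking = [i for i in items if tier_for(i) is Tier.HIGH and not i.mitigated]
    return "no-go" if blocking else "go"


risks = [
    RiskItem("biased credit decisions", likelihood=4, severity=5, mitigated=False),
    RiskItem("prompt leakage", likelihood=2, severity=2, mitigated=False),
]
print(deployment_decision(risks))  # prints "no-go"
```

The point of the sketch is that the decision rule is written down before the assessment is run, so an adverse finding blocks deployment by default rather than by negotiation.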
Regulation and market pressure have reached an inflection point. State-level mandates and federal OMB M-26-04 now make risk assessment a contractual requirement across government and regulated sectors. Simultaneously, the market is signaling urgency: the AI Trust, Risk, and Security Management (AI TRiSM) platform market reached $3.1B in 2025 and is projected to hit $13.8B by 2030 (35% CAGR), driven by 95% of C-suite executives reporting AI incidents in the past two years; organizations without governance frameworks reported average losses of $4.4M per incident.
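The growth figures above can be sanity-checked with the standard compound annual growth rate formula. The sketch below uses only the endpoints quoted in the text ($3.1B in 2025, $13.8B in 2030); the function names are illustrative.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1 / years) - 1


def project(start: float, rate: float, years: int) -> float:
    """Value after compounding `start` at `rate` for `years` periods."""
    return start * (1 + rate) ** years


implied = cagr(3.1, 13.8, 5)          # 2025 -> 2030, in $B
print(f"implied CAGR: {implied:.1%}")  # about 34.8%, consistent with the quoted 35%
print(f"3.1B at 35% for 5 years: {project(3.1, 0.35, 5):.1f}B")  # about 13.9B
```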
Vendor tooling has matured in response. Credo AI’s GAIA (general availability May 2026) automates risk identification and control recommendations from four years of enterprise deployments, addressing the governance bottleneck where teams review only 6 of 100+ AI requests; integrations with development platforms (Microsoft Azure AI Foundry) enable real-time risk evaluation within deployment workflows. Anthropic demonstrated production-grade risk assessment methodology for the 2026 US midterms, deploying automated detection systems, red-team stress testing (600+ evaluation prompts), and published benchmarks (95% political neutrality, 100% compliance). Together, these offer credible evidence of the practice operating at scale. Regulatory frameworks are now legally binding: the EU Commission released Article 6 high-risk AI classification guidance (May 19, 2026) defining which systems require Fundamental Rights Impact Assessments, with compliance penalties reaching 35 million euros or 7% global turnover.
Yet the critical-mass adoption signal is mixed. Deloitte’s April 2026 survey of 3,235 enterprise leaders found 74% of organizations expect agentic AI deployment within two years, but only 21% have mature governance models for autonomous agents, revealing a widening deployment-governance gap just as AI systems become more autonomous. The Stanford 2026 AI Index found a sharp decline in model transparency (Foundation Model Transparency Index: 58→40 out of 100, a drop of roughly 31%) paired with a 55% surge in documented AI incidents (233→362), yet only 36% of organizations cite NIST RMF or ISO/IEC 42001 as influences on practice. On the ground, organizations struggle to translate risk assessments into deployment decisions. The frameworks and tools are mature. Organizational readiness to use them, and more critically to act on negative assessments, remains fundamentally constrained.
Critical evidence of assessment failure emerged in April and May: independent audits revealed that risk assessment methodology itself can systematically fail. One case documented an AI system assigning high-risk ratings to foreign-owned companies based on identity status rather than actual performance, creating a de facto double standard. A second case showed how traditional risk frameworks missed an adversarial data poisoning attack in a credit decision system: loan denials increased despite acceptable accuracy metrics. A systematic review of 77 healthcare AI governance frameworks found two-thirds address only 1–2 of 4 core assessment components, with only 13% implementing comprehensive coverage, revealing methodology fragmentation even in regulated sectors. Analyses of quantified enterprise failure rates (80% per RAND; 95% of GenAI pilots producing zero ROI) traced root causes to inadequate data readiness (7% complete), undefined success metrics (73% of failed projects), and governance treated as an afterthought. These failures demonstrate that mature frameworks and vendor tooling cannot prevent risk assessments from missing real harms. The constraint is not the availability of assessment tools; it is organizational capacity to implement them rigorously, across diverse deployment contexts, and to act decisively on adverse findings.
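The credit-decision case suggests why an accuracy floor alone can miss harm: outcomes can shift while headline accuracy stays acceptable. The sketch below is a hypothetical illustration of tracking an outcome rate alongside accuracy. The source does not describe how the attack was eventually detected, and the class name, thresholds, and sample numbers here are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowStats:
    accuracy: float     # share of decisions matching labelled outcomes
    denial_rate: float  # share of applications denied in the window


def review_flags(
    baseline: WindowStats,
    current: WindowStats,
    min_accuracy: float = 0.90,      # assumed accuracy floor
    max_denial_shift: float = 0.05,  # assumed tolerated change in denial rate
) -> list[str]:
    """Return reasons the current window needs human review (empty if none)."""
    flags = []
    if current.accuracy < min_accuracy:
        flags.append("accuracy below floor")
    shift = current.denial_rate - baseline.denial_rate
    if abs(shift) > max_denial_shift:
        flags.append(f"denial rate shifted by {shift:+.2%}")
    return flags


# Made-up numbers: accuracy still passes, but denials have climbed.
baseline = WindowStats(accuracy=0.93, denial_rate=0.20)
current = WindowStats(accuracy=0.92, denial_rate=0.31)
print(review_flags(baseline, current))
```

With these sample values the accuracy check passes and only the denial-rate check fires, which is the pattern an accuracy-only framework would not surface.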
— UNESCO's operationalized impact assessment tool covering fairness, privacy, security, safety, transparency, accountability, human rights; six-phase methodology with facilitation support demonstrates structured assessment practice at organizational scale.
— Cloud Security Alliance expands RiskRubric v2 to assess MCP servers and agentic systems with multi-evaluator ecosystem; addresses critical scope expansion as risk assessment moves beyond models to autonomous agents.
— EU Commission May 19 guidance establishes 'material influence' test for high-risk AI classification; operationalizes impact assessment methodology under Article 27 Fundamental Rights Impact Assessment; Dec 2027 compliance deadline, €35M penalties.
— OWASP GenAI Security launches two-axis governance framework (6-level deployment vs. 4-level governance maturity) mapping agent autonomy against risk controls; addresses 71% of deploying enterprises lacking formal agentic-specific governance.
— Expert-driven Delphi methodology for AI risk prioritization: 272 international experts assessed 24 risks on probability/severity; 75% judged >10% catastrophic probability under business-as-usual; demonstrates systematic, multi-stakeholder risk assessment practice.
— Treasury-authorized Feb 2026 framework with 230 mapped control objectives spanning governance, data, model lifecycle, monitoring; only 18% of banking leaders confident in AI controls audit, quantifying readiness gap in regulated sector.
— Survey of ~1000 leaders: 78% cannot pass independent AI governance audit within 90 days while 74% deployed agentic AI to core processes—strongest direct signal of organizational readiness and risk assessment capability gaps.
— Enterprise risk assessment shift: 74% cite model inaccuracy/hallucination as top AI risk (up 14 points YoY), surpassing cybersecurity; hallucination benchmarks across 26 models show 22-94% error rates—signals maturation of reliability-focused risk assessment.
2022-H1: NIST AI RMF draft published March 2022, establishing foundational framework for sociotechnical risk assessment; White House endorsement signaled government priority. Fortune 100 case study showed design-thinking approach to risk assessment in logistics. Academic research revealed both methodology innovations (RADQ) and critical gaps (220+ tools covering only partial AI lifecycle). Practice remained immature: standards-focused, with limited enterprise deployment and fragmented tooling ecosystem.
2022-H2: NIST AI RMF Playbook published (July), providing actionable guidance as framework moved toward January 2023 finalization. Credo AI demonstrated commercial traction with 3X customer growth in production governance platforms. Research continued addressing methodological gaps—IBM quantified challenges in quantitative risk assessment, Hitachi published ISO-based business process risk assessment methods, and NIST acknowledged incomplete harm classification and time-dependent risk evolution as fundamental limitations. Field at inflection point: standards solidifying, early deployment beginning, but no validated single practice yet.
2023-H1: NIST AI RMF 1.0 officially released January 2023, codifying socio-technical governance approach with four functions. Early operational adoption emerged: UK's CESIUM system demonstrated risk assessment for child safeguarding with 400% projected capacity gains; Northrop Grumman applied framework to unmanned vehicle governance. Research highlighted persistent evaluation gaps—Science journal identified weak reporting standards limiting assessment transparency; systematic mapping of 16 RAI frameworks revealed fragmentation in lifecycle and domain coverage. Framework formalization accelerated adoption but evaluation methodology maturity remained constrained.
2023-H2: Adoption accelerated across defense, public sector, and enterprise: Credo AI expanded platform deployments across financial services, life sciences, government; academia released complementary guidance (UC Berkeley Standards Profile for foundation models, GovAI analysis of safety-critical assessment techniques). Retool survey revealed mixed maturity: 75% of companies deploying AI but 50% at "fledgling" stage. Field recognized persistent tensions—breadth of assessment across lifecycle/stakeholders versus practical operationalization—and methodological gaps in domain-specific risk evaluation remained. Practice shifted from aspiration to deployment, but consistency and depth of implementation remained variable.
2024-Q1: Frameworks proliferated (Singapore Model AI Governance Framework 2024) but adoption gaps widened. Deloitte survey found only 25% of executives believed organizations highly/very highly prepared for AI governance and risk, despite three years of NIST RMF development. Technical researchers identified policy-tooling misalignment (Stanford), governance tools showed documented flaws (World Privacy Forum), and private-sector implementations remained sporadic and selective (maturity model research). Vendor investment continued—Credo AI Assist automated risk scenario/control recommendations—but organizations still lacked practical operationalization pathways. Core tension persisted: breadth of assessment versus operational feasibility and tooling adequacy.
2024-Q2: NIST released Generative AI Profile (April) with 12 novel risk categories for LLMs. International frameworks matured (UK AI Safety Institute interim report, Singapore Model AI Governance Framework update), explicitly acknowledging methodological limitations of existing risk assessment approaches. Vendor maturity advanced—Credo AI expanded Risk and Controls Library to 700+ scenarios with 400+ GenAI-specific controls—and institutional deployments emerged (University of Greenwich ARMS for academic integrity). However, organizational adoption remained constrained: Gartner survey showed only 48% of AI projects reach production and 9% of organizations focus on risk management capabilities, while industry measurement practices lacked standardization, limiting systematic risk assessment across the sector.
2024-Q3: NIST released draft misuse risk guidance for dual-use foundation models and Dioptra adversarial testing software (July). Federal government deployment accelerated with Booz Allen and Credo AI providing AI governance and risk assessment platforms to federal agencies for OMB M-24-10 compliance (September). Academic feedback (UC Berkeley) highlighted gaps in risk assessment for unacceptable harms and documentation practices. However, deployment-stage reality diverged sharply from policy ambition: Stanford reported AI incidents rose 56.4% to 233 in 2024, with McKinsey showing organizations lagged in implementing risk mitigation. Market corrections accelerated, with journalism documenting widespread enterprise failures—hallucinations, inaccuracy, liability concerns—indicating inadequate pre-deployment risk assessment and impact evaluation.
2024-Q4: Standards and tooling maturation continued with 100+ frameworks globally and NIST Generative AI Profile finalized, yet deployment-stage failures intensified, revealing the practice's core constraint. Board-level governance remained minimal (45% of boards had not addressed AI at all, 3% reporting organizational readiness). Concrete examples of assessment failure emerged: South Wales and London Metropolitan Police facial recognition systems unlawfully deployed to scan 500K+ people without consent; Rite Aid facial recognition causing false accusations with discriminatory impact. Organizational adoption bifurcated: 60% of large corporates established governance functions, yet 58% using genAI lacked controls (21-41% of users); 75% of corporate AI initiatives failed due to inadequate pre-deployment risk assessment. The field had generated comprehensive frameworks but organizations remained unable to translate them into deployment-stage go/no-go decisions.
2025-Q1: Risk assessment frameworks advanced toward operationalization: UC Berkeley published intolerable risk thresholds for frontier AI across eight risk categories; NIST finalized AI 800-1 misuse risk guidance with domain-specific extensions for cyber and CBRN; academic researchers published SAIF for systematic public sector risk evaluation. Enterprise deployments accelerated (Mastercard, others), yet adoption-governance gap persisted: Harris Poll showed 55% AI adoption vs. 42% formal policies; 49% of workers accessed company data via unsupervised tools. Public sector practitioners warned AI would fail without addressing foundational governance challenges. Standards maturity reached new levels but translation of risk assessments into deployment decisions remained the binding constraint.
2025-Q2: Vendor ecosystem matured with product integrations (Credo AI + Microsoft Azure AI Foundry enabling real-time risk evaluation and governance-to-code translation). Deployment-stage evidence emerged through Global AI Assurance Pilot (7 organizations implementing risk assessments across healthcare, finance, government sectors with use-case-specific metrics) and life sciences adoption (Castor three-tier framework aligned to EU AI Act). However, governance execution gap persisted: Pacific AI survey found 75% with policies but only 59% dedicated roles, 54% incident playbooks, 48% monitoring; only 30% deployed GenAI to production, indicating frameworks and tooling had achieved maturity but organizational readiness remained constrained. SaferAI released hierarchical methodology for operationalizing risk tiers quantitatively (harm-based and scenario-based thresholds), advancing standardization agenda. The field's core tension remained: risk identification had become routine; translating assessments into deployment decisions remained organizational bottleneck.
2025-Q3: Standards and vendor ecosystem reached maturity peak: Credo AI recognized as Forrester Wave Leader with 10x compliance acceleration; academic frameworks continued consolidating (UC Berkeley qualitative/legal risk assessment, SaferAI quantitative operationalization). However, a credibility crisis emerged as MIT Project NANDA reported 95% of GenAI investments yielded no ROI, directly implicating systemic failures in pre-deployment risk assessment. Pacific AI survey (July 2025, 351 respondents) confirmed governance-execution bifurcation: 75% policies, 59% dedicated roles, 30% production deployments. EU AI Act August 2026 deadline created regulatory urgency—compliance vendors detailed 32-56 week implementation timelines highlighting organizational readiness gaps. UC Berkeley's September analysis critiqued ROI metrics and proposed alternative evaluation frameworks, signaling debate over impact assessment methodologies. The field's core constraint hardened: mature frameworks and capable tooling could not overcome organizational inability to translate risk assessments into deployment decisions.
2025-Q4: Risk assessment transitioned from aspirational framework to infrastructure priority, yet execution gap widened into credibility crisis. Vendor ecosystem matured: Credo AI's 2025 deployments showed 2x revenue growth, 150% enterprise customer growth, 70% faster use-case reviews, 60% less manual compliance work. Regulatory drivers hardened: federal OMB M-26-04 mandate (March 2026), state laws (Colorado, California) made risk assessment contractually binding. However, independent assessment (FLI AI Safety Index, December 2025) revealed systematic governance deficiencies at leading AI companies (grades C+ to D-), confirming frameworks had outpaced organizational operationalization. Enterprise surveys documented persistent bifurcation: Deloitte (1,854 execs, October) reported minimal ROI and measurement challenges; AuditBoard showed only ~50% including risk oversight in board agendas; McKinsey: 72% with production AI but only 9% with mature governance. Practitioners described "compliance theater" where activity was tracked but real control and decision-lineage assurance remained absent. The central tension hardened into paradox: frameworks universal, vendor ROI clear, yet organizational readiness remained the immovable bottleneck.
2026-Jan: Regulatory acceleration collided with operational limits. State-level risk assessment mandates took effect (California SB 53, Texas HB 149, Illinois employment AI rules, Colorado impact assessments), converting governance from aspirational to contractual requirement. Business risk awareness surged: Allianz Risk Barometer (3,300+ professionals) elevated AI from #10 to #2 global business risk, behind only cyber incidents. However, deployment failures continued: academic analysis documented 89% of AI investments producing minimal ROI, with specific failure cases (Workday discrimination lawsuits, Mount Sinai medical diagnostics bias, Air Canada hallucinations) revealing inadequate pre-deployment risk assessment across technical, operational, and stakeholder dimensions. Adoption metrics showed persistent barriers: Moody's survey of 600 risk/compliance professionals reported 53% adoption but only 30% significant benefit realization, with data quality, expertise gaps, regulatory uncertainty, and legacy system integration cited as obstacles. Real-world pilot evidence (Australian government Microsoft Copilot trial) highlighted gaps even in structured risk assessment: reliance on end-user review, overlooked team dynamics impacts, organizational readiness limitations. Infrastructure constraints remained binding: a hackathon of 500+ builders revealed that policy frameworks (EU AI Act) lacked practical verification systems and implementation tooling, a gap participants characterized as "policy frameworks without verification systems."
2026-Feb: Vendor tooling and governance frameworks matured while deployment-stage failures revealed assessment reliability gaps. Alan Turing Institute released practical AI Use Case Framework addressing real-world enterprise barriers to safe adoption. Credo AI launched GAIA (Govern AI Assistant), automating risk identification and control recommendations to address organizational scalability bottlenecks. Gallagher survey documented 63% AI operationalization but <47% formal risk frameworks, quantifying governance-execution bifurcation. Critical assessment data emerged: synthesis of 2026 failure statistics showed 80.3% overall project failure rate (95% GenAI pilot-to-production), indicating systemic pre-deployment assessment inadequacy. CRO frameworks for financial institutions (BIS/FSB standards) and NIST AI RMF independent analysis (40-60% Fortune 500 adoption, lacking quantitative ROI evidence) demonstrated maturation of assessment practices paired with growing skepticism about effectiveness. The field's defining paradox solidified: standards and tools achieved organizational infrastructure status, yet failure rates suggested risk assessments remained unreliable at deployment decision points.
2026-Mar: Major consortium deployment and regulatory codification accelerated adoption while revealing persistent operationalization gaps. Monetary Authority of Singapore (MAS) completed Project MindForge Phase 2 (March 20), publishing AI Risk Management Operationalization Handbook developed by 24 leading financial institutions (DBS, Julius Baer, Prudential, others) with two-level assessment (organization-level and use-case-level risk materiality), documenting structured deployment of risk frameworks at enterprise scale. EU AI Act requirements for mandatory Fundamental Rights Impact Assessments (Article 27) on high-risk systems entered regulatory force, with FRA research showing 'most organisations developing or using high-risk AI systems do not yet perform structured assessments that comprehensively address fundamental rights.' Market signals showed mainstream adoption: AI TRiSM platform market grew to $3.59B (2025) with projection to $46.8B (2034, 35% CAGR) driven by 80% Fortune 500 AI deployment and regulatory compliance mandates. Deloitte's 2026 survey (3,235+ leaders) documented paradoxical 'Readiness Deception': adoption accelerating (88% use AI in at least one function, 60% worker access) while governance (30% ready), infrastructure (43%), and data management (40%) readiness all declined—revealing execution gap widening as autonomous agent deployments planned for next 2 years. AICPA/CIMA survey (1,735 executives) showed 46% classify AI as Top-10 risk but only 24-27% possess adequate governance/talent/systems readiness; AI-transformed entities report escalating risk pressure. The field's core tension sharpened: regulatory mandates and market adoption drivers had standardized risk assessment frameworks (MindForge, EU AI Act, NIST RMF), yet organizational capacity to implement them and use assessments for deployment decisions remained the binding constraint.
2026-Apr: Assessment methodology failures and governance gaps converged with growing agentic deployment pressure. Deloitte's survey of 3,235 enterprise leaders found 74% expect agentic AI deployment within two years but only 21% have mature governance models, marking a critical widening of the deployment-governance gap at the moment autonomous systems become the dominant deployment mode. Two documented assessment failures exposed framework limits: adversarial data poisoning in a credit decision system caused loan denials to rise despite acceptable accuracy metrics, remaining invisible to traditional risk frameworks; independent audit found an AI risk scoring system applying systematically higher risk ratings to foreign-owned entities based on identity rather than performance, revealing discriminatory assessment design. Anthropic's production risk assessment for the 2026 US midterms demonstrated credible methodology at scale: automated detection plus red-team stress-testing across 600+ evaluation prompts achieved 95% political neutrality and 100% compliance on Opus 4.7, providing a published benchmark for systematic pre-deployment assessment. Stanford 2026 AI Index documented concurrent deterioration in model transparency (Foundation Model Transparency Index: 58→40) and surge in documented incidents (233→362), confirming that assessment practices have not kept pace with deployment scale.
2026-May: EU Commission published Article 6 high-risk AI classification guidance (May 19) defining which systems require Fundamental Rights Impact Assessments, with penalties up to €35M or 7% global turnover and a December 2027 compliance deadline. METR's third-party frontier risk report found internal agents at major labs plausibly had means-motive-opportunity for autonomous rogue deployment, while the Future of Life Institute AI Safety Index found none of seven major AI companies scored above D in existential safety and only three test for dangerous capabilities, confirming that risk assessment frameworks have not yet translated into credible organizational safeguards at the frontier.
2026-Jun: Risk assessment frameworks achieved regulatory and institutional codification while organizational implementation maturity remained constrained. EU Commission released Article 6 high-risk AI classification guidance (May 19) establishing "material influence" test for impact assessment; December 2027 compliance deadline with €35M/7% turnover penalties. Stanford AI Index 2026 (April) documented organizational shift in risk assessment priorities: 74% of enterprises now cite inaccuracy/hallucination as top AI risk (up 14 points YoY), surpassing cybersecurity; hallucination benchmarks across 26 models show 22-94% error rates. UNESCO launched operationalized Ethical Impact Assessment tool with structured six-phase methodology and facilitation network, demonstrating mature assessment practice frameworks at international scale. MIT's peer-reviewed Delphi study of 272 international experts established systematic risk prioritization methodology: experts assessed 24 risks on probability/severity with 75% judging >10% catastrophic probability under business-as-usual, validating multi-stakeholder risk assessment as standard practice. However, organizational readiness gaps persisted: Grant Thornton survey of ~1000 leaders found 78% cannot pass independent AI governance audit within 90 days while 74% already deployed agentic AI to core processes, revealing a critical mismatch between assessment frameworks and organizational capacity. Cloud Security Alliance advanced RiskRubric v2 to assess MCP servers and agentic systems, addressing scope expansion as risk assessment moved beyond models to autonomous agents. Monetary Authority of Singapore published Project MindForge Phase 2 operationalization handbook from 24-institution consortium, documenting systematic organization-level and use-case-level risk assessment methodology in production financial services. OWASP released Enterprise Adoption Maturity Model addressing 71% of deploying enterprises lacking agentic-specific governance, with two-axis framework mapping agent autonomy against governance maturity. Treasury-authorized Financial Services AI RMF (Feb 2026) codified 230 control objectives spanning governance and data lifecycle, though only 18% of banking leaders reported confidence in audit readiness.