AI risk assessment & impact evaluation

TIER: Leading Edge

TRAJECTORY: Stalled

Structured assessment and classification of AI system risks and potential impacts on individuals, communities, and operations. Includes risk tiering frameworks and stakeholder impact mapping; distinct from AI incident tracking, which responds to actual rather than potential risks.

OVERVIEW

AI risk assessment and impact evaluation has reached an uncomfortable plateau. The practice, meaning the structured identification, classification, and mitigation of potential harms from AI systems, now has mature standards (independent analysis puts NIST AI RMF adoption at 40–60% of the Fortune 500), capable vendor tooling, and binding regulatory mandates across multiple U.S. states. Forward-leaning organisations in financial services, government, and life sciences are running real assessments. Yet most organisations deploying AI have not implemented formal risk frameworks, and project failure rates remain stubbornly high. The defining tension is no longer whether frameworks exist but whether organisations can translate risk evaluations into credible go/no-go deployment decisions. That translation problem, which is organisational rather than technical, is what keeps this practice at the leading edge rather than good practice, and it shows no clear signs of resolving.

CURRENT LANDSCAPE

Regulation and market pressure have reached an inflection point. State-level mandates and federal OMB M-26-04 now make risk assessment a contractual requirement across government and regulated sectors. Simultaneously, the market is signaling urgency: the AI Trust, Risk, and Security Management (AI TRiSM) platform market reached $3.1B in 2025 and is projected to hit $13.8B by 2030 (35% CAGR). The growth is driven by incident experience: 95% of C-suite executives report AI incidents in the past two years, and organizations without governance frameworks reported average losses of $4.4M per incident.
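The growth rate quoted above can be sanity-checked with the standard compound annual growth rate formula. The snippet below is a minimal sketch; the function name `cagr` is illustrative, and the only inputs are the two market-size figures and the five-year span stated in the text.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.35 means 35%)."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the text: $3.1B in 2025, projected $13.8B by 2030.
implied = cagr(3.1, 13.8, 2030 - 2025)
print(f"Implied CAGR: {implied:.1%}")
```

The implied rate lands within a percentage point of the quoted 35%, so the two endpoints and the stated CAGR are mutually consistent.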

Vendor tooling has matured in response. Credo AI’s GAIA and similar platforms now automate risk identification and control mapping at scale; integrations with development platforms (for example, Microsoft Azure AI Foundry) enable real-time risk evaluation within deployment workflows. Anthropic demonstrated production-grade risk assessment methodology for the 2026 US midterms, deploying automated detection systems, red-team stress testing (600+ evaluation prompts), and published benchmarks (95% political neutrality, 100% compliance). Together, these are credible evidence of the practice operating at scale.

Yet the critical-mass adoption signal is mixed. Deloitte’s April 2026 survey of 3,235 enterprise leaders found that 74% of organizations expect agentic AI deployment within two years but only 21% have mature governance models for autonomous agents, revealing a widening deployment-governance gap just as AI systems become more autonomous. The Stanford 2026 AI Index found a 31% decline in model transparency (Foundation Model Transparency Index: 58→40 out of 100) paired with a 55% surge in documented AI incidents (233→362), yet only 36% of organizations cite NIST RMF or ISO/IEC 42001 as influences on practice. On the ground, organizations struggle to translate risk assessments into deployment decisions. The frameworks and tools are mature. Organizational readiness to use them, and more critically to act on negative assessments, remains fundamentally constrained.
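The index movements cited above are simple relative changes, and it is worth computing them directly from the raw before and after values. The sketch below uses only the two pairs of figures given in the text; the helper name `percent_change` is illustrative.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (negative means a decline)."""
    return (new - old) / old * 100


# Foundation Model Transparency Index: 58 -> 40 (out of 100).
transparency = percent_change(58, 40)
# Documented AI incidents: 233 -> 362.
incidents = percent_change(233, 362)

print(f"Transparency index change: {transparency:.1f}%")
print(f"Documented incidents change: {incidents:.1f}%")
```

On these inputs the transparency index falls by roughly 31% and documented incidents rise by roughly 55%.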

Critical evidence of assessment failure emerged in April 2026: independent audits revealed that risk assessment methodology itself can systematically fail. One case documented an AI system assigning high-risk ratings to foreign-owned companies based on identity status rather than actual performance, creating a de facto double standard. A second case showed how traditional risk frameworks missed an adversarial data poisoning attack in a credit decision system: loan denials increased while accuracy metrics stayed within acceptable bounds. These failures demonstrate that mature frameworks and vendor tooling cannot prevent risk assessments from missing real harms. The constraint is not the availability of assessment tools; it is organizational capacity to implement them rigorously and act decisively on adverse findings.
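The credit decision case suggests why an accuracy threshold alone is a weak gate: outcome rates can shift while accuracy still looks acceptable. The following is a minimal sketch of one possible complementary check, a two-proportion z-test on denial rates between a baseline window and a recent window. It is not the method used in the cited case study; the function names, the thresholds, and the sample counts are all hypothetical and chosen only for illustration.

```python
import math


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """z-statistic for the difference between two proportions (x2/n2 minus x1/n1)."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 0.0  # both windows are all-denied or all-approved
    return (x2 / n2 - x1 / n1) / se


def review_needed(baseline: dict, recent: dict,
                  min_accuracy: float = 0.90, z_threshold: float = 3.0) -> dict:
    """Flag a model for review if accuracy drops OR the denial rate shifts.

    min_accuracy and z_threshold are assumed, illustrative thresholds.
    """
    accuracy_ok = recent["accuracy"] >= min_accuracy
    z = two_proportion_z(baseline["denied"], baseline["total"],
                         recent["denied"], recent["total"])
    outcome_shift = abs(z) >= z_threshold
    return {
        "accuracy_ok": accuracy_ok,
        "outcome_shift": outcome_shift,
        "escalate": (not accuracy_ok) or outcome_shift,
    }


# Made-up example: accuracy barely moves, but denials rise from 20% to 26%.
baseline = {"denied": 2000, "total": 10000, "accuracy": 0.93}
recent = {"denied": 2600, "total": 10000, "accuracy": 0.92}
result = review_needed(baseline, recent)
print(result)
```

In this made-up scenario the accuracy check passes while the outcome-rate check fails, so the combined gate escalates a model that an accuracy-only review would have cleared.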

TIER HISTORY

Research: Jan-2022 → Jul-2022

Bleeding Edge: Jul-2022 → Oct-2024

Leading Edge: Oct-2024 → present

EVIDENCE (96)

Anthropic Deploys Production Risk Assessment: 95% Political Neutrality, 600-Prompt Benchmark (Case Studies, 2026-04-25)

Vendor case study demonstrates systematic pre-deployment risk assessment methodology: automated detection, red-team stress-testing, published evaluation benchmarks (300 harmful, 300 legitimate), achieving measurable outcomes (95% neutrality, 100% compliance on Opus 4.7).

Traditional Risk Frameworks Fail for AI: Case Study of Silent Model Drift in Credit Decisions (Case Studies, 2026-04-23)

Real-world case study showing risk assessment blindness: traditional frameworks missed adversarial data poisoning attack (loan denials increased despite acceptable accuracy metrics), exposing gap between static testing and continuous model drift monitoring needs.

Deloitte's 2026 AI Enterprise Survey: 21% Governance Maturity Gap for Agentic AI (Adoption Metrics, 2026-04-20)

Large-scale survey (3,235 leaders, 24 countries) documents critical gap: 74% of enterprises expect agentic AI deployment within 2 years, but only 21% have mature governance models, signaling need for systematic risk assessment.

Stanford 2026 AI Index: 55% Decline in Model Transparency, 362 Documented Incidents (Adoption Metrics, 2026-04-20)

Industry metrics show AI incidents rose 55% (233→362, 2024 to 2025) while the Foundation Model Transparency Index dropped 31% (58→40); 36% of organizations now cite ISO/IEC 42001 and NIST RMF as governance influences.

AI TRiSM Market Reaches $3.1B; 95% of C-Suite Report AI Incidents, Average Loss $4.4M (Adoption Metrics, 2026-04-20)

Market-scale evidence: AI governance platform market growing 35%+ CAGR to $13.8B by 2030; 95% of C-suite experienced incidents, with 39% severe; organizations without governance frameworks averaged $4.4M loss per incident.

Independent Audit Exposes Structural Risk Assessment Flaw: Systematic Bias Against Foreign Companies (Case Studies, 2026-04-20)

Critical audit reveals fundamental risk assessment methodology failure: AI system assigned high-risk ratings based on foreign identity status rather than performance, creating de facto double standard and demonstrating that risk assessment frameworks can systematically fail.

MAS Partners Industry to Develop AI Risk Management Toolkit for Financial Sector (Adoption Metrics, 2026-03-23)

Monetary Authority of Singapore completed Project MindForge Phase 2 with 24 financial institutions, producing AI Risk Management Operationalization Handbook covering scope, assessment, lifecycle management, and enablers; a major consortium deployment of systematic risk assessment.

High-risk AI Fundamental Rights Assessment 2026 | Libertify.com (Industry Reports, 2026-03-21)

Comprehensive EU AI Act guidance on mandatory Fundamental Rights Impact Assessments (Article 27) for high-risk systems based on EU FRA research; shows regulatory drivers converting risk assessment from best practice to legal requirement.

HISTORY

2022-H1: NIST AI RMF draft published March 2022, establishing foundational framework for sociotechnical risk assessment; White House endorsement signaled government priority. Fortune 100 case study showed design-thinking approach to risk assessment in logistics. Academic research revealed both methodology innovations (RADQ) and critical gaps (220+ tools covering only partial AI lifecycle). Practice remained immature: standards-focused, with limited enterprise deployment and fragmented tooling ecosystem.
2022-H2: NIST AI RMF Playbook published (July), providing actionable guidance as framework moved toward January 2023 finalization. Credo AI demonstrated commercial traction with 3X customer growth in production governance platforms. Research continued addressing methodological gaps—IBM quantified challenges in quantitative risk assessment, Hitachi published ISO-based business process risk assessment methods, and NIST acknowledged incomplete harm classification and time-dependent risk evolution as fundamental limitations. Field at inflection point: standards solidifying, early deployment beginning, but no validated single practice yet.
2023-H1: NIST AI RMF 1.0 officially released January 2023, codifying socio-technical governance approach with four functions. Early operational adoption emerged: UK's CESIUM system demonstrated risk assessment for child safeguarding with 400% projected capacity gains; Northrop Grumman applied framework to unmanned vehicle governance. Research highlighted persistent evaluation gaps—Science journal identified weak reporting standards limiting assessment transparency; systematic mapping of 16 RAI frameworks revealed fragmentation in lifecycle and domain coverage. Framework formalization accelerated adoption but evaluation methodology maturity remained constrained.
2023-H2: Adoption accelerated across defense, public sector, and enterprise: Credo AI expanded platform deployments across financial services, life sciences, government; academia released complementary guidance (UC Berkeley Standards Profile for foundation models, GovAI analysis of safety-critical assessment techniques). Retool survey revealed mixed maturity: 75% of companies deploying AI but 50% at "fledgling" stage. Field recognized persistent tensions—breadth of assessment across lifecycle/stakeholders versus practical operationalization—and methodological gaps in domain-specific risk evaluation remained. Practice shifted from aspiration to deployment, but consistency and depth of implementation remained variable.
2024-Q1: Frameworks proliferated (Singapore Model AI Governance Framework 2024) but adoption gaps widened. Deloitte survey found only 25% of executives believed organizations highly/very highly prepared for AI governance and risk, despite three years of NIST RMF development. Technical researchers identified policy-tooling misalignment (Stanford), governance tools showed documented flaws (World Privacy Forum), and private-sector implementations remained sporadic and selective (maturity model research). Vendor investment continued—Credo AI Assist automated risk scenario/control recommendations—but organizations still lacked practical operationalization pathways. Core tension persisted: breadth of assessment versus operational feasibility and tooling adequacy.
2024-Q2: NIST released Generative AI Profile (April) with 12 novel risk categories for LLMs. International frameworks matured (UK AI Safety Institute interim report, Singapore Model AI Governance Framework update), explicitly acknowledging methodological limitations of existing risk assessment approaches. Vendor maturity advanced—Credo AI expanded Risk and Controls Library to 700+ scenarios with 400+ GenAI-specific controls—and institutional deployments emerged (University of Greenwich ARMS for academic integrity). However, organizational adoption remained constrained: Gartner survey showed only 48% of AI projects reach production and 9% of organizations focus on risk management capabilities, while industry measurement practices lacked standardization, limiting systematic risk assessment across the sector.
2024-Q3: NIST released draft misuse risk guidance for dual-use foundation models and Dioptra adversarial testing software (July). Federal government deployment accelerated with Booz Allen and Credo AI providing AI governance and risk assessment platforms to federal agencies for OMB M-24-10 compliance (September). Academic feedback (UC Berkeley) highlighted gaps in risk assessment for unacceptable harms and documentation practices. However, deployment-stage reality diverged sharply from policy ambition: Stanford reported AI incidents rose 56.4% to 233 in 2024, with McKinsey showing organizations lagged in implementing risk mitigation. Market corrections accelerated, with journalism documenting widespread enterprise failures—hallucinations, inaccuracy, liability concerns—indicating inadequate pre-deployment risk assessment and impact evaluation.
2024-Q4: Standards and tooling maturation continued with 100+ frameworks globally and NIST Generative AI Profile finalized, yet deployment-stage failures intensified, revealing the practice's core constraint. Board-level governance remained minimal (45% of boards had not addressed AI at all, 3% reporting organizational readiness). Concrete examples of assessment failure emerged: South Wales and London Metropolitan Police facial recognition systems unlawfully deployed to scan 500K+ people without consent; Rite Aid facial recognition causing false accusations with discriminatory impact. Organizational adoption bifurcated: 60% of large corporates established governance functions, yet 58% using genAI lacked controls (21-41% of users); 75% of corporate AI initiatives failed due to inadequate pre-deployment risk assessment. The field had generated comprehensive frameworks but organizations remained unable to translate them into deployment-stage go/no-go decisions.
2025-Q1: Risk assessment frameworks advanced toward operationalization: UC Berkeley published intolerable risk thresholds for frontier AI across eight risk categories; NIST finalized AI 800-1 misuse risk guidance with domain-specific extensions for cyber and CBRN; academic researchers published SAIF for systematic public sector risk evaluation. Enterprise deployments accelerated (Mastercard, others), yet adoption-governance gap persisted: Harris Poll showed 55% AI adoption vs. 42% formal policies; 49% of workers accessed company data via unsupervised tools. Public sector practitioners warned AI would fail without addressing foundational governance challenges. Standards maturity reached new levels but translation of risk assessments into deployment decisions remained the binding constraint.
2025-Q2: Vendor ecosystem matured with product integrations (Credo AI + Microsoft Azure AI Foundry enabling real-time risk evaluation and governance-to-code translation). Deployment-stage evidence emerged through Global AI Assurance Pilot (7 organizations implementing risk assessments across healthcare, finance, government sectors with use-case-specific metrics) and life sciences adoption (Castor three-tier framework aligned to EU AI Act). However, governance execution gap persisted: Pacific AI survey found 75% with policies but only 59% dedicated roles, 54% incident playbooks, 48% monitoring; only 30% deployed GenAI to production, indicating frameworks and tooling had achieved maturity but organizational readiness remained constrained. SaferAI released hierarchical methodology for operationalizing risk tiers quantitatively (harm-based and scenario-based thresholds), advancing standardization agenda. The field's core tension remained: risk identification had become routine; translating assessments into deployment decisions remained organizational bottleneck.
2025-Q3: Standards and vendor ecosystem reached maturity peak: Credo AI recognized as Forrester Wave Leader with 10x compliance acceleration; academic frameworks continued consolidating (UC Berkeley qualitative/legal risk assessment, SaferAI quantitative operationalization). However, a credibility crisis emerged as MIT Project NANDA reported 95% of GenAI investments yielded no ROI, directly implicating systemic failures in pre-deployment risk assessment. Pacific AI survey (July 2025, 351 respondents) confirmed governance-execution bifurcation: 75% policies, 59% dedicated roles, 30% production deployments. EU AI Act August 2026 deadline created regulatory urgency—compliance vendors detailed 32-56 week implementation timelines highlighting organizational readiness gaps. UC Berkeley's September analysis critiqued ROI metrics and proposed alternative evaluation frameworks, signaling debate over impact assessment methodologies. The field's core constraint hardened: mature frameworks and capable tooling could not overcome organizational inability to translate risk assessments into deployment decisions.
2025-Q4: Risk assessment transitioned from aspirational framework to infrastructure priority, yet execution gap widened into credibility crisis. Vendor ecosystem matured: Credo AI's 2025 deployments showed 2x revenue growth, 150% enterprise customer growth, 70% faster use-case reviews, 60% less manual compliance work. Regulatory drivers hardened: federal OMB M-26-04 mandate (March 2026), state laws (Colorado, California) made risk assessment contractually binding. However, independent assessment (FLI AI Safety Index, December 2025) revealed systematic governance deficiencies at leading AI companies (grades C+ to D-), confirming frameworks had outpaced organizational operationalization. Enterprise surveys documented persistent bifurcation: Deloitte (1,854 execs, October) reported minimal ROI and measurement challenges; AuditBoard showed only ~50% including risk oversight in board agendas; McKinsey: 72% with production AI but only 9% with mature governance. Practitioners described "compliance theater" where activity was tracked but real control and decision-lineage assurance remained absent. The central tension hardened into paradox: frameworks universal, vendor ROI clear, yet organizational readiness remained the immovable bottleneck.
2026-Jan: Regulatory acceleration collided with operational limits. State-level risk assessment mandates took effect (California SB 53, Texas HB 149, Illinois employment AI rules, Colorado impact assessments), converting governance from aspirational to contractual requirement. Business risk awareness surged: Allianz Risk Barometer (3,300+ professionals) elevated AI from #10 to #2 global business risk, behind only cyber incidents. However, deployment failures continued: academic analysis documented 89% of AI investments producing minimal ROI, with specific failure cases (Workday discrimination lawsuits, Mount Sinai medical diagnostics bias, Air Canada hallucinations) revealing inadequate pre-deployment risk assessment across technical, operational, and stakeholder dimensions. Adoption metrics showed persistent barriers: Moody's survey of 600 risk/compliance professionals reported 53% adoption but only 30% significant benefit realization, with data quality, expertise gaps, regulatory uncertainty, and legacy system integration cited as obstacles. Real-world pilot evidence (Australian government Microsoft Copilot trial) highlighted gaps even in structured risk assessment—reliance on end-user review, overlooked team dynamics impacts, organizational readiness limitations. Infrastructure constraints remained binding: hackathon of 500+ builders revealed policy frameworks (EU AI Act) lacked practical verification systems and implementation tooling, with "policy frameworks without verification systems" characterizing the gap.
2026-Feb: Vendor tooling and governance frameworks matured while deployment-stage failures revealed assessment reliability gaps. Alan Turing Institute released practical AI Use Case Framework addressing real-world enterprise barriers to safe adoption. Credo AI launched GAIA (Govern AI Assistant), automating risk identification and control recommendations to address organizational scalability bottlenecks. Gallagher survey documented 63% AI operationalization but <47% formal risk frameworks, quantifying governance-execution bifurcation. Critical assessment data emerged: synthesis of 2026 failure statistics showed 80.3% overall project failure rate (95% GenAI pilot-to-production), indicating systemic pre-deployment assessment inadequacy. CRO frameworks for financial institutions (BIS/FSB standards) and NIST AI RMF independent analysis (40-60% Fortune 500 adoption, lacking quantitative ROI evidence) demonstrated maturation of assessment practices paired with growing skepticism about effectiveness. The field's defining paradox solidified: standards and tools achieved organizational infrastructure status, yet failure rates suggested risk assessments remained unreliable at deployment decision points.
2026-Mar: Major consortium deployment and regulatory codification accelerated adoption while revealing persistent operationalization gaps. Monetary Authority of Singapore (MAS) completed Project MindForge Phase 2 (March 20), publishing AI Risk Management Operationalization Handbook developed by 24 leading financial institutions (DBS, Julius Baer, Prudential, others) with two-level assessment (organization-level and use-case-level risk materiality), documenting structured deployment of risk frameworks at enterprise scale. EU AI Act requirements for mandatory Fundamental Rights Impact Assessments (Article 27) on high-risk systems entered regulatory force, with FRA research showing 'most organisations developing or using high-risk AI systems do not yet perform structured assessments that comprehensively address fundamental rights.' Market signals showed mainstream adoption: AI TRiSM platform market grew to $3.59B (2025) with projection to $46.8B (2034, 35% CAGR) driven by 80% Fortune 500 AI deployment and regulatory compliance mandates. Deloitte's 2026 survey (3,235+ leaders) documented paradoxical 'Readiness Deception': adoption accelerating (88% use AI in at least one function, 60% worker access) while governance (30% ready), infrastructure (43%), and data management (40%) readiness all declined—revealing execution gap widening as autonomous agent deployments planned for next 2 years. AICPA/CIMA survey (1,735 executives) showed 46% classify AI as Top-10 risk but only 24-27% possess adequate governance/talent/systems readiness; AI-transformed entities report escalating risk pressure. The field's core tension sharpened: regulatory mandates and market adoption drivers had standardized risk assessment frameworks (MindForge, EU AI Act, NIST RMF), yet organizational capacity to implement them and use assessments for deployment decisions remained the binding constraint.
2026-Apr: Assessment methodology failures and governance gaps converged with growing agentic deployment pressure. Deloitte's survey of 3,235 enterprise leaders found 74% expect agentic AI deployment within two years but only 21% have mature governance models, marking a critical widening of the deployment-governance gap at the moment autonomous systems become the dominant deployment mode. Two documented assessment failures exposed framework limits: adversarial data poisoning in a credit decision system caused loan denials to rise despite acceptable accuracy metrics, remaining invisible to traditional risk frameworks; independent audit found an AI risk scoring system applying systematically higher risk ratings to foreign-owned entities based on identity rather than performance, revealing discriminatory assessment design. Anthropic's production risk assessment for the 2026 US midterms demonstrated credible methodology at scale: automated detection plus red-team stress-testing across 600+ evaluation prompts achieved 95% political neutrality and 100% compliance on Opus 4.7, providing a published benchmark for systematic pre-deployment assessment. Stanford 2026 AI Index documented concurrent deterioration in model transparency (Foundation Model Transparency Index: 58→40) and surge in documented incidents (233→362), confirming that assessment practices have not kept pace with deployment scale.