Perly Consulting │ Beck Eco

The State of Play

A living index of AI adoption across industries — where established practice meets the bleeding edge
UPDATED DAILY

The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.

The Daily Dispatch

A daily newsletter distilling the past two weeks of movement in a domain or two — delivered to your inbox while the index updates in the background.

AI Maturity by Domain

Each dot marks the weighted maturity of practices within a domain.

DOMAIN
BLEEDING EDGE → ESTABLISHED

Multimodal content generation

GOOD PRACTICE

TRAJECTORY

Stalled

AI that generates integrated multi-format content combining text, images, and layout in a single workflow. Includes newsletter generation and social card creation; distinct from content repurposing which adapts existing content rather than generating multimodal output from scratch.

OVERVIEW

Multimodal content generation (AI systems that orchestrate text, image, video, and layout production within a single workflow) has consolidated as a foundational production capability; enterprise adoption, governance infrastructure, and systemic risk assessment now define the practice's maturity boundary. The practice sits at the inflection between broad adoption (88% of organizations deployed AI in at least one function in 2025; 46.9% of creators generate visual content) and enterprise risk management. Two complementary forces now shape deployment: agentic orchestration that enables multi-format workflows at speed (Adobe Journey Optimizer, Firefly AI Assistant, Google Gemini Omni), and regulatory enforcement (EU AI Act obligations from August 2026, FTC disclosure rules, platform-level compliance frameworks) that forces operational accountability. Hallucination and deepfake risk remain material adoption barriers: independent testing shows Gemini Omni generating misleading synthetic content in 70% of adversarial test cases, hallucination rates doubled between 2024 and 2025 (18% to 35%), and consumer trust fell from 60% to 26%. The defining tension is now production velocity versus governance liability. Organizations generating hundreds of campaign variations weekly face compliance verification bottlenecks: content authenticity watermarking (C2PA, SynthID), IP/copyright validation, and disclosure labeling across platforms. Firefly AI Assistant shows quality limitations in production workflows (poor object composition, inconsistent blending), while the economic drivers remain compelling ($3 per AI asset versus $150-400 for human production). The practice's next frontier is not capability expansion but governance automation, safety verification, and liability mitigation infrastructure for production-scale deployments.
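The compliance verification bottleneck described above can be pictured as a pre-publication gate that every generated asset must clear. The following is a minimal sketch only: the `Asset` fields, the check names, and `split_batch` are hypothetical names introduced for illustration, and the boolean flags stand in for real provenance, IP, and labeling checks that are not implemented here.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    """One generated asset awaiting publication (all fields are hypothetical)."""
    asset_id: str
    has_provenance_manifest: bool  # assumed flag: a C2PA manifest or SynthID watermark is attached
    ip_cleared: bool               # assumed flag: IP/copyright validation has passed
    has_disclosure_label: bool     # assumed flag: platform-facing AI disclosure label is set


def missing_checks(asset: Asset) -> list[str]:
    """Return the names of the compliance checks this asset has not yet passed."""
    checks = {
        "provenance": asset.has_provenance_manifest,
        "ip_clearance": asset.ip_cleared,
        "disclosure_label": asset.has_disclosure_label,
    }
    return [name for name, passed in checks.items() if not passed]


def split_batch(assets: list[Asset]) -> tuple[list[Asset], dict[str, list[str]]]:
    """Split a batch into publishable assets and a map of blocked asset IDs to their gaps."""
    publishable: list[Asset] = []
    blocked: dict[str, list[str]] = {}
    for asset in assets:
        gaps = missing_checks(asset)
        if gaps:
            blocked[asset.asset_id] = gaps
        else:
            publishable.append(asset)
    return publishable, blocked
```

With hundreds of variations per week, the size of the `blocked` map is one way to see where verification, rather than generation, limits throughput.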

CURRENT LANDSCAPE

Platform orchestration and team-scale economics define the mid-2026 frontier. Adobe's Firefly AI Assistant (April 2026, public beta April 27) consolidates Photoshop, Premiere Pro, Lightroom, Illustrator, Express, and 30+ partner models (Kling 3.0, Veo 3.1, ElevenLabs v2) into conversational workflows. May 2026 extensions: Firefly Creator team tiers (Pro, Pro Plus, Premium) remove Creative Cloud subscription barriers for SMB deployment; Adobe's Gemini connector (May 19) exposes orchestration to hundreds of millions of users. Parallel momentum: Google Gemini Omni (May 20, 2026) demonstrates production-ready unified multimodal architecture—text, image, audio, video in single forward pass—with physics reasoning and character consistency across turns. Rollout to 900M+ Gemini users plus enterprise API (June 2026). Both platforms shift from tool-centric to outcome-centric workflows (natural-language briefs), removing workflow friction. Adobe Journey Optimizer AI Assistant (June 2026 GA) brings multimodal generation (text + image variants) directly into marketing automation for email, push, web, and SMS channels—demonstrating orchestration at enterprise scale.

Competitive models are converging rapidly on capability parity while production quality gaps persist. Luma Uni-1 (May 2026) introduces spatial reasoning, multi-reference generation, and culture-aware aesthetics (photorealistic, manga, webtoon). ByteDance Inset scales interleaved text-image generation with 15M synthetic samples. JoyAI-Image achieves state-of-the-art results on instruction-guided editing. However, independent testing reveals critical quality and safety barriers: The Verge describes Firefly AI Assistant output as "mediocre design intern" quality, with poor object composition and inconsistent blending; NewsGuard testing found that Gemini Omni produced misleading deepfakes in 70% of adversarial test cases. Creator ecosystem specialization persists (Midjourney for aesthetics, Firefly for brand-safe production, FLUX for multi-reference work), signaling mature market segmentation by use case.

Adoption now spans the enterprise mainstream and the creator economy baseline. Enterprise: a WFA study (April 2026) documents that 78% of multinational brands deploy AI-generated multimodal content in production (87% product images, 80% copy, 77% backgrounds). Estée Lauder Companies (100+ brands, 150 countries) operationalized Firefly Services APIs for batch asset generation (resizing, formatting, localization) across hundreds of thousands of annual assets. Creator economy: 57.3% use AI daily, 46.9% generate visual content, and 84% use AI for email marketing (550 creators surveyed). Market scale: Firefly $250M+ ARR (Q1 2026) and 22B assets generated by April 2025; multimodal market $3.6B (2026) → $11.7B (2033, 18.2% CAGR).
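Market projections quoted as a start value, an end value, and a CAGR can be sanity-checked with the standard compound-growth formula. The sketch below applies it to the $3.6B (2026) → $11.7B (2033) figure; the function names are illustrative, and the seven-year window is an assumption read off the two quoted years.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by a start value, end value, and span in years."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, cagr: float, years: int) -> float:
    """Value reached after compounding start_value at the given annual rate."""
    return start_value * (1 + cagr) ** years


# Multimodal market figures quoted above, in billions of USD.
rate = implied_cagr(3.6, 11.7, years=2033 - 2026)
end_at_quoted_rate = project(3.6, 0.182, years=2033 - 2026)
print(f"implied CAGR: {rate:.1%}; 2033 value at 18.2%: ${end_at_quoted_rate:.1f}B")
```

Under these assumptions the implied rate comes out at roughly 18.3%, and compounding at the quoted 18.2% lands near $11.6B, so the quoted figures are internally consistent to within rounding.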

Production governance infrastructure now gates adoption more than capability maturity does. The regulatory framework has crystallized: EU AI Act labeling obligations (applicable from August 2, 2026) mandate machine-readable and visible labeling for synthetic media; FTC enforcement (Operation AI Comply) targets undisclosed synthetic content (penalties of up to $51.7K per violation); California, Texas, and China all enforce disclosure requirements. Platform enforcement: Meta's multimodal compliance framework flags 73% of US DTC brand ads; TikTok, YouTube, X, and LinkedIn enforce AI-generated content disclosure. Deepfake and safety risk is documented: Gemini Omni generates misleading synthetic video in 70% of adversarial test cases; hallucination doubled between 2024 and 2025 (18% → 35%); consumer trust collapsed from 60% to 26%. Production engineering bottlenecks: VRAM constraints (Flux needs 24GB+), NSFW classifier false positives (85%, with 2-3x demographic bias), compliance watermarking overhead (10-50ms per image), and long-sequence generation degradation. Yet cost economics remain compelling for risk-tolerant organizations: $3 per AI asset versus $150-400 for human production, with 60-75% time savings and $3.70 returned per dollar invested. The adoption ceiling is now governance liability and verification infrastructure, not capability; compliance automation, AI detection, and content authenticity verification remain critical gaps blocking adoption in risk-averse enterprise sectors.
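Two of the figures above lend themselves to quick arithmetic: the per-asset cost gap and the watermarking overhead at batch scale. The sketch below is illustrative only. The function names are hypothetical, the 1,000-image batch is an assumed size rather than a figure from the index, and watermarking is assumed to run sequentially. It does not attempt to reproduce the quoted time-savings or ROI figures, which depend on inputs not given here.

```python
def savings_range(ai_cost: float, human_low: float, human_high: float) -> tuple[float, float]:
    """Fractional per-asset cost reduction against the low and high human-production costs."""
    return (1 - ai_cost / human_low, 1 - ai_cost / human_high)


def watermark_overhead_seconds(n_images: int, ms_low: float = 10, ms_high: float = 50) -> tuple[float, float]:
    """Total sequential watermarking time in seconds for a batch, at the low and high per-image cost."""
    return (n_images * ms_low / 1000, n_images * ms_high / 1000)


# $3 per AI asset versus $150-400 for human production (figures quoted above).
low, high = savings_range(3, 150, 400)
print(f"per-asset cost reduction: {low:.1%} to {high:.2%}")

# Hypothetical batch of 1,000 images at 10-50ms of watermarking each.
print(watermark_overhead_seconds(1000))
```

On these numbers the watermarking step itself adds seconds, not hours, to a batch of that size, which is consistent with the text's point that the heavier burden lies in verification and liability rather than raw compute.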

TIER HISTORY

Research: Jun-2023 → Jul-2023
Bleeding Edge: Jul-2023 → Jul-2024
Leading Edge: Jul-2024 → Apr-2026
Good Practice: Apr-2026 → present
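The tier history can be read as a list of dated intervals, which makes it easy to see how long the practice spent at each maturity level. The sketch below is one way to compute that; the helper names are illustrative, and the `as_of` date used for the open-ended "present" tier is an assumption (June 2026, the most recent period covered in the history).

```python
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

# Tier intervals as listed above; None marks the open-ended "present".
TIER_HISTORY = [
    ("Research", "Jun-2023", "Jul-2023"),
    ("Bleeding Edge", "Jul-2023", "Jul-2024"),
    ("Leading Edge", "Jul-2024", "Apr-2026"),
    ("Good Practice", "Apr-2026", None),
]


def parse_month(label: str) -> date:
    """Parse a label such as 'Jun-2023' into the first day of that month."""
    month_name, year = label.split("-")
    return date(int(year), MONTHS[month_name], 1)


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def tier_durations(history, as_of: date) -> dict[str, int]:
    """Months spent in each tier; an open-ended tier is measured up to as_of."""
    durations = {}
    for tier, start, end in history:
        end_date = parse_month(end) if end else as_of
        durations[tier] = months_between(parse_month(start), end_date)
    return durations


print(tier_durations(TIER_HISTORY, as_of=date(2026, 6, 1)))  # as_of is an assumed date
```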

EVIDENCE (125)

— Enterprise-scale multimodal deployment: 151 print assets, 173 videos across 24 screens, 100+ Firefly images coordinated through unified visual system with 1.19TB data; demonstrates production maturity at event scale.

— Adobe advances agentic multimodal orchestration with brand kit generation, product video creation, storyboard-to-video, and persistent creative context across sessions, demonstrating workflow-level maturity.

— Market report segments multimodal content generation as distinct category within AI content market ($6B 2026 → $45B 2033, 35% CAGR); validates analyst recognition of multimodal as established market segment.

— Adobe survey of 16,000+ creators: 87% report faster business/audience growth, 75% say creative AI essential to workflow; validates creator-economy adoption at scale with measurable business outcomes.

— Technical architecture: unified native multimodal (single backbone) vs prior cascading approach; produces synchronized output (video, audio, echo) in single API call; conversational editing maintains physics consistency.

— Independent specification mapping: Omni Flash caps at 720p/10s on app, 4K/10s on API; real usage ceiling ~5-6 full generations/day; deepfake risk prevents audio editing of existing video.

— Market synthesis: content marketing $524.73B (2025) → $989.84B (2030); enterprise GenAI adoption 73%, cost reduction 68%; video shifting to 45% of budgets signals multimodal shift at enterprise scale.

— Enterprise multimodal market $7.8B (2025) → $108.4B (2034), 38.2% CAGR; documents 55-70% automation rates in BFSI/healthcare workflows replacing single-modality systems; production infrastructure maturation signal.

HISTORY

  • 2023-H1: Adobe integrated Firefly into Express and Creative Cloud apps, enabling unified multi-format content workflows. Multimodal LLMs reached research maturity with emerging applications in education. Systematic failure analysis revealed reliability challenges in orchestrating multiple generative modalities.

  • 2023-H2: Adobe expanded Express with multimodal AI features (Generative Fill, Generate Template, Translate, TikTok integration) serving millions. Text-to-image generation reached 150B+ annual production (Adobe Firefly alone hit 1B assets in 3 months). However, only 10% of organizations achieved production GenAI deployments; infrastructure, compliance, and IP risk remained barriers despite 12% ROI in content marketing early adopters.

  • 2024-Q1: Enterprise adoption accelerated: 65% of enterprises adopted generative AI (vs. 11% in 2023); Firefly reached 6.5B images; 83% of creative professionals reported using generative AI. Multimodal market projected to reach $19.85B by 2032 (34.4% CAGR). However, governance emerged as primary constraint: academic research and WHO/Microsoft analyses documented hallucination failures, bias risks, and unintended harms requiring transparency, provenance standards, and regulatory frameworks. Data security and IP risk remained top adoption barriers for risk-averse sectors.

  • 2024-Q2: Real-world deployment validation: a Midjourney-powered newsletter reached $30K annual revenue; Amorepacific confirmed cost/time efficiency with Firefly for product marketing. Vendor competition intensified with Firefly Image 3 and DALL-E 3 refinements, but quality variance persisted. Governance barriers hardened: academic benchmarks revealed persistent failures in scientific visualization (text, spatial, numeric errors); research documented exacerbated bias in models like CLIP and Stable Diffusion. Content authenticity standards and regulatory clarity remained undefined; older offerings (DALL-E 2) were discontinued as the market consolidated.

  • 2024-Q3: Multimodal content generation transitioned to mainstream production deployment. Adobe expanded Firefly to video generation (announced September 2024) and launched Content Analytics for measuring AI-generated content performance. Enterprise adoption accelerated with 49% of Australian businesses creating social media content with AI (projected 61% by 2026). Cloud platforms published reference architectures normalizing deployment patterns. Governance remained the primary barrier: compliance, content authenticity standards, and IP risk persisted as constraints despite technology maturity.

  • 2024-Q4: Multimodal content generation solidified as production baseline with 79% of marketers using GenAI for content tasks; Firefly reached 13+ billion images generated; video generation entered beta. However, production reliability challenges surfaced: academic research documented persistent hallucinations and object composition failures across multimodal models, while enterprise platform integration issues (DALL-E 3 API errors on Azure) constrained adoption in regulated sectors. Compliance uncertainty remained the primary barrier despite widespread capability maturity.

  • 2025-Q1: Adobe expanded Firefly Services with APIs and Custom Models for enterprise personalized content production at scale (March 2025). Market validation: multimodal AI market reached $1.6B with sustained growth; 89% of AI search queries incorporate visual elements, confirming production-scale adoption. Vendor services consolidation shifted from point tools to enterprise platforms enabling multi-format workflows at scale.

  • 2025-Q2: Multimodal content generation reached 16+ billion generated artifacts as a production infrastructure standard. Creator adoption continued advancing, with 83% of content creators incorporating AI (compared with 79% of marketers in Q4 2024). Adobe reported 700M+ monthly active users with Digital Media ARR growth to $4.35B+ (12% YoY). However, production monetization headwinds emerged: Firefly video (beta since September 2024) faced user backlash over its aggressive paywall structure, quality issues (temporal artifacts), and free competitor advantages (Pika, Luma). Consumer sentiment remained mixed: 55% were uncomfortable with AI-generated media; 33% of creators feared replacement. Governance and IP frameworks remained undefined across jurisdictions.

  • 2025-Q4: Multimodal content generation solidified as critical infrastructure with 20B+ total Firefly generations and 70M+ freemium users (+35% YoY). Adobe achieved $5B+ AI-influenced ARR with the Firefly 4 launch (10x faster), custom models enabling $7M+ enterprise revenue per customer, and a December 2025 integration into ChatGPT. Enterprise adoption reached 82% weekly AI usage with 72% measuring ROI, but monetization headwinds persisted (video paywall backlash) and reliability constraints remained (object composition failures, 76% error rate in multi-object tasks). Competitor emergence: Google Gemini and Veo gained creator preference, though larger organizations stayed on Adobe/Midjourney. Compliance and authenticity standards remained primary adoption barriers despite enterprise copyright indemnification.

  • 2026-Jan: Adobe Firefly Foundry launched with Fortune 100 partnerships (Disney, CAA, B5 Studios) for brand-specific model tuning. Enterprise deployment accelerated: 65–78% of large enterprises testing/deploying multimodal AI, 34M images daily, 72% of companies integrating AI tools into marketing with 30% sales impact. Cloud platforms embedded multimodal as default (AWS Bedrock GA, Google Gemini expansion, ChatGPT integration). However, quality bifurcation widened: Adobe/Midjourney delivered reliable commercial output while DALL-E regressed in artistic capability and composition reliability declined. Consumer trust remained fragile (55% uncomfortable with AI media). Production constraints for scientific, spatial, textual content remained unsolved despite accelerating enterprise adoption.

  • 2026-Feb: Adobe expanded Firefly to unlimited generations, signaling platform maturity for the creator baseline (86% use creative AI daily). Market validation confirmed: the AI content production market reached $1.5B (2025) with 17.3% CAGR to $5.4B by 2033; 40% of digital content outputs are now AI-generated across advertising and e-commerce. However, architectural limitations emerged: multimodal LLMs suffer from a "mismatched decoder" problem in which non-textual data is treated as noise (removing 64-71% of the variance improved performance), hallucination surveys documented persistent failures across image, video, and audio modalities, and annotation data quality remained a critical barrier. Real-world deployment analysis revealed three critical hurdles: token cost explosions (multi-step pipelines unsustainable at scale), latency (6-15+ seconds harming UX), and accuracy bottlenecks (hallucinations create liability in high-stakes applications). A newsletter automation case study (85% cost reduction, 320% revenue gains) validated niche multimodal workflows while broader reliability remained constrained.

  • 2026-Mar: Adobe Q1 FY2026 confirmed the practice at production scale: Firefly ARR exceeded $250M (+75% QoQ), video generative actions grew 8x year-over-year, and audio doubled, while enterprise adoption reached 60% of applications combining two or more modalities. Production-ready multimodal agents now deliver integrated campaigns (copy, image, video, audio from a single brief) at $50-200 per video versus $5-15K agency baseline, with 60-75% production time reduction. Research identified a long-sequence reliability constraint in unified multimodal models — text-image interleaving quality degrades as sequences grow — and multimodal AI market projections put 2026 at $2.83B growing to $8.24B by 2030 at 30.6% CAGR.

  • 2026-Apr: Agentic orchestration milestone: Adobe launched Firefly AI Assistant (April 15, 2026), consolidating Photoshop, Premiere, Lightroom, Illustrator, Express, and 30+ partner models into a single conversational interface that orchestrates multi-step workflows; NBCUniversal deployed it to 2,000+ creatives, compressing brief-to-campaign from 3 weeks to under 10 minutes. WFA study documents 78% of multinational brands using AI-generated multimodal content in production; global AI content generation market reached $26.9B (2026), projected $168.7B by 2034 (25.8% CAGR). However, hallucination doubled year-over-year to 35% (2025) with frontier models spiking to 18.7% on legal queries; 85% of consumers report uncanny-valley reactions to AI-generated content and trust dropped from 60% to 26%, establishing quality assurance as the defining ceiling for the practice's next adoption wave.

  • 2026-May: Platform and model capabilities advanced on two fronts. Adobe expanded Firefly for team-scale SMB deployment with three Creator team tiers (Pro/Pro Plus/Premium) that remove the full Creative Cloud subscription requirement, directly addressing the cost barrier for non-enterprise adopters; Adobe Brand Intelligence (April 20) added on-brand multimodal assembly validation. Competitor models converged rapidly: Luma's Uni-1 introduced spatial reasoning, multi-reference consistency, and culture-aware generation (photorealistic, manga, webtoon); ByteDance's Inset scaled interleaved text-image generation with 15M synthetic training samples; JoyAI-Image reached SOTA on visual understanding and instruction-guided editing. Google's Gemini Omni (announced May 20, 2026) demonstrated multimodal deepfake capability at scale, raising governance stakes alongside physics-aware generation and SynthID watermarking. Fal.ai's independent market analysis documented that 88% of organizations deployed AI in at least one function in 2025, image models at production-quality benchmarks, video generation now table-stakes with physics simulation, and audio production-ready at scale — vendor-independent confirmation of practice-wide production maturity. Market sized at $3.6B (2026) projected to $11.7B by 2033 (18.2% CAGR). Platform-scale enforcement confirmed adoption stakes: Meta's multimodal compliance framework (visual semantics, NLP, audio fingerprinting) flagged 73% of US DTC brand ads, demonstrating that compliance infrastructure is now a live operational constraint rather than a future planning item. Production engineering friction remained the structural ceiling: Flux requires 24GB VRAM, NSFW classifiers show 85% false-positive rates with 2–3x demographic bias, and compliance watermarking adds 10–50ms per-image overhead — confirming that deployment infrastructure, not model capability, is the binding constraint for scaled production adoption.

  • 2026-Jun: Agentic orchestration and governance enforcement define the period. Adobe Firefly AI Assistant advanced (June 18) with brand kit generation, product-to-video workflows, storyboard automation, and persistent creative context across sessions; a Cannes Lions 2026 deployment confirmed production scale (151 print deliverables, 173 videos across 24 screens, 100+ Firefly assets, 1.19TB coordinated through a unified visual system). Adobe Journey Optimizer AI Assistant reached GA (June 5), embedding multimodal generation in enterprise marketing automation (email, push, web, SMS). Google Gemini Omni reached general availability (June 2026) with a 900M+ user rollout plus an enterprise API; its native any-to-any single-backbone architecture (text, image, audio, video in synchronized output) replaces cascading pipelines, though real-world limits include 720p/10s on the app and ~5-6 full generations/day. Market segmentation crystallized: Coherent Market Insights segments multimodal as a distinct category within the $6B (2026) AI content market growing to $45B (2033, 35% CAGR); the enterprise multimodal foundation model market is projected to grow from $7.8B to $108.4B (2034). Creator adoption: a 16,000+ creator survey shows 87% report business growth with AI and 75% call it essential. Critical adoption barriers remain: NewsGuard testing shows Gemini Omni generating misleading deepfakes in 70% of adversarial cases; The Verge assessed Firefly AI Assistant as "mediocre design intern" quality; hallucination doubled YoY to 35%; consumer trust collapsed 60% → 26%. Governance framework crystallized: EU AI Act enforcement begins August 2, 2026; FTC Operation AI Comply targets undisclosed synthetic content (up to $51.7K per violation); Meta flagged 73% of US DTC brand ads under its multimodal compliance framework. Production velocity versus governance liability is the binding constraint: compliance automation, content authenticity verification (C2PA, SynthID), and safety infrastructure, not capability gaps, remain the critical barriers to risk-averse enterprise adoption.

TOOLS