Perly Consulting │ Beck Eco

The State of Play

A living index of AI adoption across industries — where established practice meets the bleeding edge
UPDATED DAILY

The AI landscape doesn't move in one direction — it lurches. Some techniques leap from experiment to table stakes in a single quarter; others stall against regulatory walls, technical ceilings, or organisational inertia that no amount of hype can dislodge. Knowing which is which is the hard part. The State of Play cuts through the noise with a rigorously maintained index of AI techniques across every major business domain — classified by maturity, evidenced by real-world adoption, and updated daily so you always know where you stand relative to the field. Stop guessing. Start knowing.

The Daily Dispatch

A daily newsletter distilling the past two weeks of movement in a domain or two — delivered to your inbox while the index updates in the background.

AI Maturity by Domain

[Chart: one dot per domain marking the weighted maturity of its practices, on an axis running from Bleeding Edge to Established.]

Audio production — editing, podcasts & sound design

LEADING EDGE

TRAJECTORY

Advancing

AI that removes noise, enhances audio quality, automates podcast production workflows, and generates sound effects and sound design elements. Includes automated mastering and AI-assisted foley; distinct from music generation, which creates melodic and harmonic compositions.

OVERVIEW

AI-powered audio production and editing is mature and mainstream, with practical adoption across podcast and content production workflows, yet professional, quality-critical work remains fundamentally hybrid. The practice encompasses noise reduction, dialogue cleaning, podcast production automation, sound effect generation, and AI-assisted mastering and foley. As of late May 2026, a core tension persists: AI excels at specific technical tasks (noise removal, transcription at 90%+ accuracy, loudness correction) and delivers measurable efficiency gains (50-80% time reduction in routine editing), but cannot replicate creative judgment, emotional understanding, context awareness, or performance authenticity. Consumer testing documents the gap: NYU focus groups rated AI-generated podcasts 2.3/5 despite their technical viability, describing them as synthetic and hollow and declining to listen further. Platform responses signal ecosystem maturity: Spotify introduced verified badges and policies targeting unauthorized AI voice cloning, recognizing that podcasting depends on creator-audience trust in a way music does not. Professional and practitioner consensus has settled on hybrid human-AI workflows as the sustainable model; full automation in audio production is neither operationally viable nor economically rational for quality-critical content.

CURRENT LANDSCAPE

By May 2026, AI audio production had consolidated as production-scale infrastructure, with segmented adoption patterns and emerging challenges around authenticity. The AI-in-podcasting market grew from $4.06B (2025) to $5.36B (2026), a 32% CAGR, and is projected to reach $16.12B by 2030, driven by major vendor investment (Spotify, Adobe, Acast, Descript, Podbean, Riverside.fm). The largest signal: Amazon launched Alexa+ AI podcast generation on May 18, 2026. Drawing on 200+ licensed newsroom partners (including AP, Reuters, Washington Post, Time, and Forbes), it produces finished episodes in minutes, with two synthetic co-hosts reading from real journalism, which is evidence of platform-scale full-automation capability in podcasting.

Professional survey data (1,200+ music/audio creators, 70%+ with 10+ years of experience) established a baseline: 20% are regular AI users, 50% are experimenting, and fewer than 20% have no interest. Efficiency remains the primary perceived benefit, while creativity concerns (33%), ethics (30%), and quality doubts (27%) persist as adoption barriers.

April 2026 case studies validated the efficiency metrics: independent production testing measured 55 hours of editing time saved across 22 client projects (2.5 hours per 10-minute interview), and a corporate podcast deployment documented a 75% time reduction (4 hours to 50 minutes per 30-minute episode). In professional audio post-production, LA and London studios automated Pro Tools session-building workflows, reducing setup from several hours to 30 minutes per episode, and a European localization company achieved 40% time savings on international release preparation. Creator adoption remains strong: 86% of creators surveyed (16,000+ across 8 countries) now use generative AI, with Descript identified as the leading podcast-specific tool.

Platform responses signal ecosystem maturity around authenticity risks: Spotify implemented verified badges, banned unauthorized AI voice cloning, and removed 75M+ spam tracks, indicating that platforms recognize podcast authenticity differs from music because creator-audience voice recognition drives podcast loyalty. Practitioner consensus: AI excels at mechanics (noise removal, 90%+ transcription accuracy, dead-air trimming, filler detection) but cannot handle pacing, comedic timing, or emotional context. Peer-reviewed research (Garcia & Reiss, May 2026) documents practitioner preference for task-specific assistive tools over generative systems, with AI effective for podcasts and fast-consumption media but insufficient for high-end sound design. Full automation remains neither operationally viable nor economically rational for quality-critical content; hybrid human-AI workflows remain dominant across all segments.

TIER HISTORY

Research: Jan-2019 → Jan-2019
Bleeding Edge: Jan-2019 → Jan-2023
Leading Edge: Jan-2023 → present

EVIDENCE (133)

— Technical distinction: only RoEx's Automix handles true multi-track mixing from stems; LANDR offers stereo mastering only. Establishes capability segmentation by production scale, signaling market maturity differentiation beyond one-size-fits-all mastering.

— IMARC podcasting market projection: $28.2B (2025) → $191.3B (2034) at 23.71% CAGR. Specific AI impacts: transcription reduces post-production time by 60-70%, voice translation preserves speaker characteristics across languages, and algorithmic personalization is reshaping discovery.

— Professional podcast production agency (350+ shows since 2013) documents market segmentation and AI tool positioning: 'The edits are mechanical, not editorial. Useful as part of a workflow, weak as the whole workflow'—critical negative signal on full automation.

— Survey of 16,000+ creators (8 countries) shows 87% report AI accelerates growth, 75% rate AI as integrated/essential; critical limitation: 57% say outputs require moderate/extensive editing before sharing—confirms operational maturity with quality ceiling.

— Survey of 3,000 professional creators shows 94% already use AI, 72% plan increased usage; 83% believe human-made sound creates stronger emotional connections than AI alternatives—signals adoption scale with authenticity preference ceiling.

— Six-stage audio processing pipeline analysis: capture, cleanup, recognition, diarization (where consumer tools 'quietly fail'), structuring (actionable artifacts vs. paraphrase summaries), indexing. Framework clarifies adoption maturity and identifies diarization as persistent bottleneck in podcast workflows.

— Comprehensive guide covering fully automated generation (Jellypod, Wondercraft) and post-production (PodcastAI, Adobe Podcast v3, ElevenLabs). Adobe Podcast v3 'Room Modeling' feature addresses prior over-processing criticism—signals ecosystem responding to quality feedback.

— Descript product velocity: 70 tickets shipped in 48 hrs, Tone Tags for ElevenLabs v3, Underlord Opus 4.8 co-editor, and MCP connectors live in Claude/ChatGPT. Signals continued ecosystem integration and agentic assistant advancement.

HISTORY

  • 2019: Foundational research on noise reduction and speech enhancement published; Spotify launches Soundtrap for Storytellers with AI-assisted podcast editing; professional tools like Audionamix IDC see deployment in audio post-production for dialogue cleaning and audio extraction.

  • 2020: Academic research formalizes domain benchmarks (INTERSPEECH DNS Challenge, NeurIPS speech denoising); Descript adoption grows among independent podcasters; COVID-19 remote recording surge accelerates demand for audio cleanup tools. Persistent challenge: synthetic-data models degrade on real recordings, limiting professional engineer trust.

  • 2021: Cloud-based tools reach critical adoption mass with iZotope, LANDR, and Descript Studio Sound driving marketplace competition. Audionamix IDC v1.5 expands platform support (VST, AU, AAX) for professional post-production. Practitioner testing confirms AI software outperforms hardware for podcast cleanup. Deployment ceiling remains defined by ML model generalization failures on real-world audio, documented in research literature.

  • 2022-H1: LANDR evolves into full creator ecosystem with All Access Plan (AI mastering, 150+ distribution, sample library, DAW plugins). ICASSP 2022 Deep Noise Suppression Challenge expands to fullband datasets and mobile scenarios. RTC platforms (Agora, RongCloud) advance AI noise reduction techniques for transient noise. Professional skepticism persists: mastering studios view AI as market-expanding complement, not replacement; core tension shifts from capability to workflow positioning.

  • 2022-H2: No significant new product launches or major research publications documented; market consolidation and refinement of existing tool ecosystems continued with minimal paradigm shifts.

  • 2023-H1: Audionamix deploys AudioShake's AI for professional film/TV audio separation, validating industrial deployment. Critical practitioner feedback emerges: podcast producers and audio engineers document inconsistent results, robotic artifacts, and missed nuances in fully automated tools, reinforcing requirement for human validation. Core tension crystallizes: AI handles early-stage cleanup and separation efficiently, but professional quality assurance and creative decisions remain human domain.

  • 2023-H2: LANDR Mastering Plugin reaches production with 86% user satisfaction and industry recognition ("most innovative plugin of 2023"); Sound on Sound validates consistent performance across genres. Marketing AI Institute documents 20x ROI with Descript (4.8k to 100k downloads, 3-4hr editing time reduction). Practitioner skepticism remains sharp: EDM Sauce reviews document LANDR's over-compression and dynamic loss on acoustic music; audio engineer Michael Wynne argues "AI mastering just doesn't work" for competitive professional output. Market segmentation solidifies: podcast production shows highest adoption velocity with clear efficiency gains; mastering remains contested due to persistent quality concerns and professional skepticism of full automation.

  • 2024-Q1: Independent podcast creators report 50% production time reduction with AI-driven workflows (CastMagic, Descript, Riverside.fm automation). MusicRadar and blind mastering tests confirm LANDR plugin quality but document limitations vs. professional engineers. Ohio State research advances speech enhancement with perceptual learning. The startup ai-coustics emerges with €1.9M funding and 5 enterprise customers, signaling continued innovation. Professional skepticism persists: forum discussions and audio engineer assessments show blind tests favour human mastering, with AI rated "good enough" for demos and quick uploads only.

  • 2024-Q2: Market signals strengthen: Nielsen reports podcasts capture 20% of daily ad-supported audio (driving production demand), and market analysts forecast $125.8B AI audio market by 2031. LANDR mobile app launches with AudioShake stem separation; Soundtrap deploys at scale in schools (Ames HS case study). Professional podcast editors document active adoption of Adobe Enhance Speech and Supertone Clear in production; consumer app reviews show practical limitations in export workflows. Educational deployment expands, signaling audience broadening beyond independent creators. Professional mastering skepticism remains: AI rated "good enough" for demos, not final deliverables.

  • 2024-Q3: ai-coustics and Supertone secure funding; BosePark Productions (200+ podcast shows on Spotify/Audible) deploys ai-coustics for automatic audio enhancement across remote guest recordings, validating professional deployment. Practitioner feedback crystallizes core limitation: AI tools handle foundational cleanup and noise removal with proven efficiency (50% time reduction), but emotional tone, context, and accent understanding remain human domain. Adoption barriers persist—mechanical-sounding edits and lack of nuance prevent full automation in quality-critical workflows.

  • 2024-Q4: Market consolidation and maturation signal: 40% year-over-year AI mastering adoption in North America and 200% tool growth since 2020. RoEx study of 200,000 DIY tracks reveals quality ceiling—80% exceed Spotify loudness standards, 57% clip, showing broader adoption outpacing improvement. MASV hands-on testing of 7 cleanup tools confirms effectiveness with caveats on over-processing artifacts. Podcast editors adopt Studio Sound and Enhance Speech with documented efficiency (10x gains on editing tasks), but professional skepticism hardens on transcription (documented inaccuracies) and mastering (rated "good enough for demos only"). Practitioners converge on AI as task-specific tool in hybrid workflows, not replacement. Core tension remains unchanged: efficiency proven, but quality-critical decisions and creative work remain human domain.

  • 2025-Q1: Market growth accelerates: the AI-in-podcasting market is projected to reach $12.25B by 2029 (31.8% CAGR), with the podcast production services market at $171.84M and growing to $494.14M by 2032. Practitioner deployments scale: podcast strategists managing 40+ shows achieve 50% editing time reduction using Riverside, Opus Clip, and CastMagic automation. Ecosystem maturity is evident in curated tool guides showing AI-assisted editing (Descript, Auphonic) as standard practice. Mixed signals persist: independent reviews acknowledge practical utility for tight deadlines and efficiency, but professional expertise and creative decision-making remain non-replicable. Critical practitioner guidance emphasizes that AI excels at technical tasks (noise reduction, voice clarity) but struggles with context-aware and creative work, reinforcing hybrid human-AI workflow positioning.

  • 2025-Q2: Professional adoption accelerates sharply: 78% of professional podcasters now use AI tools (up from 34% in 2023), with named case studies documenting measurable outcomes—Relu Consultancy produced 300 dynamic podcasts in 3 months using AI, achieving 52% retention increase and 79% click-through improvement. Podcast dialogue editing reduced from 3-5 hours to under 30 minutes per hour of content with AI cleanup; 66% of AI-using creators report quality improvement. Listener adoption grows to 57% using AI-powered features; market trajectory remains strong with 28.3% CAGR in AI podcasting tools. Critical signal: generic mastering tools show limitations, driving emergence of specialized AI models for genre-specific workflows (Valkyrie AI competing with LANDR on hip-hop/rap specificity). Practitioner warnings persist on over-automation risks and robotic artifacts, affirming that quality-critical and creative work remain human-centric domains. Podcast production and audio cleanup solidify as the leading adoption category with clearest ROI; mastering matures toward hybrid model with emerging specialization.

  • 2025-Q3: Listener resistance to AI-generated voices solidifies as significant adoption barrier: 47% of podcast listeners would reject favorite shows with AI voice replacements (Sounds Profitable survey), with strongest resistance among highly educated audiences. Creator-side adoption remains cautious: 25% actively using AI tools with 58% open to experimentation, but 16% avoid due to authenticity concerns. Practitioner assessments document persistent quality limitations—AI editing remains fast but mediocre despite vendor promises, reinforcing that technical execution (90%+ transcription accuracy, noise removal) works well while creative and editorial decisions require human judgment. Educational deployment continues expanding with high school and K-8 adoption of Soundtrap. Mastering services market accelerates digitization with AI reducing turnaround from days to minutes, though professional skepticism on sound quality persists. By Q3 2025, practice remains in leading-edge tier with clear segmentation: podcast production/audio cleanup show strongest adoption and ROI metrics; mastering and voice generation show significant listener/practitioner resistance; hybrid human-AI workflows remain dominant across all use cases.

  • 2026-Jan: Practitioner consensus solidifies around hybrid human-AI workflows as essential in audio production. Industry reports confirm AI tools are "no longer experimental" in podcast production, with adoption accelerating in transcription, editing, and repurposing. However, critical voices persist: audio engineers document AI limitations in mixing (lacks creative intent, revision consistency), mastering (genre-aware quality ceilings), and editorial judgment. Broader AI adoption research shows 95% of GenAI pilots lack measurable ROI, reinforcing that full automation is operationally unviable for quality-critical content. LANDR mastering noted to deliver 90% quality for beat-driven genres but fails on dynamically complex music. Practitioner assessments from experienced editors emphasize that AI handles noise removal and transcription effectively (90%+ accuracy) but cannot replace emotional context and audience awareness. Mastering turnaround improves (days to minutes) but professional skepticism hardens on creative quality. By January 2026, leading-edge status affirmed by strong adoption in podcast workflows combined with realistic understanding of AI's complementary rather than replacement role.

  • 2026-Feb: Producer adoption survey (1,100+ creators) shows 35% using AI for production tasks (mastering, stem separation), with 46% concerned about loss of originality and ethical training data issues. Parallel artist survey reports 87% of creators using AI in workflows, 79% for technical audio tasks with year-over-year tool adoption acceleration. Spotify's AI podcast production system shows production scale: reduces podcast production from 48+ hours to under 2 hours, processes 200k devices monthly, creates subtitles in 26 languages automatically, achieving 52% cost reduction and 4M hours audio in Q1. Critical assessment emerges: LANDR mastering plugin documentation shows effectiveness but notes problematic subscription model and bias toward specific mixing decisions. Platform integration accelerates: RSS.com API enables end-to-end workflow automation with Auphonic audio cleaning and PodFlowStudio marketing content generation. Data compilation shows 60% of musicians using AI in production, 35% for mastering and stem separation, with strong adoption signals offset by 77% fearing AI devaluation of human-made music. By February 2026, practice remains in leading-edge tier with clear evidence of production-scale deployment (Spotify case study), strong creator adoption metrics, but persistent limitations in creative judgment and genre-specific mastery requiring specialist models.

  • 2026-Mar–Apr: Professional adoption survey (1,200+ music/audio creators, 70%+ with 10+ years experience) establishes baseline: 20% regular AI users, 50% experimenting, <20% with no interest; creativity concerns (33%), ethics (30%), and quality (27%) emerge as top adoption barriers. Consumer adoption accelerates: podcast consumption reaches all-time highs (80% ever listened, 58% monthly, 45% weekly), and AI users show 87% audio engagement vs. 61% for non-users. A key listener resistance signal emerges: 48% of audio-first podcast listeners would reduce listening if AI-generated voices were detected (Sounds Profitable), identifying a harder adoption ceiling than previous surveys. Practitioner benchmarking documents 43% editing time reduction in AI-assisted workflows with genre-specific performance variation (highly effective on beat-driven content, struggling on acoustic material); B2B adoption metrics show AI-powered editing reducing enterprise podcast production time by 70%, with 50% of B2B marketers increasing podcast investment. Production case studies reinforce efficiency gains: 75% time reduction per episode documented in corporate podcast workflows. Descript's text-based editing is assessed as commoditized (matched by Premiere, Final Cut, CapCut), with Studio Sound emerging as the genuine differentiator. AI mastering establishes clear adoption boundaries: 80-90% of professional quality for pop/EDM genres, insufficient for classical/jazz or major-label releases, with experienced mastering engineers confirming AI cannot understand creative intent or dynamic phrasing. Market projection: $4.06B (2025) → $5.36B (2026) at 32% CAGR. Practitioner consensus is unchanged: AI handles mechanics (90%+ transcription, noise removal, dead air) but cannot engage in pacing, comedic timing, or emotional context; hybrid workflows remain dominant for quality-critical work.

  • 2026-May: Production workflow maturity consolidated around a measurable efficiency benchmark: AI-assisted podcast production cycles compressed from 14 hours (2021) to under 2 hours (2026), with 40% of podcasters (67% of professionals) using AI at 70%+ time savings; Descript's 6M+ creator base (NPR, NYT, HubSpot) confirms mainstream deployment at scale. A peer-reviewed practitioner study (Garcia & Reiss, 76 surveyed, 20 interviewed) found sound designers prefer task-specific assistive tools over generative systems, with AI adequate for podcasts and fast-consumption media but insufficient for narrative-heavy or high-end sound design—the clearest research-grade confirmation of the hybrid ceiling. Spotify introduced verified badges and policies targeting unauthorized AI voice cloning, signaling platform-level recognition that creator-audience trust in podcasting requires authenticity guarantees music streaming did not need. LANDR limitations documented by professional mastering engineers (cannot hear intent, no revision feedback, vinyl-incapable) reinforced that AI mastering remains appropriate for demos and rough cuts but not quality-critical releases.

  • 2026-Jun: Creator adoption data consolidated and ecosystem maturity solidified. Epidemic Sound's survey of 3,000 professional creators establishes near-universal baseline: 94% already using AI in workflows, 72% plan increased usage over 12 months, yet 83% believe human-made sound creates stronger emotional connections than AI alternatives—confirming adoption is mainstream but with persistent authenticity preference ceiling. Adobe's 16,000+ creator survey (8 countries) documents 87% report AI acceleration and 75% rate it as integrated/essential; critical limitation: 57% say AI outputs require moderate-to-extensive editing before use. Descript's June changelog (Tone Tags, Underlord Opus 4.8, MCP integrations in Claude/ChatGPT) signals continued investment in agentic/assistive rather than replacement capabilities. Market trajectory: IMARC projects podcasting $28.2B (2025) growing to $191.3B (2034) at 23.71% CAGR, with transcription reducing post-production 60-70% as the primary AI mechanism. Technical segmentation refined: RoEx distinguishes true multi-track stem mixing from stereo mastering (LANDR's ceiling), showing market differentiation by production scale; Linnk's pipeline analysis identifies diarization—not speech recognition—as the "quiet failure" point in consumer tools on overlapping audio. Podcast Engineers (13+ years, 350+ shows) hardened practitioner consensus: "edits are mechanical, not editorial—useful as part of a workflow, weak as the whole workflow." The AI toolchain has crossed into standard practice for noise removal, filler detection, and loudness compliance; listener trust and editorial judgment remain adoption ceilings full automation cannot overcome.

TOOLS