The Accountability Gap: When AI Delegation Meets Human Responsibility

Since 2023, a pattern has emerged across industries that reveals a systematic accountability gap. In June 2023, New York lawyers were sanctioned for submitting legal briefs citing six entirely fabricated court cases generated by ChatGPT. Throughout 2023, CNET required corrections on 41 of 77 AI-generated articles, whilst Sports Illustrated was exposed for publishing product reviews under fake author identities with AI-generated profiles. In February 2024, Air Canada was held liable when its chatbot provided incorrect bereavement fare information. The tribunal rejected the airline’s remarkable claim that the chatbot was “a separate legal entity responsible for its own actions.” And in October 2025, Deloitte Australia refunded the government for a report containing fabricated academic citations, non-existent references, and fake judicial quotes: errors that one Senator described as “the kinds of things that a first-year university student would be in deep trouble for.”
What began as isolated incidents has become systematic, and the gap continues to widen: dedicated trackers have logged over 120 documented court decisions worldwide involving AI-hallucinated citations, with broader incident databases exceeding 500 global cases. Researchers note a shift from two cases a week to two or three a day by spring 2025. In November 2025, a California prosecutor blamed AI hallucinations for errors in a criminal firearms case. Courts are showing growing impatience, with judges declaring that monetary sanctions are proving ineffective at deterring false AI-generated statements.
These are not merely isolated technological failures. They reveal a governance gap: organisations delegating work to AI without establishing the verification capability to ensure accountability for outputs. Research from Tenable found that 63% of organisations lack any AI governance policies, and Gartner predicts AI regulatory violations will result in a 30% increase in legal disputes for tech companies by 2028. This pattern exposes a critical misunderstanding about AI deployment. McKinsey’s latest research shows 88% of organisations now using AI in at least one function — up from 78% earlier in 2025 — yet Boards are approving this accelerating adoption for legitimate efficiency objectives whilst failing to recognise that delegation of work does not equal delegation of accountability.
Understanding AI Accountability
At both individual and organisational levels, a common misconception has emerged: if AI produces the work, someone else is accountable for its quality. Individuals distance themselves from AI-generated outputs, whilst organisations treat efficiency gains as automatic justification for headcount reduction. Both stem from the same flawed assumption: that delegation of work equals delegation of accountability.
The most useful way to understand AI’s role is to think of it as a brilliant graduate trainee with remarkable productivity but limited real-world experience, requiring expert supervision. When organisations hire talented junior staff, they expect high productivity producing large volumes of work quickly. But they also expect limited contextual judgement and domain expertise, with senior expert review taking place before the work reaches clients. Someone must check the homework before it’s handed in.
AI has precisely the same profile. It produces work at extraordinary speed but lacks contextual judgement; it requires expert supervision to verify its outputs and quality assurance before delivery to clients or stakeholders. Yet organisations are deploying AI at scale whilst reducing the senior expertise needed to validate it.
Legal systems, regulatory frameworks, and commercial relationships consistently reject the “AI did it” argument as a valid defence. Courts have made clear that organisations remain responsible for all information they provide, regardless of whether it originates from static content or AI systems. Lawyers cannot escape responsibility for citations by pointing to their tools. Consultancies cannot disclaim accountability for fabricated references in their deliverables.
The pattern is clear across the documented cases now accelerating globally: courts, regulators, and clients hold humans accountable for AI-assisted work. The technology might be new, but the accountability framework remains unchanged.
In my previous article on agentic AI, I explored how organisations are transferring agency for decision-making from humans to AI systems. This article addresses the governance corollary: while agency can be transferred, accountability cannot. Someone must remain answerable for outcomes, and that someone is invariably human.
Why AI Amplifies Expertise Needs
Boards face a counterintuitive reality: AI amplifies rather than reduces the need for subject matter expertise. The more you delegate to AI, the more expertise you need to verify its outputs. AI doesn’t eliminate the requirement for domain knowledge; it intensifies it.
Recent evaluations still find substantial hallucination rates even for advanced models. In specialised domains like law, Stanford research found that general-purpose large language models (LLMs) hallucinate on legal queries 58-82% of the time. Yet most organisations report having no formal verification process at all.
This creates a headlong rush where deployment outpaces capability. Consider what verification actually requires across different domains: verifying chatbot responses about fare policies demands expertise in those specific policies. Recognising fabricated legal citations requires experience with actual case law and court procedures. Spotting factual errors in journalism demands editorial judgement built over years of reporting. Validating technical claims in legal filings needs deep understanding of the underlying subject matter.
This expertise doesn’t come from using AI tools. It comes from years of doing the work yourself: conducting literature reviews, verifying citations, checking sources, building pattern recognition about what’s plausible versus what’s fabricated. Even when organisations have institutional AI policies in place, those policies prove insufficient without the domain expertise to apply them effectively.
The common thread across the documented cases isn’t individual organisational failure but systematic underinvestment in the verification capability that AI deployment demands.
The Investment Gap
Whilst Boards approve AI deployment — 88% of organisations now use it in at least one function — they’re not approving corresponding investments in verification capability and expertise development. The Anaconda Developer Survey found that only 34% of organisations have formal policies for AI-assisted coding. Broader governance studies reveal that most organisations lack systematic verification frameworks that scale with AI volume.
And they’re not developing the analytical and critical thinking capability to spot AI errors systematically, despite the persistent risk of hallucination even in domain-specific tools. Instead, many Boards are seeing AI as a way to reduce investment in expertise. Recent research found that 43% of companies plan to replace roles with AI, including 37% targeting entry-level positions and 58% in operations functions. These organisations are simultaneously increasing AI output volume whilst decreasing verification capacity.
This is strategically backwards.
Organisations are racing to deploy AI without pausing to ask: “Who has the expertise to verify these outputs?” and “How are we investing in their capability?” Just as senior professionals check the work of junior staff before client delivery, organisations need senior experts checking AI outputs. But if you’re reducing senior expertise whilst increasing AI-generated volume, the mathematics don’t work.
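To make that capacity arithmetic concrete, the sketch below models it with purely hypothetical figures (the output volumes, review times, and reviewer numbers are illustrative assumptions, not benchmarks). It shows how spare review capacity flips into a substantial backlog of unreviewed work when AI output triples whilst the senior bench shrinks.

```python
# A minimal sketch of the verification arithmetic described above.
# All figures are hypothetical and chosen only to illustrate the shape of the problem.

def verification_gap(ai_outputs_per_week: int,
                     review_minutes_per_output: float,
                     senior_reviewers: int,
                     review_hours_per_reviewer_per_week: float) -> float:
    """Return the weekly shortfall (in hours) between the review time the AI
    output volume demands and the senior review capacity available."""
    demand_hours = ai_outputs_per_week * review_minutes_per_output / 60
    capacity_hours = senior_reviewers * review_hours_per_reviewer_per_week
    return demand_hours - capacity_hours

# Hypothetical scenario: AI triples weekly output while the senior review
# bench is cut from ten reviewers to six.
before = verification_gap(ai_outputs_per_week=200, review_minutes_per_output=20,
                          senior_reviewers=10, review_hours_per_reviewer_per_week=8)
after = verification_gap(ai_outputs_per_week=600, review_minutes_per_output=20,
                         senior_reviewers=6, review_hours_per_reviewer_per_week=8)

print(f"Weekly shortfall before: {before:+.0f} hours")  # -13 hours: spare capacity
print(f"Weekly shortfall after:  {after:+.0f} hours")   # +152 hours: unreviewed work
```

The specific numbers will differ in every organisation, but the direction of travel is the same: output volume scales with the technology, whilst verification capacity scales only with people.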
The competitive opportunity is clear. PwC’s AI Jobs Barometer found that AI-exposed industries achieve 3x revenue growth per employee — but only with skilled oversight. The same research shows a 56% wage premium for AI skills, demonstrating that the market rewards verification capability. Without it, errors proliferate and potential gains vanish.
The Pipeline Nobody’s Protecting
The accountability gap has an even more insidious dimension that Boards aren’t seeing: the destruction of expertise development pipelines. This isn’t theoretical — it’s already measurable in the employment data.
How Expertise Actually Develops
The traditional model across professional services, law, journalism, and consulting has juniors doing foundational work whilst seniors review and correct it. Through repetition, juniors develop domain knowledge, pattern recognition, and contextual judgement. Over time — typically 5-10 years — juniors become the senior experts who train the next generation. The cycle continues, creating sustainable expertise pipelines.
When organisations replace junior roles with AI to capture efficiency gains — and 43% of companies now plan to do exactly this — they face a strategic choice with diverging outcomes.
Pure replacement offers immediate gains: faster work production, lower junior staff costs, improved productivity metrics. But it creates long-term losses: the pipeline creating future senior experts disappears, organisational knowledge development ceases, internal capability building stops, and future verification capacity vanishes.
You don’t develop expertise by reviewing AI outputs. You develop it by doing the work yourself repeatedly. Legal researchers learn to verify citations by conducting legal research hundreds of times. Consultants learn to spot fabricated references by producing reports and having them corrected. Journalists learn what good looks like by writing and reporting, not by editing AI content.
Two Diverging Paths
Organisations face a strategic choice:
- Path A: Pure Replacement means replacing junior roles entirely with AI for maximum short-term efficiency, destroying the expertise development pipeline, and facing talent shortages in 5-10 years when current experts retire. This is already happening: payroll-based research covering the most AI-exposed segments of professional services suggests a 13% decline in AI-exposed entry-level roles across sectors.
- Path B: Strategic Augmentation uses AI to augment junior staff with proper frameworks for effective use, accelerating their learning whilst maintaining expertise pipelines and gaining efficiency. This means providing not just the tools, but also guidance on when and how to apply them, supervision to develop judgement, and structured learning that builds genuine expertise rather than dependence. A BCG experiment with 750 participants found 20-40% productivity gains when AI augmented junior and senior staff, demonstrating that efficiency doesn’t require elimination of the roles that build expertise.
The Long-Term Consequences
Organisations pursuing Path A face consequences that create a capability gap taking a decade or more to repair, if it can be repaired at all. Spending 5-10 years replacing junior roles with AI means forgoing a decade of expertise development that cannot suddenly be recreated. Senior experts who don’t exist because nobody developed those skills cannot be hired from a depleted market. Institutional knowledge that was never created cannot be rebuilt retroactively.
Picture this Board conversation in 2030:
The CEO reports that AI-generated work keeps failing verification and asks for more senior experts. The CHRO responds that they can’t hire them because the talent market doesn’t have them. When asked why, the answer is stark: organisations across the sector replaced junior roles with AI in 2024-2026. Nobody has been developing junior staff for five years. The expertise pipeline dried up across the industry.
This isn’t just an organisational problem, but an industry-wide systemic risk. If entire sectors replace junior roles simultaneously, they collectively destroy the expertise development ecosystem. No individual organisation can solve this alone once the damage is done.
The Strategic Framework Boards Need
This accountability challenge isn’t new territory for Boards: it maps to familiar governance concerns around ownership, measurement, and capability development. The patterns we’re observing reflect challenges I’ve previously identified — particularly around Ethical and Legal Responsibility (where Boards must ensure clear accountability for AI-assisted decisions) and Risk Management (where focusing solely on efficiency metrics whilst ignoring capability degradation creates systematic vulnerabilities).
The remedy requires treating verification as a core capability alongside deployment itself. Organisations cannot sustainably scale AI without simultaneously scaling their ability to ensure output quality. This means emphasising the development of people and expertise, not just tools and technology. From a maturity perspective, early-stage organisations that rush to deploy without establishing verification capability create the systematic failures we’re observing, whilst high-maturity organisations invest in verification alongside deployment, achieving sustainable competitive advantage through governance rather than despite it.
The wrong question is: “Can AI do the work of our junior staff efficiently?”
The right question is: “If we deploy AI here, where will our senior experts come from in 10 years, and who has the expertise to verify outputs today?”
Five Essential Investments
When approving AI deployment, Boards must simultaneously approve:
Verification Capacity — Sufficient subject matter expertise to check AI outputs at the volume being generated, especially given that 63% of organisations currently lack any AI governance policies.
Skills Development — Training programmes specifically for critical evaluation of AI-generated work, recognising that the market rewards this capability with a 56% wage premium.
Pipeline Preservation — Strategic choices about which roles to augment with AI (building expertise) versus which to replace with it (destroying expertise development).
Quality Assurance — Systematic processes for validating AI outputs before delivery to clients or stakeholders.
Accountability Frameworks — Clear ownership and responsibility structures when AI-assisted work fails.
The Essential Questions
Before approving any AI deployment, Boards should require answers to:
“Who has the expertise to verify these AI outputs, given the substantial hallucination rates that persist even in advanced models?”
“How are we investing in their capability?”
“Are we augmenting or replacing our expertise development pipeline?”
“How will we maintain verification capacity at scale?”
“Where will our senior experts come from in 2030?”
The Measurable Divergence
Evidence from 2025 demonstrates this isn’t theoretical, as organisations are already separating into distinct performance categories.
PwC’s AI Jobs Barometer found that AI-exposed industries achieve 3x revenue growth per employee, but only with skilled oversight ensuring output quality. Multiple consulting studies estimate 20-30% efficiency gains when AI is deployed with strong governance frameworks, versus little or no measurable impact where governance is weak.
These high-performing organisations deploy AI safely at scale without the accountability failures now documented in over 120 court cases. They maintain client trust and regulatory compliance. They achieve productivity improvements whilst preserving expertise pipelines. The market pays a 56% wage premium for AI skills — tangible recognition that verification capability creates competitive advantage.
In contrast, organisations investing only in deployment experience the patterns we’re observing: financial penalties, exemplified by Deloitte’s partial refund; abandoned initiatives, joining the 42% of organisations walking away from AI projects due to unaddressed gaps; and regulatory scrutiny that Gartner predicts will increase legal disputes by 30% by 2028.
The Strategic Choice
The failures across legal, media, professional services, and technology aren’t isolated incidents — they’re early warnings of what happens when organisations delegate work to AI without investing in the verification capability needed to ensure accountability. The consequences extend beyond immediate financial penalties to long-term capability erosion that can take a decade or more to repair.
The patterns documented throughout this article reveal two diverging paths. One leads to systematic failure through deployment without verification, pipeline destruction without expertise development, and efficiency gains that vanish into quality failures. The other leads to sustainable competitive advantage through augmentation over replacement, verification capability alongside deployment, and compound value from preserved expertise pipelines.
The divergence between these paths is accelerating, not narrowing. Organisations making the wrong choice today are creating capability gaps that cannot be repaired when the market shifts, when experts retire, when verification crises emerge. Those making the right choice are building advantages that compound over time — advantages their competitors cannot replicate once the talent pipeline has dried up across the industry.
Boards face a defining choice. Those who invest in verification alongside deployment, who choose augmentation over pure replacement, who preserve expertise pipelines whilst capturing efficiency gains will be the organisations that establish dominant competitive positions. The window for making this strategic choice is narrowing, but those who act now are already pulling ahead.
The question isn’t whether to adopt AI. It’s whether to adopt it sustainably, with the verification capability and expertise development that turns delegation into genuine competitive advantage rather than systematic failure.
The strategic choice is yours.
Let's Continue the Conversation
Thank you for reading about the accountability gap between AI delegation and human responsibility. I'd welcome hearing about your board's experience navigating this challenge - whether you're building verification capability alongside AI deployment, wrestling with the strategic choice between augmentation and replacement of junior roles, or discovering ways to preserve expertise pipelines whilst capturing efficiency gains. Perhaps you're finding that the accountability frameworks your organisation needs differ significantly from what traditional governance provides, or you're seeing the competitive divergence play out in your sector as some organisations invest in verification whilst others pursue pure efficiency.
About the Author
Mario Thomas is a Chartered Director and Fellow of the Institute of Directors (IoD) with nearly three decades bridging software engineering, entrepreneurial leadership, and enterprise transformation. As Head of Applied AI & Emerging Technology Strategy at Amazon Web Services (AWS), he defines how AWS equips its global field organisation and clients to accelerate AI adoption and prepare for continuous technological disruption.
An alumnus of the London School of Economics and guest lecturer on the LSE Data Science & AI for Executives programme, Mario partners with Boards and executive teams to build the knowledge, skills, and behaviours needed to scale advanced technologies responsibly. His independently authored frameworks — including the AI Stages of Adoption (AISA), Five Pillars of AI Capability, and Well-Advised — are adopted internationally in enterprise engagements and cited by professional bodies advancing responsible AI adoption, including the IoD.
Mario's work has enabled organisations to move AI from experimentation to enterprise-scale impact, generating measurable business value through systematic governance and strategic adoption of AI, data, and cloud technologies.