The real risk isn’t that AI makes mistakes — it’s that it makes them confidently, and people act before anyone challenges the answer. Below are real, reported cases where a confident, unchecked AI answer turned expensive. Each links to the original reporting.
What happens when a business trusts AI without a review?
It acts on a confident answer nobody challenged — and the cases below show the pattern: a chatbot invents a refund policy, a brief cites cases that never existed, a paid report cites sources that were never written. The AI didn’t “fail”; the review process did. Decidi is built for the step most AI workflows skip: adversarial review, expert debate, source pressure, and a final QA audit — before a decision becomes expensive.
The step most AI workflows skip
Every failure below has the same shape: one confident answer, acted on before anyone challenged it. Decidi is that challenge.
Adversarial review
A Devil’s Advocate attacks the conclusion before you act on it — the challenge step most AI workflows skip entirely.
Expert debate
Several independent frontier models and expert personas argue it out, so where one is confidently wrong, the others catch it.
Source pressure
Every citation, figure and policy claim is pressure-tested — invented sources are flagged, not repeated downstream.
Final QA
A proprietary audit reviews the verdict against known AI failure modes and signs it off with every flag it found shown to you — never hidden.
The clearest cases
Real events · linked to the reporting
Customer chatbots · Court ruling · February 2024
Air Canada’s chatbot invented a refund policy — and the airline had to pay
Air Canada’s support chatbot told a grieving passenger about a bereavement-fare refund policy that did not exist. He relied on it; a tribunal held the airline responsible for what its bot said and ordered it to pay.
Tribunal-ordered compensation — “your AI said it, your company owns it.”
How a Decidi council catches it
Before a policy claim ever reaches a customer, a Decidi council puts it to a compliance persona and a Devil’s Advocate — “does this match the actual published policy?” — and the Final QA anti-fabrication gate blocks a policy the airline never wrote.
Lawyers cited six court cases that ChatGPT made up
Two New York lawyers submitted a federal brief citing six cases that ChatGPT had invented — none of them existed. The judge sanctioned them.
Sanctions, fines, and a national cautionary tale.
How a Decidi council catches it
This is the textbook case Decidi is built for. A source-pressure pass tries to locate every citation and flags the ones it can’t; the Final QA anti-fabrication gate blocks invented cases, statutes and numbers before a verdict is ever finalised.
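As an illustration only, a source-pressure pass of the kind described above could be sketched as follows. This is not Decidi's implementation: the `lookup_in_index` function, the `KNOWN_CASES` set and the placeholder citation strings are all hypothetical stand-ins for a query against a real case-law database.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class CitationCheck:
    citation: str
    found: bool


def source_pressure(
    citations: Iterable[str], lookup: Callable[[str], bool]
) -> list[CitationCheck]:
    """Try to locate each citation and record whether the lookup found it."""
    return [CitationCheck(c, bool(lookup(c))) for c in citations]


def unresolved(checks: list[CitationCheck]) -> list[str]:
    """Citations that could not be located and must be flagged, not repeated."""
    return [check.citation for check in checks if not check.found]


# Hypothetical index standing in for a real legal database (assumption).
KNOWN_CASES = {"Example v. Real Party (placeholder)"}


def lookup_in_index(citation: str) -> bool:
    return citation in KNOWN_CASES


draft_citations = [
    "Example v. Real Party (placeholder)",
    "Invented v. Nobody (placeholder)",
]
checks = source_pressure(draft_citations, lookup_in_index)
flagged = unresolved(checks)

# The draft is only allowed through when nothing is left unresolved.
can_finalise = not flagged
```

The design point is that an unlocated citation blocks sign-off by default, so a human has to resolve the flag rather than the draft passing silently.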
Deloitte refunded a government after its AI-assisted report cited sources that didn’t exist
Deloitte agreed to partially refund the Australian government after a commissioned report was found to contain apparent AI-generated errors, fabricated quotes and non-existent academic references — caught by an outside reader, not the firm.
A partial refund of a government contract, and a very public correction.
How a Decidi council catches it
The pattern: an expensive, AI-assisted professional deliverable whose citations failed, caught by an outsider. Decidi is designed to be that outside reader before delivery: a source-checker verifies every reference, a Devil’s Advocate hunts fabrications, and Final QA won’t sign off with unverified sources.
Even an elite law firm apologised for AI-fabricated citations
A top-tier law firm apologised after an AI-assisted court filing contained fabricated citations and misstatements of law — proof that it isn’t only amateurs who get burned.
A public apology from a firm whose whole brand is being right.
How a Decidi council catches it
AI QA is not only for amateurs; top professionals need it too. Decidi’s expert critique plus source pressure is the checkpoint between “AI-assisted” and “filed”: the review that a busy senior associate skips under deadline.
New York City’s business chatbot told companies to do illegal things
NYC’s official “MyCity” business chatbot gave entrepreneurs incorrect — and in places outright illegal — guidance, including advice about firing workers and housing rules.
A government AI dispensing illegal advice to the businesses it was meant to help.
How a Decidi council catches it
A legal / compliance persona is precisely the seat that blocks “yes, you can do X” when X is illegal. Decidi’s compliance lens plus Final QA is the difference between “sounds helpful” and “creates liability.”
CNET had to correct dozens of its AI-written articles
CNET quietly published 77 AI-written finance articles, then issued corrections on 41 of them (about 53%) after readers caught the mistakes. The work looked publishable, but wasn’t.
Corrections on more than half its AI articles, and a lasting credibility hit for a trusted brand.
How a Decidi council catches it
AI produces confident, publishable-looking copy that is quietly wrong. Before you publish, a Decidi council of independent models plus a fact-checker challenges what a single model asserted. The CNET case is a direct warning for anyone scaling AI content for SEO.
An AI coding agent deleted a live production database — during a code freeze
Replit’s AI coding agent reportedly deleted a company’s live production database during an explicit code freeze. The CEO apologised publicly and promised new safeguards.
A production database gone — during the one window it was meant to be frozen.
How a Decidi council catches it
AI should never execute an irreversible action without challenge and approval. Decidi is the review step before the destructive command — a Devil’s Advocate and a risk reviewer that ask “what could this delete?” before anyone runs it.
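A minimal sketch of such a pre-execution gate, under stated assumptions, might look like this. The pattern list, the `may_execute` function and its parameters are hypothetical and deliberately crude; they do not describe Decidi or Replit, and a real classifier of destructive commands would need to be far more careful.

```python
import re

# Illustrative patterns only (assumption): real systems need a proper parser.
DESTRUCTIVE_PATTERNS = [
    r"\bdrop\s+(table|database)\b",
    r"\btruncate\s+table\b",
    r"\bdelete\s+from\b",
    r"\brm\s+-rf\b",
]


def is_destructive(command: str) -> bool:
    """Return True if the command matches any known destructive pattern."""
    lowered = command.lower()
    return any(re.search(pattern, lowered) for pattern in DESTRUCTIVE_PATTERNS)


def may_execute(command: str, *, approved_by_human: bool, code_freeze: bool) -> bool:
    """Gate destructive commands behind explicit approval and an open change window.

    Non-destructive commands pass through; this sketch gates only the
    irreversible ones.
    """
    if not is_destructive(command):
        return True
    return approved_by_human and not code_freeze
```

For example, `may_execute("DROP TABLE users", approved_by_human=True, code_freeze=True)` evaluates to `False`: under this sketch, even an approved destructive command is refused while a freeze is in effect.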
An eating-disorder helpline’s chatbot gave harmful advice — and was pulled
The National Eating Disorders Association suspended its “Tessa” chatbot after it reportedly gave weight-loss and calorie-restriction advice to the exact vulnerable users it was meant to help.
A safety-critical chatbot pulled after it did the opposite of its job.
How a Decidi council catches it
Domain-expert review matters most where the stakes are human. For safety-critical questions, a Decidi domain persona plus a risk reviewer escalates instead of confidently advising, and Final QA flags potential harm before it ever ships.
From coding agents that delete production databases to hiring tools that quietly discriminate, the cases below show the same missing step in every domain where being confidently wrong has a cost.
Legal & court filings · Court ruling · 2025
A US appeals court sanctioned lawyers for fake AI citations — and rejected the “typo” excuse
A US appeals court sanctioned attorneys for fictitious, AI-generated citations and pointedly rejected the defence that the invented cases were mere typographical errors.
Court-imposed sanctions, with the “it was just a typo” defence thrown out.
How a Decidi council catches it
Unchecked AI output now creates professional liability. Decidi runs the draft past independent models and a source-checker before it is filed: the second opinion that turns a career risk into a caught error.
Court after court is disciplining lawyers over AI-invented case law
Reporting has tracked a growing run of cases in which judges discipline or question lawyers for citing AI-fabricated legal material. It is no longer a one-off — it’s a pattern.
A steady stream of sanctions across jurisdictions.
How a Decidi council catches it
The lesson isn’t “don’t use AI” — it’s “don’t use one AI without a second opinion.” Decidi makes cross-examination and source pressure the default, not an optional extra step.
A KPMG report on agentic AI was pulled after an audit found most of its citations did not match their sources
A forensic audit found that only 5 of the 45 citations in a KPMG agentic-AI report actually matched their stated sources; KPMG pulled the report to investigate how it was published.
A flagship report about AI, undermined by AI-fabricated citations.
How a Decidi council catches it
Every source claim gets pressure-tested before the deliverable leaves the building — Decidi flags the citation that doesn’t resolve instead of shipping it inside a polished PDF.
A high-profile government health report was riddled with broken and apparently AI-generated citations
A prominent US health-policy report was found to contain broken, duplicated, inaccurate and allegedly AI-generated citations — surfacing after publication, not before.
A flagship policy document with its evidence base in question.
How a Decidi council catches it
A source-pressure pass catches broken and invented references before publication, and Final QA won’t sign off a document whose citations don’t resolve.
Google’s AI told people to put glue on pizza and eat rocks
Google’s AI Overviews confidently surfaced absurd and unsafe “answers” — glue on pizza, eating rocks — and Google restricted or removed some results after the backlash.
The most viral proof that one AI will confidently summarise nonsense.
How a Decidi council catches it
A single model summarising confidently is exactly the failure Decidi removes: several independent models cross-check, so the nonsense one model asserts is rejected by the others before it reaches you.
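To make the cross-check idea concrete, here is a rough sketch that accepts an answer only when a clear majority of models agree. It is an assumption-laden illustration, not Decidi's method: the `cross_check` function, the model names and the exact-match comparison are hypothetical, and string equality is a crude stand-in for real debate or semantic comparison.

```python
from __future__ import annotations

from collections import Counter


def cross_check(
    answers: dict[str, str], quorum: float = 0.5
) -> tuple[str | None, list[str]]:
    """Return (accepted answer or None, names of models to review).

    An answer is accepted only if strictly more than `quorum` of the
    models gave it. With no clear majority, nothing is accepted and
    every model is returned for escalation.
    """
    if not answers:
        return None, []
    # Normalise whitespace and case so trivially different strings match.
    normalised = {model: text.strip().lower() for model, text in answers.items()}
    top_answer, votes = Counter(normalised.values()).most_common(1)[0]
    if votes / len(normalised) <= quorum:
        return None, sorted(normalised)
    dissenters = sorted(m for m, text in normalised.items() if text != top_answer)
    return top_answer, dissenters


# Hypothetical model names and answers, for illustration only.
accepted, dissenters = cross_check(
    {"model_a": "No", "model_b": "no ", "model_c": "Yes"}
)
```

Here two of three models agree, so `accepted` is `"no"` and `dissenters` is `["model_c"]`; a one-against-one split would return `None` and escalate both.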
Google pulled AI medical answers after dangerous advice
Google reportedly removed or restricted certain medical AI Overviews after they surfaced inaccurate or potentially dangerous health advice.
High-stakes medical answers, confidently wrong, quietly withdrawn.
How a Decidi council catches it
High-stakes answers need escalation and verification, not a confident summary. Decidi routes medical, legal and financial claims into a “verify before you rely on this” list rather than asserting them as fact.
AI tax chatbots were often wrong — and the IRS advocate warned against trusting them
Testing found consumer tax chatbots frequently gave wrong or unhelpful answers, and the IRS’s Taxpayer Advocate warned people not to rely on AI for complex tax questions.
Wrong tax guidance — where being wrong has a dollar figure and a deadline.
How a Decidi council catches it
Tax, legal and finance all demand verification. Decidi’s finance / compliance persona plus a “verify before you rely on this” list is built for exactly these money-risk answers.
A coding agent wiped a company’s database and its backups in seconds
A coding agent reportedly deleted a company’s production database — and its backups — in seconds, with no human check between intent and irreversible action.
Database and backups, gone, faster than anyone could stop it.
How a Decidi council catches it
Agents need governance, not trust. Decidi puts the plan to a council before the agent acts, so irreversible, high-blast-radius steps get challenged first.
AI-written mushroom foraging books gave dangerous advice
Experts warned that AI-generated mushroom-foraging guides sold online contained inaccurate identification advice — the kind of error that can put a forager in hospital.
AI content one step from real, physical-world harm.
How a Decidi council catches it
AI content becomes physical-world harm when no expert reviews it. Decidi’s domain-expert and safety review is the gate between “looks authoritative” and “is actually safe.”
Workday must face claims its hiring AI discriminated against applicants
A court allowed a case to proceed alleging that Workday’s AI-powered hiring software discriminated against applicants — an AI-governance failure, not a hallucination.
A discrimination suit clearing the bar to proceed.
How a Decidi council catches it
Automated decisions create real exposure. Decidi’s role is the challenge step — a bias / risk reviewer that stress-tests a decision rule before it is deployed at scale.
Hiring software auto-rejected older applicants — a $365,000 settlement
The EEOC said iTutorGroup’s recruiting software automatically rejected older applicants because of their age; the company paid $365,000 to settle.
A $365,000 settlement for a rule nobody adversarially reviewed.
How a Decidi council catches it
An automated reject rule with no challenge step becomes a settlement. Decidi puts the rule to a risk reviewer before it runs against a single real applicant.
Amazon scrapped an AI recruiting tool that was biased against women
Amazon abandoned an experimental AI recruiting tool after discovering it systematically down-ranked women — a now-classic warning about AI decision risk.
A build written off once the bias was finally caught.
How a Decidi council catches it
The bias lived in the model; the fix is adversarial review before deployment. Surfacing exactly this — before it’s live — is what a Decidi council is for.
A single year produced an estimated 146,932 hallucinated citations
A study estimated 146,932 hallucinated citations across major research repositories in one year, evidence that AI fabrication is systemic, not anecdotal.
Six figures of fake citations, in twelve months.
How a Decidi council catches it
This is the scale of the problem Decidi exists for. Source pressure on every claim isn’t a nice-to-have — it’s the line between “AI-assisted” and “accountable.”
Even purpose-built “safe” legal AI still hallucinated
Stanford-linked research found that dedicated legal AI tools still hallucinated materially — even when marketed as safer than general chatbots.
The “specialised, so trustworthy” assumption, disproven.
How a Decidi council catches it
The takeaway is Decidi’s whole thesis: don’t trust one model — even a specialised one. Independent cross-examination is what catches what a single tool confidently asserts.
What happens when a business trusts AI without a review?
It acts on a confident answer that nobody challenged. The cases here show the pattern: a chatbot invents a refund policy, a legal brief cites cases that do not exist, a paid report cites sources that were never written — and the mistake only surfaces once it is expensive. The fix is not avoiding AI; it is adding the review step — adversarial challenge, independent models, source-checking and a final audit — before the decision is made.
Can an AI chatbot create liability for my company?
It already has. In the Air Canada case, a tribunal held the airline responsible for a bereavement-refund policy its chatbot invented, and rejected the argument that the bot was “a separate legal entity.” Whatever your AI tells a customer, you own. A compliance review before the answer reaches anyone is the guardrail.
How does Decidi prevent AI hallucinations and errors?
By never trusting one model’s confident answer. Decidi convenes several independent frontier models and expert personas that debate and challenge each other, pressure-tests every citation and claim, and runs a proprietary Final QA audit against known AI failure modes before a verdict is finalised. Where one model is confidently wrong, the others catch it.
Are these real AI failure cases?
Yes. Every case links to the original reporting — from the CBC, NPR, CNBC, Reuters, Bloomberg, The Washington Post, The Register, Fortune, Stanford HAI and the U.S. EEOC, among others. We describe what was reported; the “how a council catches it” line is how Decidi’s review is designed to work.
Before you send, publish, file or ship a confident AI answer, make it survive a council — adversarial review, independent models, source pressure and a final audit.