Your AI is confidently wrong.

Jun 29, 2026

Claude wrote "Outlook only renders the first GIF frame" into a client's email copy. It read like a sensible technical caveat, the kind a careful developer would add. It was also false: modern Outlook plays GIFs. The line was lifted from a stale doc that stopped being true years ago, and it sat in an approved draft that was one click from going out to the client's full list. Nobody flagged it, because nothing about it looked wrong.

The failure worth understanding is that the model doesn't break loudly. It stays fluent and plausible, and quietly out of date. On a single email you might catch it; at production volume you won't.

Why volume defeats the human skim

Run AI across a real client roster and it generates dozens of factual claims a week: deliverability thresholds, ESP behavior, rendering quirks, benchmark numbers. Most are fine, but a few are stats that were accurate in 2023 and aren't now, hallucinated platform specifics, or numbers pulled from training data that no longer holds. The model has no way to know what your brand has actually verified. It only knows what sounds right, and out-of-date facts sound exactly as right as current ones.

The real danger isn't the hallucination so wrong that any reviewer catches it. It's the plausible-but-retired fact: it reads cleanly, it matches what a lot of people still believe, and it slides past a human skim. Skimming is just pattern-matching against what looks reasonable, and a retired fact looks completely reasonable, which is why you can't eyeball your way past it. The reviewer who approved the GIF line wasn't careless; the claim simply passed every test a human applies at reading speed.

What a claims registry actually is

The fix is to stop relying on the model's memory and the reviewer's recall, and give both a brand-owned source of truth. A claims registry is a single file the AI has to defer to, and it has three sections.

Verified facts: claims confirmed against a named source, with the year and the reference attached, so a fact is never just asserted.

Forbidden claims: statements that are verified false or brand-prohibited, each with an explicit reason, so the same retired fact can't creep back in next quarter.

Pending verification: claims that showed up in generated output but haven't been confirmed yet, parked until someone checks them instead of shipped on optimism.

## Verified facts
- SPF authentication reduces domain spoofing risk
  Source: Cloudflare documentation 2025

## Forbidden claims
- "Outlook only renders the first GIF frame"
  Reason: false, modern Outlook plays GIFs
- "42x email ROI" without a 2025+ citation
  Reason: unverifiable as stated

## Pending verification
- "Gmail clips HTML past 102KB"
  Status: context-dependent, needs scoping

The two entries under forbidden claims aren't facts I'm asserting. They're labeled examples of exactly what the registry exists to block: the GIF line, false; the recycled ROI figure, unverifiable as written. The registry's whole job is to keep those out of generated copy, by name.
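To make the three-section layout concrete, here is a minimal Python sketch that reads a registry written in the format shown above into separate lists. The names `Entry`, `Registry`, and `parse_registry` are hypothetical, and the parser assumes exactly the sample's conventions: `## ` section headings, `- ` entries, and one indented note line per entry. It rejects any section it doesn't recognize instead of guessing.

```python
from dataclasses import dataclass, field

# The sample registry from this post, verbatim.
SAMPLE = """## Verified facts
- SPF authentication reduces domain spoofing risk
  Source: Cloudflare documentation 2025

## Forbidden claims
- "Outlook only renders the first GIF frame"
  Reason: false, modern Outlook plays GIFs
- "42x email ROI" without a 2025+ citation
  Reason: unverifiable as stated

## Pending verification
- "Gmail clips HTML past 102KB"
  Status: context-dependent, needs scoping
"""


@dataclass
class Entry:
    claim: str
    note: str = ""  # the "Source:", "Reason:" or "Status:" line under the claim


@dataclass
class Registry:
    verified: list[Entry] = field(default_factory=list)
    forbidden: list[Entry] = field(default_factory=list)
    pending: list[Entry] = field(default_factory=list)


# Maps a heading in the file to the attribute that stores its entries.
SECTION_FIELDS = {
    "verified facts": "verified",
    "forbidden claims": "forbidden",
    "pending verification": "pending",
}


def parse_registry(text: str) -> Registry:
    """Parse the three-section registry format into a Registry object."""
    registry = Registry()
    current = None  # list for the section currently being read
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            name = line[3:].strip().lower()
            if name not in SECTION_FIELDS:
                raise ValueError(f"unknown registry section: {line[3:]!r}")
            current = getattr(registry, SECTION_FIELDS[name])
        elif line.startswith("- "):
            if current is None:
                raise ValueError("entry found before any section heading")
            current.append(Entry(claim=line[2:].strip()))
        elif current:
            # Any other line is the note for the most recent entry.
            current[-1].note = line
        else:
            raise ValueError(f"note line with no entry above it: {line!r}")
    return registry
```

Calling `parse_registry(SAMPLE)` yields one verified entry, two forbidden entries, and one pending entry, each carrying its source, reason, or status as `note`.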

Two gates, fail-closed at both ends

A registry sitting in a folder changes nothing on its own; it earns its place by wiring into two points of the pipeline. Pre-generation, the registry is injected at the prerequisites layer of the prompt, so the model draws from the verified list and refuses to make claims outside it. The prompt won't run a factual brief without the registry present. That's the first fail-closed gate: no source of truth, no generation.
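The pre-generation gate could look something like the sketch below. It is an illustration, not the actual prompt used here: `build_prompt`, `RegistryMissingError`, the dictionary keys, and the instruction wording are all assumptions. The part that matters is the first check, which raises instead of producing a prompt when the registry is absent or incomplete.

```python
from typing import Optional


class RegistryMissingError(RuntimeError):
    """Raised when a factual brief is attempted without a usable registry."""


def build_prompt(brief: str, registry: Optional[dict]) -> str:
    """Build a generation prompt, refusing to run without the registry.

    `registry` is assumed to be a dict with "verified" and "forbidden"
    lists of claim strings (hypothetical shape).
    """
    # Fail closed: no source of truth, no generation.
    if not registry or "verified" not in registry or "forbidden" not in registry:
        raise RegistryMissingError("claims registry missing or incomplete")

    lines = [
        "PREREQUISITES",
        "Only make factual claims that appear in this verified list:",
    ]
    lines += [f"- {claim}" for claim in registry["verified"]]
    lines.append("Never state any of the following:")
    lines += [f"- {claim}" for claim in registry["forbidden"]]
    lines.append(
        "If the brief needs a claim outside the verified list, "
        "flag it as pending instead of writing it."
    )
    lines += ["", "BRIEF", brief]
    return "\n".join(lines)
```

A prompt instruction only asks the model to comply, so the one hard guarantee in this sketch is the exception: without a registry, no prompt exists to send.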

Post-generation, a validator hook checks the written file against the forbidden list before it queues for send, so any match blocks the file and logs the violation before the model can mark its own work as clean. That's the second fail-closed gate, and it's the one that would have caught the GIF line: the claim was already written and already approved by a human, and the validator would have stopped it at the queue anyway. Two gates, because one is never enough when the cost of a miss is a send to the whole list.
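A minimal sketch of that validator, under stated assumptions: the function names, the `BlockedSendError` exception, and the logger name are hypothetical, and matching is plain substring search after normalizing case, whitespace, and curly quotes. That catches a forbidden line reused verbatim, like the GIF claim, but not a paraphrase of it.

```python
import logging
import re
from typing import Optional

logger = logging.getLogger("claims_validator")  # hypothetical logger name


class BlockedSendError(Exception):
    """Raised when generated copy contains a forbidden claim."""

    def __init__(self, violations: list[str]):
        super().__init__(f"{len(violations)} forbidden claim(s) found")
        self.violations = violations


def normalize(text: str) -> str:
    """Lowercase, straighten curly quotes, and collapse whitespace."""
    text = text.replace("\u201c", '"').replace("\u201d", '"').replace("\u2019", "'")
    return re.sub(r"\s+", " ", text).strip().lower()


def find_violations(copy: str, forbidden: list[str]) -> list[str]:
    """Return every forbidden phrase that appears in the copy."""
    haystack = normalize(copy)
    return [phrase for phrase in forbidden if normalize(phrase) in haystack]


def gate_for_send(copy: str, forbidden: Optional[list[str]]) -> str:
    """Return the copy unchanged if it is clean; otherwise log and block."""
    if forbidden is None:
        # Fail closed: a missing forbidden list is an error, not a pass.
        raise ValueError("forbidden list not loaded; refusing to queue")
    violations = find_violations(copy, forbidden)
    if violations:
        for phrase in violations:
            logger.error("forbidden claim in draft: %s", phrase)
        raise BlockedSendError(violations)
    return copy
```

Because the gate raises rather than returning a flag, a caller that forgets to check the result still can't queue a blocked file.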

This is the registry that gates Remint's own content: a JSON file with named sources and retrieval dates, verified claims with confidence levels and expiry dates, and a list of forbidden-claim patterns that a validator runs on every build before anything ships. The "first GIF frame" line is in it, as a forbidden pattern, so the exact mistake that nearly went to a client can't reach this page either.
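As an illustration of how expiry dates could be enforced, the sketch below uses a hypothetical JSON shape. The field names (`sources`, `retrieved`, `confidence`, `expires`, `forbidden_patterns`) and every date are placeholders invented for the example, not the actual schema or values. A verified claim counts as usable only if its source is listed and its expiry date hasn't passed; everything else is treated as stale and would go back to pending.

```python
import json
from datetime import date

# Hypothetical registry shape; field names and dates are placeholders.
REGISTRY_JSON = """
{
  "sources": {
    "cloudflare-docs": {"name": "Cloudflare documentation", "retrieved": "2025-03-01"}
  },
  "verified": [
    {
      "claim": "SPF authentication reduces domain spoofing risk",
      "source": "cloudflare-docs",
      "confidence": "high",
      "expires": "2026-03-01"
    }
  ],
  "forbidden_patterns": ["first GIF frame"]
}
"""


def split_by_expiry(registry: dict, today: date) -> tuple[list[str], list[str]]:
    """Split verified claims into (usable, stale) as of `today`."""
    usable, stale = [], []
    for item in registry["verified"]:
        has_source = item.get("source") in registry["sources"]
        expires = date.fromisoformat(item["expires"])
        # Fail closed: an unknown source or a passed expiry date means stale.
        if has_source and today < expires:
            usable.append(item["claim"])
        else:
            stale.append(item["claim"])
    return usable, stale


registry = json.loads(REGISTRY_JSON)
```

An expiry date turns "this was true when we checked" into something a build can enforce: once the date passes, the claim drops out of the usable list until someone re-verifies it.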

One registry per client

For an agency running Claude across multiple accounts, the registry isn't shared. Each client carries their own: their verified facts, their brand-specific prohibitions, their pending list. This is the part generic "fact-check your AI" advice ignores. A single house registry guarantees cross-contamination, where one client's approved claim or banished phrase leaks into another client's copy. Separate files per client make that structurally impossible. Facts and prohibitions stay isolated, because what's true and sayable for one brand isn't automatically true and sayable for the next.
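One way to get that isolation is to key every registry lookup on the client and never fall back to a shared file. The sketch below assumes a hypothetical layout of one `claims.json` per client directory; the function names and the ID rule are illustrative, not a description of any particular setup.

```python
import json
import re
from pathlib import Path


def registry_path(root: Path, client_id: str) -> Path:
    """Return the registry path for one client (hypothetical layout)."""
    # Restrict IDs to simple slugs so one client's ID can't point into
    # another client's directory (no "../", no slashes).
    if not re.fullmatch(r"[a-z0-9][a-z0-9-]*", client_id):
        raise ValueError(f"invalid client id: {client_id!r}")
    return root / client_id / "claims.json"


def load_client_registry(root: Path, client_id: str) -> dict:
    """Load exactly one client's registry, with no shared fallback."""
    path = registry_path(root, client_id)
    if not path.is_file():
        # Fail closed: a missing registry is an error, never a reason to
        # borrow a house default or another client's file.
        raise FileNotFoundError(f"no claims registry for client {client_id!r}")
    return json.loads(path.read_text(encoding="utf-8"))
```

With this shape, loading `load_client_registry(root, "acme")` can only ever read `acme/claims.json`, so a claim approved for one brand has no path into another brand's prompt or validator.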

Email is where this lesson gets learned, because email is unforgiving. ESP rendering facts and deliverability thresholds date fast, and a wrong one ships to the entire list at once with no recall. But the mechanism isn't really about email. Any AI workflow that makes factual claims on behalf of a customer needs a brand-owned source of truth sitting between the model and the output: verified facts in and forbidden claims blocked, with pending claims held until someone confirms them. The model supplies fluency, while the registry supplies the part the model can't have, which is knowing what's actually true for this client, this year.
