Why your AI emails don't sound like you

It is not the model. Your voice file was written from memory, not from approved copy. Here is the extraction method that fixes it.

Jun 05, 2026


Every voice setup I have seen that starts with a self-description drifts. The rules are accurate: they describe the voice the client thinks they have. Not the voice their approved emails actually demonstrate. That gap is where first drafts go wrong and revision rounds pile up.

Why Extracted Voice Beats Described Voice

The drift is not a sign that the rules are wrong. It happens because the rules describe the voice you think you have, not the voice your approved emails actually demonstrate.

When you describe your voice in a document ("warm but direct, benefit-led, no corporate jargon"), you are writing in human language about writing behavior. Claude has to interpret that description and translate it into generation behavior. The interpretation step introduces error. "Warm but direct" means something specific in your head and something different when the model generates against it.

When you extract voice rules from approved output, something different happens. The rules come back in terms that describe writing patterns structurally: sentence rhythm, opener structure, recurring phrase patterns. When you paste that file into the next prompt, the model is reading a structural description of the voice, not a human description it has to interpret. From experience, the gap between instruction and output is smaller.

Where Voice Calibration Sits in the Pattern

The extraction prompt is governed by the same four-mechanism wrapper we apply to every prompt we publish: prerequisites (approved samples and brand context), self-validation (every rule cites a sample number), human gate (you approve the VOICE.md before it lands on disk), and improvement proposal (recurring violations from drift detection become operator-committed rule additions).

VOICE.md is then the declared prerequisite of every other prompt in the system. Subject line prompts, segment rewrites, pre-send QA: each one refuses to run unless VOICE.md is provided. The voice file is not optional context. It is a required input enforced at the prerequisites layer.
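
If you assemble prompts from a script rather than pasting by hand, the same refusal can be enforced before the model is ever called. This is a minimal sketch, not the wrapper itself: the helper names `missing_prerequisites` and `gate_message` are hypothetical, and it assumes inputs arrive as a dictionary keyed by file name.

```python
from typing import Optional

def missing_prerequisites(inputs: dict, required=("VOICE.md",)) -> list:
    """Return required inputs that are absent or blank, in declared order."""
    return [name for name in required if not inputs.get(name, "").strip()]

def gate_message(missing: list) -> Optional[str]:
    """Mirror the refusal wording used in the prompts; None means proceed."""
    if not missing:
        return None
    return "Cannot run. Missing: " + ", ".join(missing) + ". Provide and re-run."
```

Passing a longer `required` tuple covers prompts that declare more than one prerequisite file.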

The Two Layers Email Voice Actually Has

The other failure mode in most voice setups is treating all email types as interchangeable. A welcome email and an abandoned cart email are not the same register. The same brand writes them differently: different warmth level, different urgency, different lead style. If you extract voice from five promotional emails and use that file for welcome copy, you get welcome emails written in a promotional register. Structurally correct, tonally wrong.

Email production voice has two distinct layers:

Brand identity layer: what stays consistent across every email type. Core vocabulary, sentence rhythm, prohibited phrases, opener DNA. This is extracted once and lives in the VOICE.md.
Contextual register layer: how far each flow type moves on warmth, urgency, and lead style. Welcome sits at different coordinates than cart recovery for the same brand. This is documented separately in REGISTER.md and referenced per send.

Building both is a one-time setup. Using them is one extra line in every prompt. From experience, the difference shows up in revision rounds: with an extracted voice file, the first draft already needs less back-and-forth on tone.

Extraction Prompt: The Brand Identity Layer, Full Wrapper

Use at least one representative approved email per primary flow type as input. Welcome, abandoned cart, promotional, and re-engagement if you have all four. The prompt extracts what is consistent across all of them: the brand identity. Patterns that only appear in one flow type are register, not identity, and get flagged as conflicts for the register map.

## Prerequisites: required before running

- 5 to 10 approved email samples from this client, at least one per flow type
- A one-paragraph brand context (product, audience, primary offer)

If either is missing, do not proceed. Respond:
"Cannot run. Missing: [list approved emails / brand context as applicable].
Provide and re-run."

## Inputs

Brand context: [PASTE ONE PARAGRAPH ABOUT PRODUCT, AUDIENCE, OFFER]

Approved emails: [PASTE 5-10 APPROVED EMAILS, AT LEAST ONE PER FLOW TYPE,
SEPARATED BY --- BELOW]

## Task

Extract the brand identity layer: what stays consistent across ALL email
types. Write a VOICE.md file using exactly this structure:

---
## Tone
[one sentence: the consistent emotional register across all samples,
citing Sample numbers that demonstrate it]

## Sentence rules
1. [length and rhythm rule, with verbatim example in quotes and Sample number]
2. [paragraph structure rule, with Sample number]
3. [one other consistent structural pattern, with Sample number]

## Opener patterns
- [first-sentence style, with verbatim example in quotes and Sample number]
- [second pattern if present, or "one consistent pattern observed"]

## Recurring phrases
- "[exact phrase found across multiple emails]" (Samples N, N)
- "[exact phrase found across multiple emails]" (Samples N, N)
- "[exact phrase found across multiple emails]" (Samples N, N)

## What this voice never does
- [prohibition derived from consistent absence, with reasoning]
- [prohibition with reasoning]
- [prohibition with reasoning]

## Subject line pattern
[describe the pattern, citing Samples, or write
"not visible in samples" if subject lines not provided]
---

Rules:
- Brand identity only. Skip patterns that appear in only one flow type.
- Use verbatim examples where possible.
- If samples contradict each other on a pattern, write:
  CONFLICT: [what varies]. These belong in the register map, not here.

## Self-validation (run before returning output)

Before returning the VOICE.md, check:

1. Every rule cites at least one specific sample number
2. No rule is contradicted by another rule in the file
3. No subjective adjective ("punchy", "engaging", "conversational")
   appears without an operational test attached
4. The structure matches the schema above exactly
5. CONFLICT lines appear for any pattern that varies across samples

If any check fails, fix internally and re-check. If you cannot satisfy a
check after one attempt, return:
"Validation failed: [rule]. Cannot produce compliant VOICE.md."

## Human gate (before writing to disk)

After returning the VOICE.md block, state:
"Review the extracted VOICE.md above. Type APPLY to write it to disk as
VOICE-[YYYY-MM].md, or REJECT with specific corrections."

Do not write the file until APPLY is received.

## Improvement proposal (optional)

If you noticed a voice pattern in the samples that does not fit the six
sections above, append:

"PROPOSED RULE ADDITION (review before adding to the schema): [one line
describing the section or rule type that should be added]"

Do not modify the schema yourself.

Any CONFLICT line the extraction returns is a signal to document that pattern in the register map. It means the brand genuinely calibrates that behavior by email type, and a single rule would either be too rigid or too vague to be useful.
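
Two of the self-validation checks are mechanical enough to re-run yourself before typing APPLY: whether every rule cites a sample, and which CONFLICT lines need to move to the register map. The sketch below is one way to do that, not part of the prompt. The function name `review_voice_file` is hypothetical, and it assumes each rule sits on a single line that starts with `-` or a number; rules wrapped across lines would need joining first.

```python
import re

SAMPLE_REF = re.compile(r"\bSamples?\s+\d+", re.IGNORECASE)
RULE_LINE = re.compile(r"^\s*(?:-|\d+\.)\s+(.*)$")
CONFLICT = re.compile(r"^\s*(?:-\s+)?CONFLICT:\s*(.+)$")

def review_voice_file(text: str) -> dict:
    """Collect rules with no sample citation and every CONFLICT line."""
    uncited, conflicts = [], []
    for line in text.splitlines():
        conflict = CONFLICT.match(line)
        if conflict:
            # CONFLICT lines are register-map material, not rules
            conflicts.append(conflict.group(1).strip())
            continue
        rule = RULE_LINE.match(line)
        if rule and not SAMPLE_REF.search(rule.group(1)):
            uncited.append(rule.group(1).strip())
    return {"uncited": uncited, "conflicts": conflicts}
```

An empty `uncited` list does not mean the file is good, only that the citation check passed. The human gate still decides.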

The Register Map, Same Wrapper

Once VOICE.md is committed, run a second prompt against the same samples to derive the register coordinates per flow type. The register map does not need to be regenerated frequently. Only when the program adds a new flow type or when client feedback flags a specific flow as tonally off.

## Prerequisites

- VOICE.md (the brand identity file from the extraction prompt above)
- The same email samples used for the extraction
- A list of flow types in the program

If any is missing, do not proceed. Respond:
"Cannot run. Missing: [list]. Provide and re-run."

## Inputs

VOICE.md: [PASTE VOICE.md CONTENTS HERE]

Email samples: [PASTE THE SAME SAMPLES USED FOR EXTRACTION]

Flow types in program: [LIST EVERY FLOW TYPE, ONE PER LINE]

## Task

For each flow type, identify register coordinates:

- Warmth level: high / medium / low, with one verbatim example sentence
- Urgency level: high / medium / low, with one verbatim example sentence
- Lead style: what the first sentence focuses on
  (benefit / consequence / story / question / direct CTA)
- CTA style: how the call to action is framed
  (soft / direct / urgent / implied)

Format as a table with one row per flow type. Coordinates, not rules.
They describe where this brand sits on each axis for each context.

## Self-validation

Before returning:
1. Every flow type from the input list has a row
2. Every warmth and urgency cell cites a verbatim example sentence
3. No coordinate contradicts the VOICE.md identity layer
4. The table format is consistent

If a flow type has no representative sample, mark its row "INSUFFICIENT
SAMPLE" rather than guessing. Return the table even if some rows are
incomplete.

## Human gate

After returning the table, state:
"Review the REGISTER.md above. Type APPLY to write to disk, or REJECT
with corrections."

## Improvement proposal

If a register dimension keeps coming up that is not warmth/urgency/lead/CTA,
append:
"PROPOSED RULE ADDITION: [one line describing the new dimension]"

The output is a table. Welcome might be high warmth, low urgency, benefit-led, soft CTA. Abandoned cart might be medium warmth, high urgency, consequence-led, direct CTA. The table goes into a REGISTER.md file stored in the same client folder as VOICE.md.
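
Downstream prompts only need one row of that table per send. The sketch below shows one way to pull it out programmatically. It assumes REGISTER.md is stored as a pipe-delimited table with the flow type in the first column; the prompt asks for a table but does not fix the exact layout, so treat the column names and the `parse_register` helper as hypothetical. The example values are the illustrative coordinates from the paragraph above, with the verbatim example sentences left out for brevity.

```python
def parse_register(table_md: str) -> dict:
    """Parse a pipe-delimited table into {flow type: {column: value}}."""
    rows = []
    for line in table_md.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        # Skip the |---|---| separator row under the header
        if all(set(c) <= set("-: ") for c in cells):
            continue
        rows.append(cells)
    if not rows:
        return {}
    header, body = rows[0], rows[1:]
    return {r[0].lower(): dict(zip(header[1:], r[1:])) for r in body}

# Illustrative table only; real cells would also carry example sentences
REGISTER_MD = """
| Flow type | Warmth | Urgency | Lead style | CTA style |
|---|---|---|---|---|
| Welcome | high | low | benefit | soft |
| Abandoned cart | medium | high | consequence | direct |
"""

register = parse_register(REGISTER_MD)
print(register["abandoned cart"])
```

A cell that contains a literal pipe character would break this simple split, so quoted example sentences with pipes need escaping or a different storage format.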

Using Both Files in Every Downstream Prompt

Once VOICE.md and REGISTER.md exist on disk, every other prompt in the system declares both as required prerequisites. The downstream prompts refuse to run without them.

Below is the usage prompt that consumes both files. Notice that the prerequisites block lists VOICE.md and REGISTER.md by name and stops if either is missing.

## Prerequisites: required before running

- VOICE.md for this client
- REGISTER.md for this client
- A brief with all six fields below filled in

If any prerequisite is missing, do not proceed. Respond:
"Cannot run. Missing: [list]. Provide and re-run."

## Inputs

VOICE.md: [PASTE VOICE.md CONTENTS HERE]

Register for this send: [PASTE THE RELEVANT ROW FROM REGISTER.md]

## Brief
Type: [welcome / abandoned cart / promotional / re-engagement]
Position: [e.g. first in a 4-part sequence / third of five]
Audience: [who receives this and what they know about the brand]
Offer: [if applicable]
Goal: [what this email needs to accomplish]
Length: [word count target]

## Task

Write the email in the voice described in VOICE.md, calibrated to the
register coordinates above.

## Output format

Subject: [subject line]
Preview: [preview text, max 90 characters]
Body: [email body only, no greeting, no sign-off]

## Self-validation

Before returning, check the draft against:
1. Every VOICE.md "What this voice never does" rule
2. Register warmth, urgency, lead style, and CTA style match the row
3. Body word count is at or below the Length value specified in Brief

If any check fails, fix internally and re-check. If you cannot satisfy a
check after one attempt, return:
"Validation failed: [rule]. Cannot produce compliant copy."

## Human gate

After returning the draft, state:
"Review the draft above. Type APPLY to queue for send, or REJECT with the
specific revision needed."

Do not move the draft to the send queue until APPLY is received.

## Improvement proposal (optional)

If during writing you noticed a recurring pattern not covered by the
self-validation rules above (e.g. a register coordinate pairing that
consistently produces a structural conflict with VOICE.md), append:

"PROPOSED RULE ADDITION (review before adding): [one line describing the
pattern]"

Do not modify the self-validation rules in this prompt. Only propose.

The register row anchors warmth and urgency before generation starts. Without it, the model calibrates from VOICE.md alone, which describes the brand's tonal center of gravity, not where a specific flow type should sit on the scale. The result is a promotional email written at welcome-register warmth, or a re-engagement email with the urgency level of a cart recovery send.
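
The voice and register checks need a human or a model, but the format checks in the usage prompt are countable: the three output fields, the 90-character preview limit, and the word count against Length. A minimal sketch of re-running those before approving a draft follows. The `check_draft` helper is hypothetical, and it assumes the draft comes back with `Subject:`, `Preview:`, and `Body:` each starting a line, as the output format specifies.

```python
def check_draft(draft: str, max_words: int, max_preview_chars: int = 90) -> list:
    """Return a list of failed checks; an empty list means the draft passes."""
    labels = ("Subject", "Preview", "Body")
    fields = {}
    current = None
    for line in draft.splitlines():
        for label in labels:
            prefix = label + ":"
            if line.startswith(prefix):
                current = label
                fields[label] = line[len(prefix):].strip()
                break
        else:
            # Continuation line: append to whichever field is open
            if current is not None:
                fields[current] += "\n" + line
    failures = []
    for label in labels:
        if not fields.get(label, "").strip():
            failures.append(f"{label} is missing or empty")
    if len(fields.get("Preview", "").strip()) > max_preview_chars:
        failures.append(f"Preview exceeds {max_preview_chars} characters")
    if len(fields.get("Body", "").split()) > max_words:
        failures.append(f"Body exceeds {max_words} words")
    return failures
```

Word counting here is a plain whitespace split, which is close enough for a length target but will not match every editor's count exactly.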

When to Update Each Layer

The two layers update on different triggers.

From experience, AI-assisted programs drift faster than human-authored ones because generation volume is higher. Set a review cadence before you start generating at volume, not after you notice output degrading. For AI-assisted production, we review monthly. For human-authored programs, quarterly is our baseline.

Brand identity layer: update when brand guidelines change, when a major campaign cycle introduces a deliberately different positioning, or when a drift detection run flags consistent violations across multiple flow types. Not every brand refresh requires a full extraction. If only the prohibited phrases changed, edit VOICE.md directly and commit the change.

Register map: update when a specific flow type consistently gets revision notes about tone. If every abandoned cart draft comes back too aggressive, the urgency coordinate for that flow type is wrong. Pull three recent approved sends from that flow, re-run the register prompt for that row only, and commit the updated table.

Keep both files versioned: VOICE-2026-05.md, REGISTER-2026-05.md. When an output starts feeling off and you cannot identify why, the file version is in git history and you can trace it back to the specific file the drafts were generated against. Improvement is governed, reviewable, reversible.
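
The naming scheme is simple enough to script. The sketch below builds a versioned file name from a date and picks the most recent version from a folder listing. Both helper names are hypothetical, and it assumes the `LAYER-YYYY-MM.md` pattern shown above with only the two layers described in this post.

```python
import re
from datetime import date
from typing import Optional

VERSIONED = re.compile(r"^(VOICE|REGISTER)-(\d{4})-(\d{2})\.md$")

def versioned_name(layer: str, on: date) -> str:
    """Build a name such as VOICE-2026-05.md for the given month."""
    return f"{layer}-{on:%Y-%m}.md"

def latest_version(filenames, layer: str) -> Optional[str]:
    """Return the newest versioned file for a layer, or None if there is none."""
    candidates = []
    for name in filenames:
        m = VERSIONED.match(name)
        if m and m.group(1) == layer:
            # Sort key is (year, month) so ordering is chronological
            candidates.append(((int(m.group(2)), int(m.group(3))), name))
    return max(candidates)[1] if candidates else None
```

Logging the name returned by `latest_version` alongside each generated draft is one way to make the trace-back described above a lookup rather than a guess.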
