Runaway Audit

Module 8, Lesson 8.5 · nine-check freeze gate · one copy per pipeline, re-run on every structural change

Save this as: /capstone/pipeline-v1/audit.md (or print). Re-run the audit whenever you add an agent, widen a tool allowlist, or change a retry policy, and again on the next-review date.

Freeze rule: every check must pass before the pipeline is copied from pipeline-v1-draft/ to pipeline-v1/. A single failing check stops the freeze. Fix, re-audit, then freeze.
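
If you track results in a script rather than on paper, the freeze rule can be sketched as below. This is a minimal sketch: the check numbers come from this worksheet, but the dict-of-booleans layout and the function name `freeze_decision` are assumptions made for illustration.

```python
# Hypothetical representation: check number -> True (PASS) / False (FAIL).
EXPECTED_CHECKS = set(range(1, 10))  # checks 1 through 9


def freeze_decision(results: dict[int, bool]) -> tuple[bool, list[int]]:
    """Return (ok_to_freeze, blocking_checks).

    A check blocks the freeze if it failed or was never recorded.
    """
    blocking = sorted(n for n in EXPECTED_CHECKS if not results.get(n, False))
    return (not blocking, blocking)


all_pass = {n: True for n in range(1, 10)}
one_fail = {**all_pass, 4: False}

print(freeze_decision(all_pass))  # (True, [])
print(freeze_decision(one_fail))  # (False, [4])
```

Treating an unrecorded check as blocking is a deliberate choice here: a skipped check should not be able to slip through as a pass.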

The audit is what turns a working pipeline into a safe pipeline.

Three runaway shapes (depth, fan-out, retry) and three rails (depth cap, fan-out cap, bounded retries). The nine checks below are the audit that proves every rail is in place.

Header

Pipeline name:

Path (frozen side):   ☐ Claude Code CLI   ☐ Cowork tab

Audit date:   Auditor:

Group A — Shape rails (checks 1–3)
Check 1 — Depth cap

Is the pipeline's supervisor depth ≤ 3?

Count the layers: a sequential pipeline has depth 1 (no supervisor). A single supervisor dispatching workers has depth 2. A meta-supervisor over supervisors has depth 3. Anything deeper is a reject.
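
The layer count can be sketched in code if it helps. The `Agent` class and its `workers` field below are hypothetical modelling choices, not part of any tool; only the cap of 3 and the depth examples come from this check.

```python
from dataclasses import dataclass, field

DEPTH_CAP = 3  # from Check 1


@dataclass
class Agent:
    """Hypothetical model: an agent plus the agents it dispatches to."""
    name: str
    workers: list["Agent"] = field(default_factory=list)


def layers(agent: Agent) -> int:
    """An agent with no workers is one layer; each supervising layer adds one."""
    if not agent.workers:
        return 1
    return 1 + max(layers(w) for w in agent.workers)


def pipeline_depth(top_level: list[Agent]) -> int:
    """Depth of the deepest top-level agent; sequential stages are all depth 1."""
    return max(layers(a) for a in top_level)


sequential = [Agent("research"), Agent("draft"), Agent("review")]
supervised = [Agent("supervisor", [Agent("worker-a"), Agent("worker-b")])]
too_deep = [Agent("meta", [Agent("sup", [Agent("sub", [Agent("worker")])])])]

for name, pipeline in [("sequential", sequential),
                       ("supervised", supervised),
                       ("too_deep", too_deep)]:
    d = pipeline_depth(pipeline)
    print(name, d, "PASS" if d <= DEPTH_CAP else "FAIL")
# sequential 1 PASS
# supervised 2 PASS
# too_deep 4 FAIL
```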

☐ PASS — depth =     ☐ FAIL — flatten before freeze.

Evidence / notes:

Check 2 — Fan-out cap

Does every parallel or hierarchical step fan out to ≤ 4 workers?

Count the widest fan-out in the pipeline. A sequential pipeline has fan-out 1 (each stage dispatches to the next). A supervisor spawning four workers is at the cap. Five or more is a reject — either split the work into two rounds or redesign.
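
A rough way to count the widest fan-out, and to split an over-wide step into rounds, is sketched here. The dispatcher-to-targets dict is an assumed representation, and `split_into_rounds` is just one possible reading of "split the work into two rounds".

```python
FAN_OUT_CAP = 4  # from Check 2


def max_fan_out(dispatch: dict[str, list[str]]) -> int:
    """Widest single dispatch (hypothetical dispatcher -> targets map)."""
    return max((len(targets) for targets in dispatch.values()), default=0)


def split_into_rounds(workers: list[str], cap: int = FAN_OUT_CAP) -> list[list[str]]:
    """Batch workers so that no round exceeds the cap."""
    return [workers[i:i + cap] for i in range(0, len(workers), cap)]


sequential = {"research": ["draft"], "draft": ["review"]}
wide = {"supervisor": ["w1", "w2", "w3", "w4", "w5"]}

print(max_fan_out(sequential))  # 1
print(max_fan_out(wide))        # 5
print(split_into_rounds(wide["supervisor"]))
# [['w1', 'w2', 'w3', 'w4'], ['w5']]
```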

☐ PASS — max fan-out =     ☐ FAIL.

Evidence / notes:

Check 3 — Shape is named in blueprint

Does blueprint.md explicitly name the pipeline shape and the depth/fan-out numbers?

"Shape: sequential" is not enough on its own; the blueprint should also state Depth cap: 3. Fan-out cap: 4. If the numbers are not written, a reviewer (future-you or a teammate) cannot confirm the rails without re-reading the agent definitions.

☐ PASS — blueprint states shape + caps.     ☐ FAIL — add the explicit statement, then freeze.

Evidence / notes:
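
For Check 3, a quick text scan of blueprint.md can flag a missing statement before a human reads it. This is only a sketch: the patterns assume the wording used in the examples above ("Shape:", "Depth cap:", "Fan-out cap:"), so adjust them if your blueprint phrases the caps differently.

```python
import re

# Assumed wording, taken from the examples in Check 3.
REQUIRED = {
    "shape": re.compile(r"Shape:\s*\S+", re.IGNORECASE),
    "depth cap": re.compile(r"Depth cap:\s*\d+", re.IGNORECASE),
    "fan-out cap": re.compile(r"Fan-out cap:\s*\d+", re.IGNORECASE),
}


def missing_statements(blueprint_text: str) -> list[str]:
    """Return the labels of required statements not found in the text."""
    return [label for label, pattern in REQUIRED.items()
            if not pattern.search(blueprint_text)]


shape_only = "Shape: sequential\n"
complete = "Shape: sequential\nDepth cap: 3. Fan-out cap: 4.\n"

print(missing_statements(shape_only))  # ['depth cap', 'fan-out cap']
print(missing_statements(complete))    # []
```

A clean scan does not replace the check itself; it only confirms the statements exist, not that the numbers match the pipeline.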

Group B — Tool and scope rails (checks 4–6)
Check 4 — Bounded retries

Does every handoff contract name a bounded failure mode?

Read every handoff contract. Each failure-mode field must be one of: stop, retry once, fall back, or skip. "Retry if it fails" or "retry until it works" is a reject. This is a common audit failure.
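
The four allowed values make this check easy to sketch as a lookup. The contract names and the name-to-failure-mode dict below are invented for illustration; only the four bounded modes come from this worksheet.

```python
BOUNDED_MODES = {"stop", "retry once", "fall back", "skip"}  # from Check 4


def unbounded_contracts(contracts: dict[str, str]) -> list[str]:
    """Return handoffs whose failure mode is not one of the bounded values.

    `contracts` maps a handoff name to its failure-mode text (hypothetical layout).
    """
    return [name for name, mode in contracts.items()
            if mode.strip().lower() not in BOUNDED_MODES]


contracts = {
    "research -> draft": "retry once",
    "draft -> review": "retry until it works",
    "review -> publish": "Stop",
}

print(unbounded_contracts(contracts))  # ['draft -> review']
```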

☐ PASS — all contracts have bounded failure modes.     ☐ FAIL.

Which contract failed (if any)?

Check 5 — Minimum tool allowlist per agent

Is each agent's tool allowlist restricted to the minimum it needs?

Walk every agent. A drafter does not need web_search. A reviewer does not need write_file outside its own stage folder. A research agent has no business with send_email. If any agent has tools it does not use for its named job, remove them and re-audit.
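
One way to walk the agents is a set difference between what each agent is allowed and what its named job needs. The sketch below assumes you have written both lists down by hand; `read_file` is a hypothetical tool name added for the example, and the real allowlist syntax lives in the recipe book.

```python
def excess_tools(allowlist: dict[str, set[str]],
                 needed: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return, per agent, the allowed tools that its job does not need."""
    excess = {}
    for agent, tools in allowlist.items():
        extra = tools - needed.get(agent, set())
        if extra:
            excess[agent] = extra
    return excess


# Hypothetical agents and tool lists.
allowlist = {
    "drafter": {"read_file", "write_file", "web_search"},
    "reviewer": {"read_file", "write_file"},
}
needed = {
    "drafter": {"read_file", "write_file"},
    "reviewer": {"read_file", "write_file"},
}

print(excess_tools(allowlist, needed))  # {'drafter': {'web_search'}}
```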

☐ PASS — every agent is minimum-tool.     ☐ FAIL.

Which agent failed (if any)? What got removed?

Check 6 — File-write scope per agent

Is each agent's write access restricted to its own stage folder?

A stage-02 drafter writes to 02-draft/ only; it does not write to 01-research/ or 03-review/. A stage-03 reviewer writes to 03-review/ only. This prevents an agent from accidentally corrupting an upstream or downstream stage's output.
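
If you have a list of paths an agent wrote or may write, the scope rule can be sketched as a path containment test. This is a lexical check only, under the assumption of POSIX-style relative paths; it does not follow symlinks and is not a substitute for the tool's own write restrictions.

```python
import posixpath
from pathlib import PurePosixPath


def inside_stage(stage_folder: str, target: str) -> bool:
    """True if `target`, after collapsing '..' segments, lies under `stage_folder`."""
    stage = PurePosixPath(posixpath.normpath(stage_folder))
    path = PurePosixPath(posixpath.normpath(target))
    return stage in path.parents


print(inside_stage("02-draft/", "02-draft/outline.md"))               # True
print(inside_stage("02-draft/", "01-research/notes.md"))              # False
print(inside_stage("02-draft/", "02-draft/../03-review/verdict.md"))  # False
```

Comparing path components rather than string prefixes keeps a folder such as 02-draft-old/ from being mistaken for 02-draft/.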

☐ PASS — every agent writes only to its own stage folder.     ☐ FAIL.

Which agent failed (if any)?

Group C — Observability rails (checks 7–9)
Check 7 — Pre-flight estimate within budget ceiling

Does the current pre-flight estimate sit below the per-invocation ceiling?

Read the Pre-Flight Cost Estimate worksheet. The per-invocation total must be less than the per-invocation ceiling (default: 2× the estimate). If the estimate is 80% of the ceiling or higher, either raise the ceiling with a documented reason or tighten the pipeline.
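
The arithmetic is a single ratio. The sketch below uses made-up amounts, and the three status labels ("pass", "review", "fail") are naming assumptions; "review" marks the 80% zone where this check asks for a documented decision.

```python
def budget_status(estimate: float, ceiling: float,
                  warn_ratio: float = 0.80) -> tuple[str, float]:
    """Return (status, estimate / ceiling) for one invocation."""
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    ratio = estimate / ceiling
    if ratio >= 1.0:          # estimate must be strictly below the ceiling
        return ("fail", ratio)
    if ratio >= warn_ratio:   # close to the ceiling: act before freezing
        return ("review", ratio)
    return ("pass", ratio)


# Hypothetical amounts, in whatever unit the cost worksheet uses.
print(budget_status(1.00, 2.00))  # ('pass', 0.5)
print(budget_status(1.70, 2.00))  # ('review', 0.85)
print(budget_status(2.10, 2.00))  # ('fail', 1.05)
```

With the default ceiling of 2× the estimate, a fresh estimate sits at a ratio of 0.5, so reaching 0.8 means the estimate has grown since the ceiling was set.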

☐ PASS — estimate / ceiling = ____.     ☐ FAIL.

Estimate date:

Check 8 — Kill switch practiced

Has the kill switch been practiced at least once, with observable confirmation?

A kill switch that has never been triggered is not a kill switch. Practice at least one cold-start stop: fire the pipeline, wait until the first stage is mid-run, stop it, and confirm the pipeline is not still running in the background. Note the command or clicks, the observable confirmation, and the recovery in the blueprint.

☐ PASS — practiced on ____.     ☐ FAIL.

Kill command / clicks:

Check 9 — Next-review date set

Is a next-review date written in the blueprint, ≤ 90 days from today?

A frozen pipeline is not a forgotten pipeline. The blueprint must name a specific calendar date on which the estimate and audit will be re-run. If no date is set, the pipeline silently ages and the audit decays. If the date is more than 90 days out, pull it in.
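
The 90-day window is simple date arithmetic. In this sketch the function name is hypothetical, and rejecting a date that has already passed is an added assumption, since an overdue review date does not protect the pipeline either.

```python
from datetime import date, timedelta

MAX_REVIEW_GAP = timedelta(days=90)  # from Check 9


def review_date_ok(next_review: date, today: date) -> bool:
    """True when the date is not in the past and at most 90 days out."""
    return today <= next_review <= today + MAX_REVIEW_GAP


today = date(2025, 1, 1)  # example audit date
print(review_date_ok(date(2025, 3, 1), today))   # True
print(review_date_ok(date(2025, 6, 1), today))   # False
print(review_date_ok(date(2024, 12, 1), today))  # False
```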

☐ PASS — next review = ____.     ☐ FAIL.

Where the review date is surfaced (calendar reminder, blueprint.md):

Freeze gate

Did all nine checks pass?

☐   Yes — freeze. Copy /capstone/pipeline-v1-draft/<picked-path>/ to /capstone/pipeline-v1/. Rename the un-picked path to /capstone/pipeline-v1-draft/<other>-sibling/. Write the retrospective and file this audit as /capstone/pipeline-v1/audit.md.

☐   No — do not freeze yet. Name the failed check, fix, and re-audit from check 1. A single failing check stops the freeze. Note below which check failed and what you are doing about it:

Retrospective pointer

After the audit passes and the pipeline freezes, write a short retrospective at /capstone/pipeline-v1/retrospective.md. Two paragraphs is enough: what surprised you between invocation-1 and the frozen version, and which rail (depth, fan-out, retry, tool allowlist, budget ceiling) was the most load-bearing for your specific pipeline.

This worksheet accompanies Lesson 8.5 of AI Architect Academy. The nine checks and the freeze gate are concept material. Specific tool-allowlist syntax and retry-policy flags are recipe material and live in /recipe-book/auditing-a-pipeline-before-freeze.md. Re-run this audit every time the pipeline's structure changes and on every next-review date.