Runaway Audit
Save this as: /capstone/pipeline-v1/audit.md (or print). Re-run the audit whenever you add an agent, widen a tool allowlist, or change a retry policy, and again on the next-review date.
Freeze rule: every check must pass before the pipeline is copied from pipeline-v1-draft/ to pipeline-v1/. A single failing check stops the freeze. Fix, re-audit, then freeze.
The audit is what turns a working pipeline into a safe pipeline.
There are three runaway shapes (depth, fan-out, retry) and three matching rails: a depth cap, a fan-out cap, and bounded retries. The nine checks below are the audit that proves every rail is in place.
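As a minimal sketch of the freeze rule, assume the nine results are recorded in a plain dictionary. The check labels and the helpers `failing_checks` and `may_freeze` are hypothetical names chosen for illustration; they are not part of the course materials.

```python
def failing_checks(results: dict[str, bool]) -> list[str]:
    """Return the names of checks that did not pass, in the order recorded."""
    return [name for name, passed in results.items() if not passed]


def may_freeze(results: dict[str, bool], expected: int = 9) -> bool:
    """Freeze only when all expected checks were run and every one passed."""
    return len(results) == expected and not failing_checks(results)


# Hypothetical audit record; the labels are illustrative shorthand.
audit = {
    "depth": True,
    "fan-out": True,
    "blueprint states shape and caps": True,
    "bounded failure modes": False,  # one failure is enough to stop the freeze
    "minimum tool allowlist": True,
    "write access per stage": True,
    "estimate below ceiling": True,
    "kill switch practiced": True,
    "next-review date": True,
}
print(may_freeze(audit), failing_checks(audit))
```

Requiring the full count as well as all-pass means a partially completed audit cannot slip through the gate.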
Header
Pipeline name:
Path (frozen side): ☐ Claude Code CLI ☐ Cowork tab
Audit date: Auditor:
Is the pipeline's supervisor depth ≤ 3?
Count the layers: a sequential pipeline has depth 1 (no supervisor). A single supervisor dispatching workers has depth 2. A meta-supervisor over supervisors has depth 3. Anything deeper is a reject.
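The layer count above can be sketched as a small recursive function. The `Agent` record and its `workers` field are assumptions made for this example, not a format the pipeline tooling defines; a sequential pipeline is modeled as a list of stages that supervise nothing.

```python
from dataclasses import dataclass, field

DEPTH_CAP = 3  # the cap named in this check


@dataclass
class Agent:
    """Hypothetical agent record: a name plus the agents it supervises."""
    name: str
    workers: list["Agent"] = field(default_factory=list)


def layers(agent: Agent) -> int:
    """Count layers from this agent down: a leaf is 1, each supervisor adds 1."""
    return 1 + max((layers(w) for w in agent.workers), default=0)


def pipeline_depth(stages: list[Agent]) -> int:
    """Depth of the pipeline is the deepest of its top-level stages."""
    return max((layers(s) for s in stages), default=0)


sequential = [Agent("research"), Agent("draft"), Agent("review")]
supervised = [Agent("supervisor", [Agent("worker-a"), Agent("worker-b")])]
meta = [Agent("meta", [Agent("sup-1", [Agent("worker-a")]),
                       Agent("sup-2", [Agent("worker-b")])])]

for shape in (sequential, supervised, meta):
    depth = pipeline_depth(shape)
    print(depth, "PASS" if depth <= DEPTH_CAP else "FAIL")
```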
☐ PASS — depth = ____ ☐ FAIL — flatten before freeze.
Evidence / notes:
Does every parallel or hierarchical step fan out to ≤ 4 workers?
Count the widest fan-out in the pipeline. A sequential pipeline has fan-out 1 (each stage dispatches to the next). A supervisor spawning four workers is at the cap. Five or more is a reject — either split the work into two rounds or redesign.
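A rough way to count the widest fan-out, assuming the pipeline is written down as a mapping from each agent to the agents it dispatches to. The mapping format and the helper names are hypothetical. The second helper illustrates the "split the work into rounds" option for an over-wide step.

```python
FAN_OUT_CAP = 4  # the cap named in this check


def max_fan_out(dispatch: dict[str, list[str]]) -> int:
    """Widest fan-out: the largest number of agents any one agent dispatches to."""
    return max((len(targets) for targets in dispatch.values()), default=0)


def split_into_rounds(workers: list[str], cap: int = FAN_OUT_CAP) -> list[list[str]]:
    """Split an over-wide worker list into rounds of at most `cap` workers."""
    return [workers[i:i + cap] for i in range(0, len(workers), cap)]


# Each sequential stage dispatches only to the next one, so fan-out is 1.
sequential = {"research": ["draft"], "draft": ["review"], "review": []}
# A supervisor with four workers sits exactly at the cap.
supervised = {"supervisor": ["w1", "w2", "w3", "w4"]}

print(max_fan_out(sequential), max_fan_out(supervised))
print(split_into_rounds(["w1", "w2", "w3", "w4", "w5"]))
```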
☐ PASS — max fan-out = ____ ☐ FAIL.
Evidence / notes:
Does blueprint.md explicitly name the pipeline shape and the depth/fan-out numbers?
"Shape: sequential" is not enough on its own; the blueprint must also state the caps, for example "Depth cap: 3. Fan-out cap: 4." If the numbers are not written down, a reviewer (future-you or a teammate) cannot confirm the rails without re-reading the agent definitions.
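If the blueprint uses the wording from the example phrases in this check, a quick text scan can flag a missing statement. This is only a sketch: the exact phrasing "Shape:", "Depth cap:", and "Fan-out cap:" is assumed to be how your blueprint.md words it, and `missing_statements` is a hypothetical helper.

```python
import re

# Assumed wording, modeled on the example phrases in this check.
REQUIRED = {
    "shape": re.compile(r"Shape:\s*(\S+)", re.IGNORECASE),
    "depth_cap": re.compile(r"Depth cap:\s*(\d+)", re.IGNORECASE),
    "fan_out_cap": re.compile(r"Fan-out cap:\s*(\d+)", re.IGNORECASE),
}


def missing_statements(blueprint_text: str) -> list[str]:
    """Return the names of required statements not found in the blueprint text."""
    return [name for name, pattern in REQUIRED.items()
            if not pattern.search(blueprint_text)]


print(missing_statements("Shape: sequential"))
print(missing_statements("Shape: sequential. Depth cap: 3. Fan-out cap: 4."))
```

An empty list means all three statements are present; it does not confirm that the stated numbers match the agent definitions, which still needs a human read.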
☐ PASS — blueprint states shape + caps. ☐ FAIL — add the explicit statement, then freeze.
Evidence / notes:
Does every handoff contract name a bounded failure mode?
Read every handoff contract. Each failure-mode field must be one of: stop, retry once, fall back, or skip. "Retry if it fails" or "retry until it works" is a reject. Unbounded retry wording is a common audit failure.
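The four bounded values make this check easy to mechanize. The sketch below assumes each contract's failure-mode field has been copied into a dictionary keyed by handoff name; the dictionary layout and the handoff names are hypothetical.

```python
ALLOWED_FAILURE_MODES = {"stop", "retry once", "fall back", "skip"}


def unbounded_contracts(contracts: dict[str, str]) -> dict[str, str]:
    """Return the contracts whose failure mode is not one of the four bounded values."""
    return {
        name: mode
        for name, mode in contracts.items()
        if mode.strip().lower() not in ALLOWED_FAILURE_MODES
    }


# Hypothetical handoff names and failure-mode fields.
contracts = {
    "research -> draft": "retry once",
    "draft -> review": "retry until it works",  # unbounded: a reject
    "review -> publish": "Stop",
}
print(unbounded_contracts(contracts))
```

Matching against a closed set, rather than searching for the word "retry", is a deliberate choice: any wording outside the four values is rejected, including phrasings nobody anticipated.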
☐ PASS — all contracts have bounded failure modes. ☐ FAIL.
Which contract failed (if any)?
Is each agent's tool allowlist restricted to the minimum it needs?
Walk every agent. A drafter does not need web_search. A reviewer does not need write_file outside its own stage folder. A research agent has no business with send_email. If any agent has tools it does not use for its named job, remove them and re-audit.
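One way to walk every agent is a set difference between the tools granted and the tools the named job needs. The two dictionaries are assumed to be filled in by hand from the agent definitions. `read_file` is a hypothetical tool name added for the example; the other tool names come from the text above.

```python
def excess_tools(granted: dict[str, set[str]],
                 needed: dict[str, set[str]]) -> dict[str, set[str]]:
    """Per agent, the tools granted beyond what its named job needs."""
    extras = {}
    for agent, tools in granted.items():
        surplus = tools - needed.get(agent, set())
        if surplus:
            extras[agent] = surplus
    return extras


granted = {
    "drafter": {"read_file", "write_file", "web_search"},
    "researcher": {"web_search", "read_file"},
}
needed = {
    "drafter": {"read_file", "write_file"},
    "researcher": {"web_search", "read_file"},
}
print(excess_tools(granted, needed))  # the drafter's web_search should be removed
```

An agent missing from `needed` is treated as needing nothing, so every tool it holds is reported.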
☐ PASS — every agent is minimum-tool. ☐ FAIL.
Which agent failed (if any)? What got removed?
Is each agent's write access restricted to its own stage folder?
A stage-02 drafter writes to 02-draft/ only; it does not write to 01-research/ or 03-review/. A stage-03 reviewer writes to 03-review/ only. This prevents an agent from accidentally corrupting an upstream or downstream stage's output.
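A small path check can back up this rule, assuming you can list the paths an agent writes to (from its definition or a log). The helper name and the sample file names are hypothetical. Paths are normalized first so that a `..` segment cannot disguise a write into another stage.

```python
import posixpath
from pathlib import PurePosixPath


def writes_outside_stage(stage_folder: str, write_paths: list[str]) -> list[str]:
    """Return the write targets that do not sit inside the agent's stage folder."""
    stage = PurePosixPath(posixpath.normpath(stage_folder))
    offenders = []
    for raw in write_paths:
        target = PurePosixPath(posixpath.normpath(raw))
        if stage not in target.parents:
            offenders.append(raw)
    return offenders


drafter_writes = [
    "02-draft/draft.md",
    "01-research/notes.md",              # upstream stage: not allowed
    "02-draft/../03-review/verdict.md",  # escapes via "..": not allowed
]
print(writes_outside_stage("02-draft/", drafter_writes))
```

This only inspects path strings; it does not follow symlinks or enforce anything at run time.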
☐ PASS — every agent writes only to its own stage folder. ☐ FAIL.
Which agent failed (if any)?
Does the current pre-flight estimate sit below the per-invocation ceiling?
Read the Pre-Flight Cost Estimate worksheet. The per-invocation total must be less than the per-invocation ceiling (default: 2× the estimate). If the estimate is 80% of the ceiling or higher, either widen the ceiling with a documented reason or tighten the pipeline.
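The arithmetic for this check is a single ratio. The sketch below assumes the estimate and ceiling share one unit (dollars or tokens), and it treats the 80% band as a "REVIEW" verdict, which is this example's own label and not a term from the worksheet.

```python
from typing import Optional

DEFAULT_CEILING_MULTIPLE = 2.0  # default ceiling is 2x the estimate
REVIEW_THRESHOLD = 0.80         # at or above 80% of the ceiling, act


def budget_check(estimate: float, ceiling: Optional[float] = None) -> tuple[float, str]:
    """Return (estimate / ceiling, verdict) for one invocation."""
    if ceiling is None:
        ceiling = DEFAULT_CEILING_MULTIPLE * estimate
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    ratio = estimate / ceiling
    if ratio >= 1.0:
        return ratio, "FAIL: estimate is not below the ceiling"
    if ratio >= REVIEW_THRESHOLD:
        return ratio, "REVIEW: widen the ceiling with a documented reason or tighten the pipeline"
    return ratio, "PASS"


print(budget_check(1.5))       # default ceiling of 3.0 gives a ratio of 0.5
print(budget_check(4.0, 5.0))  # ratio 0.8 lands in the review band
```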
☐ PASS — estimate / ceiling = ____ ☐ FAIL.
Estimate date:
Has the kill switch been practiced at least once, with observable confirmation?
A kill switch that has never been triggered is not a kill switch. Practice at least one cold-start stop: fire the pipeline, wait until the first stage is mid-run, stop it, and confirm the pipeline is not still running in the background. Note the command or clicks, the observable confirmation, and the recovery in the blueprint.
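The practice run has the same shape whatever the stop mechanism is: start, wait until mid-run, stop, confirm. The sketch below rehearses that shape against a stand-in process (a Python process that sleeps). It is not the stop procedure for Claude Code CLI or the Cowork tab, and `practice_stop` is a hypothetical helper.

```python
import subprocess
import sys
import time


def practice_stop(command: list[str], run_for: float = 0.5, grace: float = 5.0) -> bool:
    """Start a process, stop it mid-run, and confirm that it has exited."""
    proc = subprocess.Popen(command)
    time.sleep(run_for)             # let the run get under way
    if proc.poll() is not None:
        return False                # it finished on its own; nothing was practiced
    proc.terminate()                # the stop action being rehearsed
    try:
        proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        proc.kill()                 # escalate if the polite stop is ignored
        proc.wait()
    return proc.poll() is not None  # observable confirmation: the process is gone


# Stand-in for a long-running pipeline: a Python process that sleeps for a minute.
stand_in = [sys.executable, "-c", "import time; time.sleep(60)"]

if __name__ == "__main__":
    print("stopped and confirmed:", practice_stop(stand_in))
```

This confirms only the process that was started directly. Anything it spawned in the background needs its own confirmation, which is the point of recording the observable check in the blueprint.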
☐ PASS — practiced on ____ ☐ FAIL.
Kill command / clicks:
Is a next-review date written in the blueprint, ≤ 90 days from today?
A frozen pipeline is not a forgotten pipeline. The blueprint must name a specific calendar date on which the estimate and audit will be re-run. If no date is set, the pipeline silently ages and the audit decays. If the date is more than 90 days away, pull it in.
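The 90-day window is simple date arithmetic. In this sketch, `review_date_ok` is a hypothetical helper, and rejecting a date that has already passed is an added assumption; the check itself only names the 90-day upper bound.

```python
from datetime import date, timedelta
from typing import Optional

MAX_REVIEW_GAP = timedelta(days=90)


def review_date_ok(next_review: Optional[date], today: date) -> bool:
    """True when a review date is set, is not in the past, and is within 90 days."""
    if next_review is None:
        return False  # no date set: the pipeline silently ages
    return today <= next_review <= today + MAX_REVIEW_GAP


audit_day = date(2025, 1, 10)  # illustrative audit date
print(review_date_ok(date(2025, 3, 1), audit_day))
print(review_date_ok(date(2025, 6, 1), audit_day))
print(review_date_ok(None, audit_day))
```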
☐ PASS — next review = ____ ☐ FAIL.
Where the review date is surfaced (calendar reminder, blueprint.md):
Freeze gate
Did all nine checks pass?
☐ Yes — freeze. Copy /capstone/pipeline-v1-draft/<picked-path>/ to /capstone/pipeline-v1/. Rename the un-picked path to /capstone/pipeline-v1-draft/<other>-sibling/. Write the retrospective and file this audit as /capstone/pipeline-v1/audit.md.
☐ No — do not freeze yet. Name the failed check, fix, and re-audit from check 1. A single failing check stops the freeze. Note below which check failed and what you are doing about it:
Retrospective pointer
After the audit passes and the pipeline freezes, write a short retrospective at /capstone/pipeline-v1/retrospective.md. Two paragraphs is enough: what surprised you between invocation-1 and the frozen version, and which rail (depth, fan-out, retry, tool allowlist, budget ceiling) was the most load-bearing for your specific pipeline.
This worksheet accompanies Lesson 8.5 of AI Architect Academy. The nine checks and the freeze gate are concept. Specific tool-allowlist syntax and retry-policy flags are recipe and live in /recipe-book/auditing-a-pipeline-before-freeze.md. Re-run this audit every time the pipeline's structure changes and on every next-review date.