Pre-Flight Cost Estimate
Save this as: /capstone/pipeline-v1/estimate.md (or print). Re-run the estimate whenever you add an agent, widen a tool allowlist, change a model, or on the next-review date (≤ 90 days).
When to fill: before each scheduled invocation starts firing, and any time the pipeline structure changes. The estimate is what turns "the pipeline runs" into "the pipeline runs inside a known budget."
Pre-flight cost estimate = agents × tokens per run × price per token.
If actuals exceed the estimate by 2× on any run, stop the pipeline and re-estimate before the next run. A runaway you noticed in the estimate is free; a runaway you noticed in the bill is not.
Part 1 — The formula
Per-stage cost = (input tokens + output tokens) × price per token for the stage's model.
stage_cost = (input_tokens + output_tokens) × price_per_token
Per-invocation cost = sum of per-stage costs across all stages that fire in one pipeline run (include every retry the contract permits; zero retries is the safest assumption).
invocation_cost = sum(stage_costs)
Per-run budget ceiling = the dollar amount at which you stop the pipeline mid-invocation and investigate. Reasonable default: 2× the estimated per-invocation cost.
Part 2 — Per-stage estimate table
One row per agent. Token counts are rough; use the upper bound a typical invocation might hit, not the average. Local models that run on your own hardware have zero per-token price but non-zero electricity and time cost — note them in the last column as "local, ~<minutes> min".
| Stage / agent | Model family | Input tokens (est) | Output tokens (est) | Price per 1K tokens | Stage cost | Wall-clock (est) |
|---|---|---|---|---|---|---|
| TOTAL per invocation | ||||||
Part 3 — Reference pipeline estimate (from Lesson 8.5)
For comparison. A three-agent sequential research → draft → review pipeline with cloud-default models came out like this in the lesson walkthrough. Your numbers will differ; the shape of the estimate should not.
| Stage | Input tok | Output tok | Price / 1K | Stage cost |
|---|---|---|---|---|
| research-sweeper | 8,000 | 1,500 | ~$0.0073 | ~$0.069 |
| brief-drafter | 3,500 | 1,200 | ~$0.0073 | ~$0.034 |
| brief-reviewer | 3,000 | 1,100 | ~$0.0073 | ~$0.030 |
| Per-invocation total | ~$0.133 | |||
Reference only. Token prices move. The Recipe Book entry pre-flight-estimating-ai-pipelines.md carries current cloud-default prices, dated. For Module 8 estimates, look up the price the week you estimate; do not rely on this table as current.
Part 4 — Frequency and budget projection
How often the pipeline runs multiplies the cost. A $0.13 pipeline that fires once a day is $48/year; a $1.30 pipeline that fires five times a day is $2,372/year. Name the frequency honestly before you set the budget.
How often does this pipeline run?
☐ On demand (student triggers) — rough runs per week:
☐ Daily ☐ Weekly ☐ Other:
Monthly budget projection
Per-invocation cost × invocations per month. Include expected retries if the contract permits any (for bounded-retry stages, count the retry as a second full stage cost).
Invocations per month:
Estimated monthly cost:
Annual cost (x12):
Part 5 — Budget ceilings (the freeze gate)
The three ceilings
1. Per-invocation ceiling: the single-run cost at which the pipeline auto-stops and alerts. Default: 2× the estimated per-invocation cost.
2. Per-day ceiling: the total across all invocations in one day. Default: 2× (estimate × runs per day).
3. Per-month ceiling: the figure at which you re-estimate and re-audit before next run. Default: 1.5× monthly projection.
If any ceiling is hit, the pipeline is not broken; the estimate was wrong. Re-estimate, re-audit, then decide whether to raise the ceiling (with a documented reason) or tighten the pipeline.
My ceilings
Per-invocation ceiling:
Per-day ceiling:
Per-month ceiling:
Where the ceilings are enforced (dashboard alert, billing cap, manual check):
Part 6 — When to re-run this estimate
- Before first invocation of a frozen pipeline.
- Whenever you add an agent, widen a tool allowlist, or change a model.
- After any invocation whose actuals exceed the estimate by 2×.
- On the next-review date (no more than 90 days from last freeze).
- After any month where the per-month ceiling fired.
Part 7 — Copy into the blueprint
Once filled, copy the three ceilings (Part 5) and the per-invocation total (Part 2) into /capstone/pipeline-v1/blueprint.md under Budget and kill switch. The estimate worksheet itself lives at /capstone/pipeline-v1/estimate.md — refresh it, do not throw away old versions.
- Per-stage estimate table filled (Part 2).
- Per-invocation total computed.
- Frequency and monthly projection filled (Part 4).
- Three ceilings chosen and written down (Part 5).
- Estimate copied into blueprint's Budget section.
- Next-review date set (≤ 90 days).
This worksheet accompanies Lesson 8.5 of AI Architect Academy. The formula, the three ceilings, and the re-run triggers are concept. Current per-token prices are recipe and live in /recipe-book/pre-flight-estimating-ai-pipelines.md, dated. Always look up current prices the week you estimate; do not use the reference numbers as current.