Module 6 · Lesson 6.5

Reliability, cost, and retirement.

Automations fail more quietly than they succeed. Three reliability rails — observability, retries, the failure-mode hierarchy. The monthly-cost math that turns $3 into $360. Five retirement signals. A three-step ritual — disable, revoke, archive — that kills a task cleanly. One more student-designed task (entry 5), then the register freezes into automation-register-v1.md and ships to the capstone.

Stage 1 of 3

Read & Understand

5 concept blocks

Observability: a log line, an artifact header, a next-run expectation CORE

In the first four lessons of Module 6 you built automations that produce. Lesson 6.5 is about keeping those automations running reliably over time, which is a different discipline. Reliability’s first rail is observability — the practice of leaving enough evidence of each run that you can tell, in thirty seconds, whether your automations are healthy.

The Module 6 observability minimum is three things, per task:

  • A log line per run. Successful, failed, or no-op — every run writes one line to the task’s log file. ISO-8601 timestamp, task name, status (success, no-trigger, error), artifact path (if any), cost in dollars, a one-sentence message. A log you can tail -n 30 and understand. If your task cannot explain what it did in one line, the task is too big or the log is lazy.
  • An artifact header. Every artifact a task produces opens with a header block: run timestamp, task name, run ID, the idempotency key that names it, what the agent searched, how many items it read, how much it cost, which section-5 health notes apply. You saw this in Lessons 6.2 and 6.3 piece-by-piece; Lesson 6.5 formalizes it. An artifact without a header is not reviewable at 6 AM — you cannot distinguish a current run from a stale file.
  • A next-run expectation. Printed inside the artifact footer or the log entry: Next run expected at 2026-04-23T06:30 local. Knowing the expected next run lets you notice when a run is missing, which is the silent failure mode the log by itself will never tell you about. A log that stops growing is indistinguishable from a healthy quiet week unless you know when the next line should have appeared.

That’s the floor. You do not have to build a dashboard. You do not have to wire metrics into a time-series database. You do have to write the log line, add the header, and print the next-run expectation. If you do those three things for every task in your register, you have the basic observability most student automations skip.
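Under the scriptable path, the three-piece minimum can be sketched as one small helper. This is a minimal sketch, not the course's wrapper: the `write_log_line` name, the field order, and the separator are assumptions, and the daily next-run default only fits a once-per-day task.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical helper: append one log line per run, with the printed
# next-run expectation folded into the same line.
def write_log_line(log_path, task, status, artifact="", cost_usd=0.0,
                   message="", next_run=None):
    now = datetime.now(timezone.utc)
    # Daily-task assumption; a weekly task would pass next_run explicitly.
    next_run = next_run or now + timedelta(days=1)
    line = " | ".join([
        now.isoformat(timespec="seconds"),   # ISO-8601 timestamp
        task,
        status,                              # success / no-trigger / error
        artifact or "-",                     # artifact path, if any
        f"${cost_usd:.2f}",
        message,                             # the one-sentence message
        f"next run expected {next_run.isoformat(timespec='minutes')}",
    ])
    with open(log_path, "a") as f:
        f.write(line + "\n")
    return line
```

A `tail -n 30` of a file written this way reads as one run per line, which is the whole point of the floor.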

One more note: the log file itself needs housekeeping. A log that grows forever is a log nobody reads. Every task’s log is truncated or rotated when it exceeds a size threshold — the recipe uses a simple “keep last 30 days” rule by default. Lesson 6.5’s retirement ritual audits log files along with register rows.
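The “keep last 30 days” rule can be sketched as a small rotation pass, assuming each line starts with the ISO-8601 timestamp from the observability minimum; the `rotate_log` name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Drop log lines older than keep_days. Assumes tz-aware ISO-8601
# timestamps as the first " | "-separated field of each line.
def rotate_log(log_path, keep_days=30):
    cutoff = datetime.now(timezone.utc) - timedelta(days=keep_days)
    with open(log_path) as f:
        lines = f.readlines()
    kept = []
    for line in lines:
        try:
            ts = datetime.fromisoformat(line.split(" | ")[0])
        except ValueError:
            kept.append(line)        # keep lines we cannot parse
            continue
        if ts >= cutoff:
            kept.append(line)
    with open(log_path, "w") as f:
        f.writelines(kept)
    return len(lines) - len(kept)    # number of lines dropped
```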

Retries and the failure-mode hierarchy CORE

Scheduled tasks fail. They fail because the network was flaky, because an API key rotated, because a scope was revoked, because the model provider had a five-minute incident, because the laptop was asleep, because the prompt returned something the agent could not parse. The question is not how to avoid failures — you cannot — but how your task behaves when they happen.

The failure-mode hierarchy tells you which failures to treat which way:

  • Transient failures — retry once. Network hiccups, a 429 rate-limit, a brief model-provider outage. The right response is one retry with a short backoff (30–60 seconds), then either success or escalation to the next tier. The recipe’s wrapper scripts handle this for you in the scriptable path; Cowork’s scheduled tasks retry similarly. Do not retry forever — three retries is still conservative; an hour of retries is an outage.
  • Input failures — fail loud, do not retry. The scope the agent was told to read is gone (label deleted, file moved, credentials revoked). Retrying will produce the same failure. The task should log the error, write no artifact, and either send a self-addressed alert or just let last success go stale in the register so the next review catches it.
  • Output failures — fail loud, preserve partial output. The task produced something, but not the right shape (four sections instead of five, missing evidence links). The right response is to write what was produced to an attempted path — brief-2026-04-22.attempt.md — log the shape-mismatch, and not replace the last-good artifact with the broken one. The student sees the attempt on the next review and decides what to do; they do not wake up to a corrupted file in place of yesterday’s usable one.
  • Catastrophic failures — stop the task. Credentials have been revoked, the underlying model has been deprecated, the scope no longer exists. One retry and then stop — the task pauses until a human audits it. This is the same shape as a past-due review date: a paused task is better than a task silently producing nothing.

The rule of thumb: retry transient, fail loud on input/output, stop on catastrophic. Your scripts and Cowork both support this; the discipline is writing the prompt and the wrapper so the failure modes actually land where the hierarchy says.
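The hierarchy can be sketched as a wrapper around the task function. The exception classes are hypothetical stand-ins for whatever your scripts actually raise; only the routing (retry transient once, fail loud on input, stop on catastrophic) is the point.

```python
import time

# Hypothetical failure classes; map your wrapper's real errors onto these.
class TransientError(Exception): pass      # 429s, network hiccups
class InputError(Exception): pass          # scope gone, key revoked
class CatastrophicError(Exception): pass   # model deprecated, creds dead

def run_with_hierarchy(task_fn, backoff_seconds=30):
    try:
        return ("success", task_fn())
    except TransientError:
        time.sleep(backoff_seconds)        # one retry, short backoff
        try:
            return ("success", task_fn())
        except Exception as e:
            return ("error", f"loud failure after retry: {e}")
    except InputError as e:
        return ("error", f"input failure, no retry: {e}")
    except CatastrophicError as e:
        return ("stopped", f"task paused pending audit: {e}")
```

The returned status is what the log line records; "stopped" is the paused-until-audited state, not a crash.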

One frequent trap: students catch and swallow errors because “I don’t want the task to crash.” A task that silently reports success on failure is the worst possible outcome — it looks healthy in the register, it has a fresh last run, and it is quietly producing nothing useful. Loud failures are a feature; make them louder, not quieter.

Cost is a rate, not a session CORE

You already set a cost ceiling per task in the register. Lesson 6.5 makes you compute the rate and check it against reality.

Per-run cost is the dollars a single firing of the task spent. Get this from your model provider’s billing page or the cost figure the task logs in its artifact header. For the tasks you built:

  • Morning brief: typically $0.03–$0.15 per run depending on inbox volume and model. Once per day.
  • Weekly report: typically $0.15–$0.60 per run depending on commit volume and topic scope. Once per week.
  • Research refresh: typically $0.05–$0.25 per run. Once per week, baseline + scheduled.
  • Watcher: typically $0.005–$0.03 per run (most runs produce nothing, so cost is the classification pass). Multiple runs per day.

Monthly cost is per-run × runs per month. Do the math for each task. Add them up. Compare to the total-automation ceiling you wrote in your Automation Posture statement in Lesson 6.1.
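The arithmetic is worth making explicit. The sketch below uses the midpoints of the per-run ranges quoted above and approximate firing counts (30 days, about 4.3 weeks per month); the watcher's six firings a day is an assumption, and your register's actuals will differ.

```python
# Monthly cost = per-run cost x runs per month, summed across the register.
# Per-run figures are midpoints of the ranges above, not measured actuals.
register = {
    "morning-brief":    {"per_run": 0.09,   "runs_per_month": 30},
    "weekly-report":    {"per_run": 0.375,  "runs_per_month": 4.3},
    "research-refresh": {"per_run": 0.15,   "runs_per_month": 4.3},
    "watcher":          {"per_run": 0.0175, "runs_per_month": 30 * 6},  # ~6 firings/day assumed
}

def monthly_total(register):
    return sum(t["per_run"] * t["runs_per_month"] for t in register.values())

total = monthly_total(register)
# Compare against the total-automation ceiling in your Automation Posture.
print(f"total = ${total:.2f}/month")
```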

These totals look small. The reason the course still puts a ceiling on every task is that students routinely add a second watcher, then a third, then start running the research refresh daily instead of weekly, then add a second morning brief scoped to a different topic — and three months later they are paying $60/month for automations they have stopped reading. The ceiling forces the math to happen before the bill arrives, not after.

The rate-vs-session difference matters. A $3 chat session is a decision the student made once and paid for once. A $3-a-month automation is a decision the student made once and will pay for in perpetuity until they retire it. $3 a month for ten years is $360 if the task runs that long. Automations age into real money if they are not retired when they stop earning. This is one of the main reasons the retirement ritual is compulsory, not optional.

The five retirement signals CORE

Every scheduled task eventually stops being worth running. There are five signals that a task has reached that point, each named and specific so you can recognize it without waiting to find out the hard way:

  1. Named event. The retirement trigger in the register fired. The court case was decided; the deadline passed; the project ended; the competition concluded. When a named event fires, retirement is automatic, not optional. This is the easy case.
  2. Stale last-success. Last run is current (scheduler is firing) but last success has gone stale past your own threshold. For briefs and reports this typically means you have stopped reading them. For watchers this is the alert-fatigue signal from Lesson 6.4. Retire or redesign; do not keep a task running into a void.
  3. Silent drift. The task still produces artifacts, but the artifacts no longer answer the question the task was built for. A morning brief whose “three things” have degenerated into three newsletters is drifting. A research refresh whose diff contract matches less and less of the real news-shape of the topic is drifting. Drift catches you at a review date if you look; it does not announce itself.
  4. Ceiling breach. The task’s monthly cost exceeded the ceiling and you cannot explain why. Either the underlying data volume grew (inbox tripled, commit history exploded) or the prompt grew, or the model was upgraded with a different price point. Pause, audit, resume — or retire.
  5. Human-attention bankruptcy. You have too many tasks. Not because any single task is wrong, but because the aggregate demand on your review time exceeds what you actually have. The rule of thumb: if your total automation suite demands more weekly review time than you wrote into your Automation Posture as your budget, it is over-built. Retire the lowest-leverage task.

The retirement ritual in the worksheet asks you to check every register entry against all five signals. Expect to retire at least one task during Lesson 6.5, even if everything is working. A module where nothing gets retired is a module where the discipline did not land.

Revoking schedule-bound credentials CORE

Module 5 installed the revocation ritual for email/calendar permissions: when a task is done, access is withdrawn, not left running. Module 6 extends the ritual to the specific case of scheduled tasks.

When you retire a scheduled task, three things happen, in order:

  1. Disable the scheduler entry. Remove the cron line, unload the launchd plist, disable the Windows Task Scheduler entry, or delete the Cowork scheduled task. Do this first — the task does not fire tomorrow.
  2. Revoke the credentials the task was using. This is the step most students skip. The OAuth scope the task held, the API key that was specific to the task, the MCP permission that was granted — each is withdrawn. If the credential is shared across tasks, reduce the scope to match what the remaining tasks actually need.
  3. Archive the artifacts and the log. Move the task’s folder to ~/ai-architect-academy/automation/retired/<task-name>-YYYY-MM-DD/. Leave the register row in place with status Retired <date> and a one-sentence post-mortem; do not delete the row. Future-you will want to remember what this task did and why it retired.

Students rebuild tasks they previously retired far more often than they expect. The archive path is small and makes restarting trivial. Deleting the task entirely is the anti-pattern.
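Step 3 of the ritual can be scripted; steps 1 and 2 stay manual because they are platform- and credential-specific. Below is a minimal sketch of the archive move, assuming the lesson's retired/<task-name>-YYYY-MM-DD layout; the `archive_task` name is hypothetical.

```python
import shutil
from datetime import date
from pathlib import Path

# Step 3 only: move the task folder under retired/, dated. The register row
# stays in place and is edited by hand to "Retired <date>" with a post-mortem.
def archive_task(task_dir, retired_root="~/ai-architect-academy/automation/retired"):
    task_dir = Path(task_dir).expanduser()
    dest = Path(retired_root).expanduser() / f"{task_dir.name}-{date.today():%Y-%m-%d}"
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(task_dir), str(dest))
    return dest
```

Because the move is a rename rather than a delete, restarting a rebuilt task later is a move back plus a re-enabled scheduler entry.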

The specific credential-revocation question — “does this task have a token I should kill?” — is a good instinct to bring into every retirement. For Cowork-tab scheduled tasks the credentials are session-scoped and retire with the task; for Optional Advanced (CLI + OS scheduler) tasks the credentials may live in ~/.ai-directed/credentials/ and outlast the cron entry unless you revoke them.

Stage 2 of 3

Try & Build

2 recipes + activity

Try it — Retirement ritual, design-and-run entry 5, and the capstone freeze CORE

Part A — Retirement ritual.

Use /resources/module-06/retirement-ritual.md.

  1. Open automation-register-v1-draft.md. Open the retirement-ritual worksheet.
  2. For each row (entries 1–4 from Lessons 6.2–6.4):
    • Compute the monthly cost. Compare to the ceiling. Flag any breach.
    • Check the five retirement signals. Decide keep, redesign, or retire.
    • Audit the idempotency key — can you articulate it in one sentence? Does the artifact path actually use it?
    • Re-read the retirement trigger column. Is it still specific, or has it gone vague?
  3. If you flagged retire for any row, run the retirement recipe above. Do not skip the credential-revocation step.
  4. If you flagged redesign, tighten the prompt, update the register, and note the change in the prompt-change log. Do not roll to a new register version yet — this is edit-in-place.

Part B — Design and run entry 5.

  1. Pick a task from the idea list above, or invent one. Confirm it passes audience = only you and the response test (if it’s a watcher-shaped task).
  2. Fill in the register row fully before you build. Including the idempotency key and the retirement trigger.
  3. Build it via whichever recipe path fits best.
  4. Run it on-demand. Then wait for the next scheduled interval and run it again (or force-fire a second time for a testable idempotency check). Confirm the second run overwrote or left-alone correctly.
  5. File register entry 5.
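The double-run check in step 4 hinges on the idempotency key naming the artifact path, so a second firing overwrites instead of duplicating. A minimal sketch, assuming a date-keyed task; `artifact_path` and `check_idempotent` are hypothetical names.

```python
from pathlib import Path

# Idempotency key = task name + run date, baked into the artifact filename.
def artifact_path(task, run_date, out_dir="artifacts"):
    return Path(out_dir) / f"{task}-{run_date}.md"

# Simulate two firings on the same day; idempotent means exactly one
# new file exists afterward, its content that of the second run.
def check_idempotent(task, run_date, out_dir="artifacts"):
    p = artifact_path(task, run_date, out_dir)
    p.parent.mkdir(parents=True, exist_ok=True)
    before = sorted(p.parent.glob(f"{task}-*.md"))
    p.write_text(f"# {task} {run_date}\n")             # first firing
    p.write_text(f"# {task} {run_date} (rerun)\n")     # forced second firing
    after = sorted(p.parent.glob(f"{task}-*.md"))
    return len(after) == len(before) + 1
```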

Part C — Freeze the capstone.

  1. Review the full register — six entries, every column populated, Automation Posture updated if your numbers changed, at least one retirement or one redesign documented.
  2. Rename automation-register-v1-draft.md to automation-register-v1.md and save it to /capstone/.
  3. Add a one-paragraph Module 6 reflection at the top of the frozen register. Three to five sentences: what surprised you, what you retired, what entry 5 is for, what you’ll do with this register after Module 6 ends.
  4. Update my-first-loop.md’s Automation Posture section if anything has shifted (you probably raised or lowered your per-task ceiling once you saw real numbers).

Deliverable. /capstone/automation-register-v1.md frozen, with six entries and the reflection paragraph. At least one retirement recorded. /capstone/automation-artifacts/ populated with representative output from every entry. my-first-loop.md updated. This is the sixth frozen capstone artifact in the course.

Done with the hands-on?

When the recipe steps and any activity above are complete, mark this stage to unlock the assessment, reflection, and project checkpoint.

Stage 3 of 3

Check & Reflect

key concepts, quiz, reflection, checkpoint, instructor note

Quick check CORE

Five questions. Answers and reasoning follow each one.

Q1. The Module 6 observability minimum for a scheduled task is:
  • A — A dashboard with real-time metrics.
  • B — A log line per run, an artifact header, and a printed next-run expectation.
  • C — Every run sends you a notification.
  • D — A weekly health report.

Answer: B. The three pieces combine into the minimum that lets you distinguish a healthy quiet week from a broken task. (A) is over-engineering. (C) reproduces alert fatigue in the observability layer. (D) is fine but not a substitute for the three per-task pieces.

Q2. Your morning brief task fails with a 429 rate-limit error on one run. What is the correct failure-mode-hierarchy response?
  • A — Exit silently and try again tomorrow.
  • B — Retry once after a short backoff. If it still fails, log a loud error and do not write over yesterday’s artifact.
  • C — Retry infinitely until it succeeds.
  • D — Disable the task permanently.

Answer: B. 429s are the textbook transient failure. One retry with backoff, then loud failure if the second attempt fails. (A) is the silent-failure anti-pattern. (C) turns a transient failure into an outage and may burn through your budget. (D) overreacts to a recoverable error.

Q3. You compute that your total monthly automation cost is $8.40. Your Automation Posture ceiling is $10. Which of the following is the Module 6 response?
  • A — You are fine; keep everything as-is.
  • B — You are under ceiling but close. Audit the two tasks with the highest per-run cost, confirm they are still earning their keep, and consider whether you could scale one back (e.g., refresh from weekly to biweekly) if you are about to add entry 5 and entry 6 over the next months.
  • C — You are over ceiling; retire the most expensive task.
  • D — Raise the ceiling to $15 and keep going.

Answer: B. (A) is “trust the number,” the stale-cost anti-pattern — a healthy number today can drift to an over-ceiling number next month as data volume grows. (C) is false — you are under ceiling, not over. (D) is the “kick the can” anti-pattern that turns budgets into suggestions. The right posture is to notice headroom is tighter than last month and choose deliberately.

Q4. Which of the following is not one of the five retirement signals?
  • A — Named event (the task’s retirement trigger fired).
  • B — Stale last-success (the task is running but the student has stopped reading it).
  • C — Silent drift (the task produces artifacts that no longer answer the original question).
  • D — The task is more than six months old.

Answer: D. Age alone is not a retirement signal. An automation that has worked well for two years and still earns its keep is a good automation, not an obsolete one. The five real signals are named event, stale last-success, silent drift, ceiling breach, and human-attention bankruptcy.

Q5. When you retire a scheduled task, the three steps are:
  • A — Delete the cron entry; delete the folder; forget it existed.
  • B — Disable the scheduler entry; revoke the credentials the task used; archive the folder and mark the register row Retired with a post-mortem.
  • C — Ask the agent to retire itself.
  • D — Rename the task to retired and leave it running.

Answer: B. The three-step ritual is disable–revoke–archive. (A) skips the credential revocation (the step most students skip and most regret). (C) is not a real operation — the agent does not own the retirement. (D) leaves the task running, which is the opposite of retirement.

Reflection

In 8–10 sentences: Walk through your register as it stands right now — six entries, including your student-designed task. For each entry, name the one specific thing that would make you retire it. Then rank the entries by how long you expect each to last before it retires. Which is shortest-lived? Which is longest-lived? Now — honestly — predict which of the six entries you will actually still be running six months from now, and which you will have quietly let lapse without going through the retirement ritual. What does that prediction tell you about the automations you built?

The honest prediction is the important part. Most automations do not get retired cleanly; they just stop being useful and the student moves on. Naming, now, which ones you expect to fade is the cheapest way to catch them when they do.

Project checkpoint — capstone freeze

Freeze /capstone/automation-register-v1.md. The frozen file has:

  • A reflection paragraph at the top (3–5 sentences).
  • Six entries (0–5, where 0 is the empty register from Lesson 6.1’s baseline set-up; 1 = morning brief; 2 = weekly report; 3 = research refresh; 4 = watcher; 5 = student-designed).
  • At least one retirement recorded (status Retired <date> with a one-sentence post-mortem), even if the retirement was a task you built specifically for this lesson.
  • Every column populated. Idempotency keys named. Cost ceilings and actuals computed. Next-review dates inside 90 days.

Alongside the frozen register, /capstone/automation-artifacts/ contains representative outputs from each of the five live tasks. my-first-loop.md has the up-to-date Automation Posture section.

Capstone artifact 6 — automation-register-v1.md

Top: Module 6 reflection (3–5 sentences)

Row 0: baseline empty register (Lesson 6.1)

Row 1: morning brief (Lesson 6.2)

Row 2: weekly report (Lesson 6.3)

Row 3: research refresh (Lesson 6.3)

Row 4: watcher (Lesson 6.4)

Row 5: student-designed task (Lesson 6.5)

At least one row with status Retired <date> + one-sentence post-mortem.

This is the sixth frozen capstone artifact. Module 7 begins when the artifact is frozen.

Next in Module 6

End-of-module check.

A 15-point check that maps cleanly to the parent rubric: six multiple-choice items, two short-answer, two applied prompts that ask you to walk through the five moves and the retirement ritual on a register row you wrote this module. You pass at 11.5/15 with full credit on at least one applied question.

Take the end-of-module check →