Module 6 · Lesson 6.5

Reliability, cost, and retirement.

Automations fail more quietly than they succeed. Three reliability rails — observability, retries, the failure-mode hierarchy. The monthly-cost math that turns $3 into $360. Five retirement signals. A three-step ritual — disable, revoke, archive — that kills a task cleanly. One more student-designed task (entry 5), then the register freezes into automation-register-v1.md and ships to the capstone.

Stage 1 of 3

Read & Understand

5 concept blocks

Observability: a log line, an artifact header, a next-run expectation CORE

In the first four lessons of Module 6 you built automations that produce. Lesson 6.5 is about keeping those automations running reliably over time, which is a different discipline. Reliability’s first rail is observability — the practice of leaving enough evidence of each run that you can tell, in thirty seconds, whether your automations are healthy.

The Module 6 observability minimum is three things, per task:

  • A log line per run. Successful, failed, or no-op — every run writes one line to the task’s log file. ISO-8601 timestamp, task name, status (success, no-trigger, error), artifact path (if any), cost in dollars, a one-sentence message. A log you can tail -n 30 and understand. If your task cannot explain what it did in one line, the task is too big or the log is lazy.
  • An artifact header. Every artifact a task produces opens with a header block: run timestamp, task name, run ID, the idempotency key that names it, what the agent searched, how many items it read, how much it cost, which section-5 health notes apply. You saw this in Lessons 6.2 and 6.3 piece-by-piece; Lesson 6.5 formalizes it. An artifact without a header is not reviewable at 6 AM — you cannot distinguish a current run from a stale file.
  • A next-run expectation. Printed inside the artifact footer or the log entry: Next run expected at 2026-04-23T06:30 local. Knowing the expected next run lets you notice when a run is missing, which is the silent failure mode the log by itself will never tell you about. A log that stops growing is indistinguishable from a healthy quiet week unless you know when the next line should have appeared.

That’s the floor. You do not have to build a dashboard. You do not have to wire metrics into a time-series database. You do have to write the log line, add the header, and print the next-run expectation. If you do those three things for every task in your register, you have the basic observability most student automations skip.
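Under the scriptable path, the three-piece minimum can be sketched as one small helper. This is a minimal sketch, not the course's wrapper: the `write_log_line` name, the field order, and the separator are assumptions, and the daily next-run default only fits a once-per-day task.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical helper: append one log line per run, with the printed
# next-run expectation folded into the same line.
def write_log_line(log_path, task, status, artifact="", cost_usd=0.0,
                   message="", next_run=None):
    now = datetime.now(timezone.utc)
    # Daily-task assumption; a weekly task would pass next_run explicitly.
    next_run = next_run or now + timedelta(days=1)
    line = " | ".join([
        now.isoformat(timespec="seconds"),   # ISO-8601 timestamp
        task,
        status,                              # success / no-trigger / error
        artifact or "-",                     # artifact path, if any
        f"${cost_usd:.2f}",
        message,                             # the one-sentence message
        f"next run expected {next_run.isoformat(timespec='minutes')}",
    ])
    with open(log_path, "a") as f:
        f.write(line + "\n")
    return line
```

A `tail -n 30` of a file written this way reads as one run per line, which is the whole point of the floor.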

One more note: the log file itself needs housekeeping. A log that grows forever is a log nobody reads. Every task’s log is truncated or rotated when it exceeds a size threshold — the recipe uses a simple “keep last 30 days” rule by default. Lesson 6.5’s retirement ritual audits log files along with register rows.
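The “keep last 30 days” rule can be sketched as a small rotation pass, assuming each line starts with the ISO-8601 timestamp from the observability minimum; the `rotate_log` name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Drop log lines older than keep_days. Assumes tz-aware ISO-8601
# timestamps as the first " | "-separated field of each line.
def rotate_log(log_path, keep_days=30):
    cutoff = datetime.now(timezone.utc) - timedelta(days=keep_days)
    with open(log_path) as f:
        lines = f.readlines()
    kept = []
    for line in lines:
        try:
            ts = datetime.fromisoformat(line.split(" | ")[0])
        except ValueError:
            kept.append(line)        # keep lines we cannot parse
            continue
        if ts >= cutoff:
            kept.append(line)
    with open(log_path, "w") as f:
        f.writelines(kept)
    return len(lines) - len(kept)    # number of lines dropped
```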

Retries and the failure-mode hierarchy CORE

Scheduled tasks fail. They fail because the network was flaky, because an API key rotated, because a scope was revoked, because the model provider had a five-minute incident, because the laptop was asleep, because the prompt returned something the agent could not parse. The question is not how to avoid failures — you cannot — but how your task behaves when they happen.

The failure-mode hierarchy tells you which failures to treat which way:

  • Transient failures — retry once. Network hiccups, a 429 rate-limit, a brief model-provider outage. The right response is one retry with a short backoff (30–60 seconds), then either success or escalation to the next tier. The recipe’s wrapper scripts handle this for you in the scriptable path; Cowork’s scheduled tasks retry similarly. Do not retry forever — three retries is still conservative; an hour of retries is an outage.
  • Input failures — fail loud, do not retry. The scope the agent was told to read is gone (label deleted, file moved, credentials revoked). Retrying will produce the same failure. The task should log the error, write no artifact, and either send a self-addressed alert or just let last success go stale in the register so the next review catches it.
  • Output failures — fail loud, preserve partial output. The task produced something, but not the right shape (four sections instead of five, missing evidence links). The right response is to write what was produced to an attempted path — brief-2026-04-22.attempt.md — log the shape-mismatch, and not replace the last-good artifact with the broken one. The student sees the attempt on the next review and decides what to do; they do not wake up to a corrupted file in place of yesterday’s usable one.
  • Catastrophic failures — stop the task. Credentials have been revoked, the underlying model has been deprecated, the scope no longer exists. One retry and then stop — the task pauses until a human audits it. This is the same shape as a past-due review date: a paused task is better than a task silently producing nothing.

The rule of thumb: retry transient, fail loud on input/output, stop on catastrophic. Your scripts and Cowork both support this; the discipline is writing the prompt and the wrapper so the failure modes actually land where the hierarchy says.
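The hierarchy can be sketched as a wrapper around the task function. The exception classes are hypothetical stand-ins for whatever your scripts actually raise; only the routing (retry transient once, fail loud on input, stop on catastrophic) is the point.

```python
import time

# Hypothetical failure classes; map your wrapper's real errors onto these.
class TransientError(Exception): pass      # 429s, network hiccups
class InputError(Exception): pass          # scope gone, key revoked
class CatastrophicError(Exception): pass   # model deprecated, creds dead

def run_with_hierarchy(task_fn, backoff_seconds=30):
    try:
        return ("success", task_fn())
    except TransientError:
        time.sleep(backoff_seconds)        # one retry, short backoff
        try:
            return ("success", task_fn())
        except Exception as e:
            return ("error", f"loud failure after retry: {e}")
    except InputError as e:
        return ("error", f"input failure, no retry: {e}")
    except CatastrophicError as e:
        return ("stopped", f"task paused pending audit: {e}")
```

The returned status is what the log line records; "stopped" is the paused-until-audited state, not a crash.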

One frequent trap: students catch and swallow errors because “I don’t want the task to crash.” A task that silently reports success on failure is the worst possible outcome — it looks healthy in the register, it has a fresh last run, and it is quietly producing nothing useful. Loud failures are a feature; make them louder, not quieter.

Cost is a rate, not a session CORE

You already set a cost ceiling per task in the register. Lesson 6.5 makes you compute the rate and check it against reality.

Per-run cost is the dollars a single firing of the task spent. Get this from your model provider’s billing page or the cost figure the task logs in its artifact header. For the tasks you built:

  • Morning brief: typically $0.03–$0.15 per run depending on inbox volume and model. Once per day.
  • Weekly report: typically $0.15–$0.60 per run depending on commit volume and topic scope. Once per week.
  • Research refresh: typically $0.05–$0.25 per run. Once per week, baseline + scheduled.
  • Watcher: typically $0.005–$0.03 per run (most runs produce nothing, so cost is the classification pass). Multiple runs per day.

Monthly cost is per-run × runs per month. Do the math for each task. Add them up. Compare to the total-automation ceiling you wrote in your Automation Posture statement in Lesson 6.1.
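The arithmetic is worth making explicit. The sketch below uses the midpoints of the per-run ranges quoted above and approximate firing counts (30 days, about 4.3 weeks per month); the watcher's six firings a day is an assumption, and your register's actuals will differ.

```python
# Monthly cost = per-run cost x runs per month, summed across the register.
# Per-run figures are midpoints of the ranges above, not measured actuals.
register = {
    "morning-brief":    {"per_run": 0.09,   "runs_per_month": 30},
    "weekly-report":    {"per_run": 0.375,  "runs_per_month": 4.3},
    "research-refresh": {"per_run": 0.15,   "runs_per_month": 4.3},
    "watcher":          {"per_run": 0.0175, "runs_per_month": 30 * 6},  # ~6 firings/day assumed
}

def monthly_total(register):
    return sum(t["per_run"] * t["runs_per_month"] for t in register.values())

total = monthly_total(register)
# Compare against the total-automation ceiling in your Automation Posture.
print(f"total = ${total:.2f}/month")
```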

These totals look small. The reason the course still puts a ceiling on every task is that students routinely add a second watcher, then a third, then start running the research refresh daily instead of weekly, then add a second morning brief scoped to a different topic — and three months later they are paying $60/month for automations they have stopped reading. The ceiling forces the math to happen before the bill arrives, not after.

The rate-vs-session difference matters. A $3 chat session is a decision the student made once and paid for once. A $3-a-month automation is a decision the student made once and will pay for in perpetuity until they retire it. $3 a month for ten years is $360 if the task runs that long. Automations age into real money if they are not retired when they stop earning. This is one of the main reasons the retirement ritual is compulsory, not optional.

The five retirement signals CORE

Every scheduled task eventually stops being worth running. There are five signals that a task has reached that point, each named and specific so you can recognize it without waiting to find out the hard way:

  1. Named event. The retirement trigger in the register fired. The court case was decided; the deadline passed; the project ended; the competition concluded. When a named event fires, retirement is automatic, not optional. This is the easy case.
  2. Stale last-success. Last run is current (scheduler is firing) but last success has gone stale past your own threshold. For briefs and reports this typically means you have stopped reading them. For watchers this is the alert-fatigue signal from Lesson 6.4. Retire or redesign; do not keep a task running into a void.
  3. Silent drift. The task still produces artifacts, but the artifacts no longer answer the question the task was built for. A morning brief whose “three things” have degenerated into three newsletters is drifting. A research refresh whose diff contract matches less and less of the real news-shape of the topic is drifting. Drift catches you at a review date if you look; it does not announce itself.
  4. Ceiling breach. The task’s monthly cost exceeded the ceiling and you cannot explain why. Either the underlying data volume grew (inbox tripled, commit history exploded) or the prompt grew, or the model was upgraded with a different price point. Pause, audit, resume — or retire.
  5. Human-attention bankruptcy. You have too many tasks. Not because any single task is wrong, but because the aggregate demand on your review time exceeds what you actually have. The rule of thumb: if your total automation suite demands more weekly review time than you wrote into your Automation Posture as your budget, it is over-built. Retire the lowest-leverage task.

The retirement ritual in the worksheet asks you to check every register entry against all five signals. Expect to retire at least one task during Lesson 6.5, even if everything is working. A module where nothing gets retired is a module where the discipline did not land.

Revoking schedule-bound credentials CORE

Module 5 installed the revocation ritual for email/calendar permissions: when a task is done, access is withdrawn, not left running. Module 6 extends the ritual to the specific case of scheduled tasks.

When you retire a scheduled task, three things happen, in order:

  1. Disable the scheduler entry. Remove the cron line, unload the launchd plist, disable the Windows Task Scheduler entry, or delete the Cowork scheduled task. Do this first — the task does not fire tomorrow.
  2. Revoke the credentials the task was using. This is the step most students skip. The OAuth scope the task held, the API key that was specific to the task, the MCP permission that was granted — each is withdrawn. If the credential is shared across tasks, reduce the scope to match what the remaining tasks actually need.
  3. Archive the artifacts and the log. Move the task’s folder to ~/ai-architect-academy/automation/retired/<task-name>-YYYY-MM-DD/. Leave the register row in place with status Retired <date> and a one-sentence post-mortem; do not delete the row. Future-you will want to remember what this task did and why it retired.

Students rebuild tasks they previously retired far more often than they expect. The archive path is small and makes restarting trivial. Deleting the task entirely is the anti-pattern.
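Step 3 of the ritual can be scripted; steps 1 and 2 stay manual because they are platform- and credential-specific. Below is a minimal sketch of the archive move, assuming the lesson's retired/<task-name>-YYYY-MM-DD layout; the `archive_task` name is hypothetical.

```python
import shutil
from datetime import date
from pathlib import Path

# Step 3 only: move the task folder under retired/, dated. The register row
# stays in place and is edited by hand to "Retired <date>" with a post-mortem.
def archive_task(task_dir, retired_root="~/ai-architect-academy/automation/retired"):
    task_dir = Path(task_dir).expanduser()
    dest = Path(retired_root).expanduser() / f"{task_dir.name}-{date.today():%Y-%m-%d}"
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(task_dir), str(dest))
    return dest
```

Because the move is a rename rather than a delete, restarting a rebuilt task later is a move back plus a re-enabled scheduler entry.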

The specific credential-revocation question — “does this task have a token I should kill?” — is a good instinct to bring into every retirement. For Cowork-tab scheduled tasks the credentials are session-scoped and retire with the task; for Optional Advanced (CLI + OS scheduler) tasks the credentials may live in ~/.ai-directed/credentials/ and outlast the cron entry unless you revoke them.

Stage 2 of 3

Try & Build

2 recipes + activity

Try it — Retirement ritual, design-and-run entry 5, and the capstone freeze CORE

Part A — Retirement ritual.

Use /resources/module-06/retirement-ritual.md.

  1. Open automation-register-v1-draft.md. Open the retirement-ritual worksheet.
  2. For each row (entries 1–4 from Lessons 6.2–6.4):
    • Compute the monthly cost. Compare to the ceiling. Flag any breach.
    • Check the five retirement signals. Decide keep, redesign, or retire.
    • Audit the idempotency key — can you articulate it in one sentence? Does the artifact path actually use it?
    • Re-read the retirement trigger column. Is it still specific, or has it gone vague?
  3. If you flagged retire for any row, run the retirement recipe above. Do not skip the credential-revocation step.
  4. If you flagged redesign, tighten the prompt, update the register, and note the change in the prompt-change log. Do not roll to a new register version yet — this is edit-in-place.

Part B — Design and run entry 5.

  1. Pick a task from the idea list above, or invent one. Confirm it passes audience = only you and the response test (if it’s a watcher-shaped task).
  2. Fill in the register row fully before you build. Including the idempotency key and the retirement trigger.
  3. Build it via whichever recipe path fits best.
  4. Run it on-demand. Then wait for the next scheduled interval and run it again (or force-fire a second time for a testable idempotency check). Confirm the second run overwrote or left-alone correctly.
  5. File register entry 5.
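The double-run check in step 4 hinges on the idempotency key naming the artifact path, so a second firing overwrites instead of duplicating. A minimal sketch, assuming a date-keyed task; `artifact_path` and `check_idempotent` are hypothetical names.

```python
from pathlib import Path

# Idempotency key = task name + run date, baked into the artifact filename.
def artifact_path(task, run_date, out_dir="artifacts"):
    return Path(out_dir) / f"{task}-{run_date}.md"

# Simulate two firings on the same day; idempotent means exactly one
# new file exists afterward, its content that of the second run.
def check_idempotent(task, run_date, out_dir="artifacts"):
    p = artifact_path(task, run_date, out_dir)
    p.parent.mkdir(parents=True, exist_ok=True)
    before = sorted(p.parent.glob(f"{task}-*.md"))
    p.write_text(f"# {task} {run_date}\n")             # first firing
    p.write_text(f"# {task} {run_date} (rerun)\n")     # forced second firing
    after = sorted(p.parent.glob(f"{task}-*.md"))
    return len(after) == len(before) + 1
```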

Part C — Freeze the capstone.

  1. Review the full register — six entries, every column populated, Automation Posture updated if your numbers changed, at least one retirement or one redesign documented.
  2. Rename automation-register-v1-draft.md to automation-register-v1.md and save it to /capstone/.
  3. Add a one-paragraph Module 6 reflection at the top of the frozen register. Three to five sentences: what surprised you, what you retired, what entry 5 is for, what you’ll do with this register after Module 6 ends.
  4. Update my-first-loop.md’s Automation Posture section if anything has shifted (you probably raised or lowered your per-task ceiling once you saw real numbers).

Deliverable. /capstone/automation-register-v1.md frozen, with six entries and the reflection paragraph. At least one retirement recorded. /capstone/automation-artifacts/ populated with representative output from every entry. my-first-loop.md updated. This is the sixth frozen capstone artifact in the course.

Done with the hands-on?

When the recipe steps and any activity above are complete, mark this stage to unlock the assessment, reflection, and project checkpoint.

Stage 3 of 3

Check & Reflect

key concepts, quiz, reflection, checkpoint, instructor note

Quick check CORE

Five questions. Answers and reasoning follow each one.

Q1. The Module 6 observability minimum for a scheduled task is:
  • A — A dashboard with real-time metrics.
  • B — A log line per run, an artifact header, and a printed next-run expectation.
  • C — Every run sends you a notification.
  • D — A weekly health report.

Answer: B. The three pieces combine into the minimum that lets you distinguish a healthy quiet week from a broken task. (A) is over-engineering. (C) reproduces alert fatigue in the observability layer. (D) is fine but not a substitute for the three per-task pieces.

Q2. Your morning brief task fails with a 429 rate-limit error on one run. What is the correct failure-mode-hierarchy response?
  • A — Exit silently and try again tomorrow.
  • B — Retry once after a short backoff. If it still fails, log a loud error and do not write over yesterday’s artifact.
  • C — Retry infinitely until it succeeds.
  • D — Disable the task permanently.

Answer: B. 429s are the textbook transient failure. One retry with backoff, then loud failure if the second attempt fails. (A) is the silent-failure anti-pattern. (C) turns a transient failure into an outage and may burn through your budget. (D) overreacts to a recoverable error.

Q3. You compute that your total monthly automation cost is $8.40. Your Automation Posture ceiling is $10. Which of the following is the Module 6 response?
  • A — You are fine; keep everything as-is.
  • B — You are under ceiling but close. Audit the two tasks with the highest per-run cost, confirm they are still earning their keep, and consider whether you could scale one back (e.g., refresh from weekly to biweekly) if you are about to add entry 5 and entry 6 over the next months.
  • C — You are over ceiling; retire the most expensive task.
  • D — Raise the ceiling to $15 and keep going.

Answer: B. (A) is “trust the number,” the stale-cost anti-pattern — a healthy number today can drift to an over-ceiling number next month as data volume grows. (C) is false — you are under ceiling, not over. (D) is the “kick the can” anti-pattern that turns budgets into suggestions. The right posture is to notice headroom is tighter than last month and choose deliberately.

Q4. Which of the following is not one of the five retirement signals?
  • A — Named event (the task’s retirement trigger fired).
  • B — Stale last-success (the task is running but the student has stopped reading it).
  • C — Silent drift (the task produces artifacts that no longer answer the original question).
  • D — The task is more than six months old.

Answer: D. Age alone is not a retirement signal. An automation that has worked well for two years and still earns its keep is a good automation, not an obsolete one. The five real signals are named event, stale last-success, silent drift, ceiling breach, and human-attention bankruptcy.

Q5. When you retire a scheduled task, the three steps are:
  • A — Delete the cron entry; delete the folder; forget it existed.
  • B — Disable the scheduler entry; revoke the credentials the task used; archive the folder and mark the register row Retired with a post-mortem.
  • C — Ask the agent to retire itself.
  • D — Rename the task to retired and leave it running.

Answer: B. The three-step ritual is disable–revoke–archive. (A) skips the credential revocation (the step most students skip and most regret). (C) is not a real operation — the agent does not own the retirement. (D) leaves the task running, which is the opposite of retirement.

Reflection

In 8–10 sentences: Walk through your register as it stands right now — six entries, including your student-designed task. For each entry, name the one specific thing that would make you retire it. Then rank the entries by how long you expect each to last before it retires. Which is shortest-lived? Which is longest-lived? Now — honestly — predict which of the six entries you will actually still be running six months from now, and which you will have quietly let lapse without going through the retirement ritual. What does that prediction tell you about the automations you built?

The honest prediction is the important part. Most automations do not get retired cleanly; they just stop being useful and the student moves on. Naming, now, which ones you expect to fade is the cheapest way to catch them when they do.

Project checkpoint — capstone freeze

Freeze /capstone/automation-register-v1.md. The frozen file has:

  • A reflection paragraph at the top (3–5 sentences).
  • Six entries (0–5, where 0 is the empty register from Lesson 6.1’s baseline set-up; 1 = morning brief; 2 = weekly report; 3 = research refresh; 4 = watcher; 5 = student-designed).
  • At least one retirement recorded (status Retired <date> with a one-sentence post-mortem), even if the retirement was a task you built specifically for this lesson.
  • Every column populated. Idempotency keys named. Cost ceilings and actuals computed. Next-review dates inside 90 days.

Alongside the frozen register, /capstone/automation-artifacts/ contains representative outputs from each of the five live tasks. my-first-loop.md has the up-to-date Automation Posture section.

Capstone artifact 6 — automation-register-v1.md

Top: Module 6 reflection (3–5 sentences)

Row 0: baseline empty register (Lesson 6.1)

Row 1: morning brief (Lesson 6.2)

Row 2: weekly report (Lesson 6.3)

Row 3: research refresh (Lesson 6.3)

Row 4: watcher (Lesson 6.4)

Row 5: student-designed task (Lesson 6.5)

At least one row with status Retired <date> + one-sentence post-mortem.

This is the sixth frozen capstone artifact. Module 7 begins when the artifact is frozen.

Next in Module 6

End-of-module check.

A 15-point check that maps cleanly to the parent rubric: six multiple-choice items, two short-answer, two applied prompts that ask you to walk through the five moves and the retirement ritual on a register row you wrote this module. You pass at 11.5/15 with full credit on at least one applied question.

Take the end-of-module check →