Module 6 · Lesson 6.4

Alerts and watchers.

A brief runs every day. A report runs every week. A watcher runs on a schedule but produces output only when a specific condition is met. Nineteen runs in twenty write a log line and exit. The twentieth taps you on the shoulder. Name the three failure modes, write a condition-and-threshold statement, and pass the response test before the watcher ever goes on the schedule.

Stage 1 of 3

Read & Understand

5 concept blocks

What a watcher is and what it is not CORE

A brief runs every day whether there is news or not. A report runs every week whether there is news or not. A watcher is different: a watcher runs on a schedule (say, every four hours), but it produces output only when a specific condition is met. Nineteen runs in twenty do nothing — a log line, maybe, but no artifact, no alert, nothing for you to read.

The shape makes watchers feel like an always-on background sensor. They are not always-on — they run on a discrete interval — but they behave, from your perspective, like a quiet listener that taps you on the shoulder only when something specific happens.

Examples of watchers the course recommends as realistic first builds:

  • An email from a specific person (a teacher, an employer, a college admissions office) arrives in your bounded label — fire once, write a one-line alert, flag the thread.
  • A keyword you are tracking (a specific court ruling name, a product announcement, a competition deadline) appears in the scheduled research refresh’s output — fire once.
  • A deadline in the Agent Access calendar crosses a threshold (72 hours away, 24 hours away) — fire once at each threshold.
  • A cost threshold you set — a daily or weekly ceiling on total automation spend — is crossed. (This is a meta-watcher that watches your own automations.)
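Each of these examples reduces to a boolean condition checked once per run. A minimal sketch of the four conditions — all names, addresses, and data shapes here are hypothetical illustrations, not from the course materials:

```python
# Hypothetical condition checks for the four example watchers.
# Each returns a truthy value only when the watcher should fire.

def sender_arrived(inbox, address="teacher@school.example"):
    # Fire if a message from the named address sits in the bounded label.
    return any(msg["from"] == address for msg in inbox)

def keyword_appeared(current_output, prior_output, keyword="ruling-name"):
    # Fire only when the keyword is new since the last refresh.
    return keyword in current_output and keyword not in prior_output

def deadline_crossed(hours_until_deadline, bands=(72, 24), already_fired=()):
    # Fire once per band as the deadline approaches; return newly crossed bands.
    return [b for b in bands
            if hours_until_deadline <= b and b not in already_fired]

def cost_ceiling_crossed(spend_this_week, ceiling=5.00):
    # Meta-watcher: fire when total automation spend crosses the ceiling.
    return spend_this_week > ceiling
```

The deadline check returns the bands crossed this run rather than a bare boolean, so the caller can record which thresholds have already fired and honor the fire-once-per-threshold rule.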

Two things watchers are not, and confusing them is the typical first-build mistake:

  • Watchers are not live, push-triggered notifications. A webhook or push API that fires the moment a new email arrives is a different architecture and not what Module 6 teaches. Module 6 watchers run on an interval — every four hours, every hour, twice a day — and are accepted to be a little late. The interval is the tradeoff; pick one that matches how fast you actually need to know.
  • Watchers are not alert clouds. A watcher that fires 30 times a day because its condition is loose is not a watcher; it is noise. Alert fatigue (Block 2) is the failure mode that kills most watchers that don’t last; Module 6 installs the rails to avoid it.

The three failure modes and the tradeoff between them CORE

Every watcher fails in three specific ways. Naming them tells you which one the watcher you’re about to build is most at risk of.

  • False positives. The watcher fires when nothing that warranted firing happened. A cousin emails; the watcher (scoped too loosely) fires because their last name matches part of a tracked keyword. You read the alert, find it’s nothing, and the watcher’s credibility erodes one drop.
  • False negatives. The watcher fails to fire when something that did warrant firing happened. The teacher emails from a different address than the one the watcher’s condition named; the watcher saw nothing; you missed the email.
  • Alert fatigue. The watcher fires accurately but too often, and you stop reading the alerts. This is the most common long-term failure mode of watchers, because it is not a bug — the watcher is doing what you told it to — and is only visible from your own behavior.

The tradeoff: you cannot minimize all three. Every real watcher lives somewhere on a triangle between tolerates false positives (alerts a lot, catches everything), tolerates false negatives (alerts rarely, misses some), and tolerates fatigue (alerts at the right level, but over time wears down). Module 6’s recommendation is to bias toward false negatives for most first builds — a watcher that misses three things in a year but never wastes your attention is more useful than one that fires daily, because your capacity to keep reading is the binding constraint.

This bias is contextual. For a college admissions email you cannot miss, bias the other way — tolerate more false positives so the real one is certain to be caught. The module teaches the bias as a conscious choice per watcher, not a universal default, and the worksheet /resources/module-06/watcher-design.md asks you to state the bias for your watcher in one sentence.

Condition, threshold, and the anatomy of a good watcher CORE

A watcher’s prompt has a different shape than a brief or report. The core of the prompt is the condition-and-threshold statement — a single paragraph that says exactly when to fire. A good condition-and-threshold statement has four parts:

  • The event. Concretely: “An email from <named sender> at <named address> lands in my agent-access/school label.” Not “a message from my teacher” — the teacher’s actual email address. Not “anything school-related” — a specific label. Specificity is the difference between a watcher and a mailing list.
  • The qualifier. The extra conditions the event must satisfy to fire. “Only if the subject line does not start with [newsletter].” “Only if the message is not a calendar invite.” “Only if the body is longer than two sentences (auto-acknowledgements do not count).”
  • The threshold. What changed meaningfully since the last run. For keyword watchers: the keyword appears in the new research-refresh output but did not appear in the prior one. For deadline watchers: the time-until-deadline has crossed one of the bands (72h, 24h). Without a threshold, a watcher can fire repeatedly on the same underlying event.
  • The non-event. What explicitly does not fire. “If the teacher’s auto-responder reply appears, do not fire.” “If the keyword is a substring of an unrelated phrase (‘patel’ appearing in an email about a student named Patel unrelated to the court case), do not fire.” Naming non-events in the prompt reduces false positives in exactly the places the agent would otherwise err toward yes.
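The four parts compose into a single fire decision: fire only when the event and qualifier hold, the threshold is newly crossed, and no non-event applies. A sketch under hypothetical names (the sender address, label, and message fields are illustrative, not from the worksheet):

```python
def should_fire(msg, seen_ids):
    # Event: the named sender at the named address, in the bounded label.
    event = (msg["from"] == "j.rivera@school.example"
             and msg["label"] == "agent-access/school")
    # Qualifier: extra conditions the event must satisfy to fire.
    qualifier = (not msg["subject"].startswith("[newsletter]")
                 and not msg["is_calendar_invite"])
    # Threshold: new since the last run (message-ID not seen before).
    threshold = msg["id"] not in seen_ids
    # Non-event: explicit do-not-fire cases beat everything else.
    non_event = msg.get("is_auto_responder", False)
    return event and qualifier and threshold and not non_event
```

Note the ordering in spirit, not in code: the non-event is checked last and wins outright, which is exactly the “err toward no” bias the prose recommends for the places the agent would otherwise err toward yes.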

The watcher’s quiet-run behavior is the other half of the prompt. On any run where the condition is not met, the watcher writes one line to its log file — date, run ID, “no trigger, [N sources checked]” — and exits. No artifact file. No draft. The log line is the evidence the watcher ran; the absence of an artifact is the evidence nothing fired. A watcher that writes nothing on quiet runs is indistinguishable from a watcher that is broken, which means after a week you have no idea which one you have.
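The quiet-run log line can be as simple as one formatted line per run. A sketch — the comma-separated shape here is illustrative; your worksheet may specify its own:

```python
from datetime import datetime, timezone

def quiet_run_line(task_name, sources_checked):
    # One line: ISO-8601 timestamp, task name, no-trigger marker, source count.
    ts = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return f"{ts}, {task_name}, no-trigger, {sources_checked} sources checked"

# Appending to the watcher's log on a quiet run -- no artifact is written:
# with open("log.txt", "a") as log:
#     log.write(quiet_run_line("teacher-email-watcher", 3) + "\n")
```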

The watcher’s firing behavior is: exactly one artifact file per unique event, named from a stable event ID (not the run timestamp — two runs that both detect the same event must not produce two artifacts). The event ID can be the email’s message-ID, the calendar event’s UID, or a content-hash of the triggering content. The worksheet details the options per watcher type.

The response test: what will you do when it fires? CORE

The test that decides whether a watcher is worth building — before you build it, not after — is: what specific action will you take when this watcher fires?

If you can name the action (“I’ll open the teacher’s thread within 30 minutes and draft a reply”), the watcher has a reason to exist. If the action is vague (“I’ll see what’s up”), the watcher is a curiosity, not an alert, and will become noise within three weeks. If there is no action — “I just want to know” — the watcher is worse than noise; it is a demand on your attention with no return.

The response test is the module’s answer to the watcher-as-hobby failure mode — students who build watchers because watchers are fun, not because the watcher earns its place. The test is strict on purpose: a watcher without a named action is retired before it is built.

One softer version of the test, for watchers on long-running topics: the action may be “add to a list I review on Fridays.” That is a real action — it routes the signal somewhere you actually will process it. But “I’ll read the alert” is not an action; reading is not routing.

Apply the test every time you design a watcher, and apply it again in Lesson 6.5’s retirement ritual. A watcher that fires four times in two months but produced no student action in any of those four firings has failed the response test retroactively. Retire it.

Watcher-specific audit: the “last fire” column CORE

The automation register has columns for last run and last success. Watchers need one more: last fire. A watcher’s “last run” is the most recent time the scheduler invoked the task; “last success” is the last time the watcher produced output you read; “last fire” is the last time the watcher’s condition was met.

The three columns separate three distinct failure modes:

  • Last run is old. The scheduler stopped firing the watcher. The task is broken; fix the scheduler.
  • Last run is current but last fire is old. The watcher is running but has not triggered in a long time. Either the watcher is well-scoped and the world has been quiet (fine), or the watcher has drifted into irrelevance (retirement candidate). Compare to the retirement trigger.
  • Last fire is current but last success is old. The watcher is firing, but you have not read the alerts. This is the alert fatigue signal. Either the alerts are too noisy or the response action is not happening. Either way, the watcher needs redesign or retirement before it fires again.

Only watchers need this third column, and only watchers need the extra scrutiny. Briefs and reports fire whether or not there is news; their “last run” and “last success” tell you what you need to know. Watchers’ asymmetric output — nothing most of the time, something occasionally — requires the third column to tell the three failure modes apart.
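The three-way diagnosis reads directly off a comparison of the columns. A sketch, with “old” reduced to a simple day threshold — the threshold value is a judgment call, not a module rule:

```python
def diagnose(days_since_run, days_since_fire, days_since_success,
             stale_after=14):
    # Last run is old: the scheduler stopped firing the watcher.
    if days_since_run > stale_after:
        return "broken scheduler -- fix the schedule"
    # Running but not triggering: quiet world or drift into irrelevance.
    if days_since_fire > stale_after:
        return "quiet or drifting -- compare against the retirement trigger"
    # Firing but unread: the alert-fatigue signal.
    if days_since_success > stale_after:
        return "alert fatigue -- redesign or retire before it fires again"
    return "healthy"
```

The check order matters: a stale last run masks the other two columns (a watcher that never runs cannot fire), so the scheduler check comes first.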

Stage 2 of 3

Try & Build

2 recipes + activity
Optional advanced — Conditional watcher via the Claude Code CLI + a local OS scheduler (scriptable) RECIPE

For terminal-comfortable students who want the watcher checked into a versioned repo, with prompt and scheduler config living together. The Cowork-tab path above is sufficient for this lesson; skip this section if scriptable automation isn't a goal.

Tool Claude Code CLI + cron / launchd / Task Scheduler
Last verified 2026-04-17
Next review 2026-07-17
Supported OSes macOS (launchd preferred), Linux (cron), Windows (Task Scheduler)

Full walkthrough in /recipe-book/conditional-watcher-recipe.md.

Set-up is identical in shape to the Cowork-tab path; the differences are:

  • ~/ai-architect-academy/automation/watchers/<watcher-name>/prompt.md holds the prompt; run.sh (or .ps1) invokes Claude Code and passes --event-id-strategy.
  • The script writes firing artifacts to watchers/<name>/fires/<event-id>.md and quiet-run log lines to watchers/<name>/log.txt.
  • The scheduler entry is a 0 */4 * * *-style cron line (every four hours; note that */4 in the minutes field would mean every four minutes) or the scheduler UI equivalent. Prefer launchd on macOS for survival across sleep cycles; Windows Task Scheduler has a “run whether user is logged on” option that matters here.
  • Both the synthetic-trigger test and the no-trigger test are the same; you force-fire the task once with a matching input and then verify a quiet run produces only a log line.

Troubleshooting. If the watcher writes artifacts on every run whether or not the condition is met, the quiet-run paragraph is either missing or being ignored. Re-state it as: “if the condition is not met, write exactly one line to log.txt of the form ISO-8601 timestamp, task name, no-trigger, N sources checked and exit 0 without creating any artifact file.”
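A quick way to confirm the quiet-run behavior took effect is to check that every line in log.txt matches the stated shape. A hedged sketch — the regex encodes the line format named in the troubleshooting paragraph above, not any official log specification:

```python
import re

# ISO-8601 timestamp, task name, "no-trigger", N sources checked --
# the quiet-run line shape the troubleshooting paragraph re-states.
QUIET_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:[+-]\d{2}:\d{2}|Z)?"
    r", [^,]+, no-trigger, \d+ sources checked$"
)

def check_log(lines):
    # Return the 1-based line numbers that do not match the quiet-run shape.
    return [i for i, line in enumerate(lines, 1)
            if not QUIET_LINE.match(line.strip())]
```

An empty return value means every run logged cleanly; any numbered lines are runs where the prompt’s quiet-run paragraph was missing or ignored.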

Try it — Signal-vs-noise drill + watcher first run RECIPE + CORE review

Part A — Signal-vs-noise drill.

Open the Signal-vs-noise drill in your browser. The drill presents 12 candidate watcher designs drawn from realistic student use cases — an admissions-office email, a court-case keyword, a research deadline, a cost-ceiling crossing, and so on. For each candidate you will:

  1. Rate it signal (worth building), adjustable (close but needs a sharper condition or response), or noise (retire before build).
  2. For adjustable candidates, sketch the sharper condition-and-threshold statement that would move it to signal.
  3. For noise candidates, name what made it fail the response test.

The drill reports back a summary: how your signal / adjustable / noise ratings compared to a reference rubric, and where your sharpening suggestions for adjustable items lined up or missed. Save or screenshot the summary into /capstone/automation-artifacts/watchers/.

Part B — Build one real watcher.

  1. Pick the watcher you want to actually run during the module. Pick one from your own life; do not borrow one from the drill.
  2. Fill in /resources/module-06/watcher-design.md in full.
  3. Build and launch the watcher in the Cowork tab (or the Optional Advanced CLI sidebar if you opened it). Run the synthetic trigger test. Run the no-trigger test.
  4. File register entry 4 with event, condition-and-threshold, quiet-run behavior, bias, and retirement trigger in their respective columns.

Deliverable. Register entry 4 complete. Watcher worksheet filed. Signal-vs-noise drill summary saved. At least one quiet-run log line and one firing artifact (from the synthetic trigger) present in the watcher’s folder.

Done with the hands-on?

When the recipe steps and any activity above are complete, mark this stage to unlock the assessment, reflection, and project checkpoint.

Stage 3 of 3

Check & Reflect

key concepts, quiz, reflection, checkpoint, instructor note

Quick check CORE

Five questions. Open each to see the answer and reasoning.

Q1. The difference between a watcher and a morning brief is:
  • A — A watcher is more expensive to run.
  • B — A morning brief runs every day whether or not there is news; a watcher runs on a schedule but produces output only when a specific condition is met.
  • C — A watcher uses a faster model.
  • D — There is no practical difference.
Show explanation

Answer: B. (A) and (C) are implementation details that may or may not be true. The defining difference is conditional output. A brief always produces an artifact; a watcher produces an artifact only when its condition is met, and logs a quiet-run line otherwise. The difference matters because it changes how you design, how you review, and what columns the register needs.

Q2. You build a watcher for emails from a specific teacher. Over three weeks it fires 22 times. You read the first three alerts and have not opened the rest. Which failure mode is this, and which column of the register surfaces it?
  • A — False positives; surfaced by a last run column that is stale.
  • B — False negatives; surfaced by comparing the teacher’s actual email count to the watcher’s fire count.
  • C — Alert fatigue; surfaced by last fire being current while last success (the last alert you actually read and acted on) has gone stale.
  • D — A broken scheduler; surfaced by last run.
Show explanation

Answer: C. Alert fatigue is the failure mode where the watcher is accurate but the student stops reading. The specific register signal is the gap between last fire (recent) and last success (stale). The fix is not to rewrite the condition — the watcher is firing correctly — but either to tighten the qualifier (fewer fires per real event) or to retire.

Q3. Which of the following is the correct quiet-run behavior for a watcher that ran but did not detect its condition?
  • A — Produce an artifact file named no-trigger-<timestamp>.md with body “nothing to report.”
  • B — Produce nothing at all — no artifact, no log — because nothing happened.
  • C — Write one log line to log.txt of the form “ISO-8601 timestamp, task name, no-trigger, N sources checked” and exit without creating any artifact file.
  • D — Retry three times before exiting.
Show explanation

Answer: C. (A) fills the watcher’s folder with noise that is as bad as false-positive alerts. (B) makes the watcher indistinguishable from a broken watcher — you cannot verify it ran. (D) is the wrong retry shape; a quiet run is not a failure to retry. The log line is the evidence of the run and the only output; no artifact distinguishes this from a firing run.

Q4. The response test for a watcher asks you, before you build it:
  • A — Will the watcher be expensive?
  • B — What specific action will I take when this fires, and is that action real (routing, drafting, opening a specific thread) rather than “I’ll see what’s up”?
  • C — Will the watcher use a cloud model or a local model?
  • D — What’s the retry policy?
Show explanation

Answer: B. The response test is the module’s retire-before-build filter. A watcher whose response is “I’ll read the alert” is a watcher that will generate attention demand and produce no behavioral benefit; it fails the test and is not built. (A), (C), (D) are real design decisions but not the response test.

Q5. Your register shows a watcher whose last run is yesterday, last fire is three months ago, and last success is also three months ago. What does this tell you?
  • A — The watcher is broken.
  • B — The watcher is working, and either the world has been quiet about the watcher’s topic (fine) or the watcher has drifted into irrelevance — check the retirement trigger and decide whether to retire.
  • C — The watcher has alert fatigue.
  • D — The watcher is suffering from false positives.
Show explanation

Answer: B. Last run current means the scheduler is firing. Last fire old means the condition has not been met. Last success equaling last fire means that on the rare runs the watcher did fire, you read and acted. This is a healthy watcher — or a watcher whose topic has gone quiet. The register cannot tell you which; the retirement trigger can. If the retirement trigger is “the court case is decided” and the case was decided two months ago, retire. If the trigger is still open, keep.

Reflection

In 6–8 sentences: Name the one watcher you built and the response action you committed to when it fires. Be specific — if the watcher fires at 2 PM Tuesday, what do you do between 2 PM and 3 PM Tuesday? Now imagine a version of you three months from now. When was the last time that watcher fired? What did three-months-from-now-you actually do when it fired — or did you read the alert and close it? Based on the answer, what would you change about the watcher’s condition-and-threshold statement today to make sure three-months-from-now-you still acts on it?

The trick here is to imagine the unflattering answer. Most watchers fade. Designing today for a version of you who reads fewer alerts is more realistic than designing for a version who reads every one.

Project checkpoint

Open automation-register-v1-draft.md and add register entry 4: Watcher — <your watcher’s name>.

Every column, populated. Note: the register’s last fire column is relevant here — it was not relevant for entries 1, 2, 3. Start it with the synthetic-trigger run you did today. The retirement trigger column should be the specific event that would make this watcher irrelevant (the case being decided, the deadline passing, the project concluding, the inbox scope being reorganized).

Entry 4 — Watcher — <name>

Event: [named sender + label, or named keyword, or named calendar threshold]

Qualifier: [auto-acknowledgements excluded / subject shape / length / non-calendar-invite / etc.]

Threshold: [new-since-last-run keyword diff / time-band crossing / content hash]

Non-event: [explicit do-not-fire cases]

Idempotency key: [message-ID / event UID / content-hash]

Quiet-run behavior: one log line, no artifact

Bias: [false-negatives / false-positives / balanced] — one-sentence reason.

Response action: [specific, named, routable — not “I’ll see what’s up”]

Schedule: every N hours, HH:MM–HH:MM

Last run / last fire / last success: today (synthetic) / today (synthetic) / today (synthetic)

Retirement trigger: [the specific event that would make this watcher irrelevant]

Save the watcher-design worksheet and the signal-vs-noise drill summary into /capstone/automation-artifacts/watchers/.

Do not proceed to Lesson 6.5 until entry 4 is complete, the synthetic trigger produced a real artifact, and at least one quiet-run log line exists.

Next in Module 6

Lesson 6.5 — Reliability, cost, and retirement.

Automations fail more quietly than they succeed. Health checks, cost ceilings, the five retirement signals, and the revocation ritual that kills a task cleanly — disable the scheduler, revoke the credentials, archive the folder. One last student-designed task. Then the register freezes into automation-register-v1.md and ships to the capstone.

Continue to Lesson 6.5 →