Module 3 · End-of-Module Check

Ten questions. Direct, review, and scope.

10 questions · Passing bar: 11.5 / 15, with full credit on at least one applied question

This is the integrative assessment for Module 3. It confirms you can direct a coding agent, read the diff it produces, and scope work larger than a single edit — not just recall that those steps exist. The multiple-choice and short-answer sections are closed-book and closed-AI. The applied section is open-workstation: keep your frozen directed-edit-log-v1.md, your diff-review checklist, your scoping doc, and your tripwire catalog in front of you.

How to take this check

  • Do it in one sitting. 50–65 minutes is enough.
  • For the multiple-choice and short-answer sections, close every AI tool and tab. This checks your internalized model, not the model’s.
  • For each multiple-choice item, pick an answer before you reveal the explanation. Guessing and then reading the answer is not the same as knowing.
  • For short-answer items, write your response on paper or in a text file before you reveal the model answer. Compare honestly.
  • For the applied section, open your directed-edit-log-v1.md, the diff-review checklist, your scoping-doc.md, and your tripwire-catalog-v1.md. Q9 and Q10 are open-workstation on purpose.
  • If you miss a question, the feedback names the lesson(s) to revisit.

Multiple choice CORE

Six questions. Concept recall and diagnosis. One point each.

Q1. The four moves of a directed edit are:
  • A Prompt, respond, copy, paste.
  • B Locate, plan, write, verify.
  • C Goal, model, tool, output.
  • D Read, run, review, merge.
Show explanation

Answer: B. The four moves are locate, plan, write, verify, with the director’s check laid over the top. Review Lesson 3.1 Block 2 if missed.

Q2. You ask an agent to fix a small bug. It returns a 60-line diff that touches three files you did not mention. The best next step is:
  • A Merge — more fixes are free.
  • B Read every line carefully and merge if each is individually correct.
  • C Ask the agent for a minimal diff — the smallest change that makes the test pass — before merging anything.
  • D Reject the whole session and switch tools.
Show explanation

Answer: C. Correctness of individual lines does not defend against unwanted scope. The minimal-diff ask is the first and cheapest move. Review Lesson 3.3 Block 3 (scope creep) if missed.

Q3. The four diff-review questions from Lesson 3.3 are:
  • A Does it compile, run, pass tests, and look nice?
  • B Right place, right scope, legitimate deletion, any surprise?
  • C Fast, small, correct, safe?
  • D Who wrote it, when, why, and for whom?
Show explanation

Answer: B. Every diff is four questions: right place, right scope, legitimate deletion, any surprise. Review Lesson 3.3 Block 2 if missed.

Q4. Which of the following is in the strong zone for AI coding agents, not a tripwire?
  • A “Refactor this module to be more elegant.”
  • B “Rename getUserData to fetchUserProfile across the repo and update all call sites and tests.”
  • C “Optimize my authentication flow to be more secure.”
  • D “Rewrite this to be better.”
Show explanation

Answer: B. A cross-file rename with test updates is a textbook mechanical-transformation-across-many-files task — the second of the four strong zones. (A), (C), and (D) each hit at least one tripwire: illegible goal, security-sensitive, or rewrite-over-addition. Review Lesson 3.4 Block 2.
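
If you do hand an agent a rename like (B), the director-side verify step is cheap: after the diff lands, search the whole repo for the old name and expect zero hits. Below is a minimal sketch of that check in Python; the identifier comes from this question, and the repo path and file-walking approach are illustrative rather than tied to any real codebase.

    from pathlib import Path

    OLD_NAME = "getUserData"  # the identifier that should no longer appear anywhere

    def find_stragglers(repo_root="."):
        """Return (path, line number, line) for every remaining mention of the old name."""
        hits = []
        for path in Path(repo_root).rglob("*"):
            if not path.is_file():
                continue
            try:
                text = path.read_text(errors="ignore")
            except OSError:
                continue  # unreadable file; skip it
            for lineno, line in enumerate(text.splitlines(), start=1):
                if OLD_NAME in line:
                    hits.append((path, lineno, line.strip()))
        return hits

    if __name__ == "__main__":
        for path, lineno, line in find_stragglers():
            print(f"{path}:{lineno}: {line}")

An empty result is the verify signal; any hit means the mechanical transformation missed a call site and the diff goes back for another pass.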

Q5. “Session drift” (Tripwire 8) most commonly shows up as:
  • A A provider outage partway through a session.
  • B The agent gradually proposing larger, sloppier changes over a multi-hour session, and the director gradually approving quicker.
  • C A slow internet connection.
  • D A bug that appears only on Fridays.
Show explanation

Answer: B. Drift is the slow kind — both sides of the conversation degrade together. The counter-move is to start fresh sessions on purpose, especially past the two-hour mark. Review Lesson 3.4 Block 3 if missed.

Q6. The scoping pattern recommended in Lesson 3.5 is:
  • A Plan, code, ship.
  • B Top-down, bottom-up, inside-out.
  • C Scaffold → path-through → polish.
  • D MVP → V1 → V2.
Show explanation

Answer: C. Each slice leaves the system in a working state, which is what makes per-slice verification possible. Review Lesson 3.5 Block 2 if missed.

Short answer CORE

Two questions, three to four sentences each. Up to 1.5 points each. Write your answer before revealing the model answer.

Q7. A classmate says: “Running the five-move review on every change is too slow. The agent is smart enough that I can skip the review on small changes.” Name three specific problems with this stance, referencing Module 3’s four failure modes and the moves that catch them.

Model answer.

  1. Silent-deletion risk is highest on small, “obvious” changes. Move 3 (asking the agent what it removed and why) is the move that catches silent deletions. Skip Move 3 because the change looks benign, and you’ve made the failure mode invisible by definition. Reviewer attention is lowest exactly when stakes feel lowest.
  2. Plausible-wrong failures pass tests. The whole point of failure mode 4 is that the change runs, the tests are green, and the result is still wrong for your actual goal. You can only catch this with Move 4 — running the result yourself on real input. “Tests pass” is an agent-side signal; Move 4 is the one signal that comes from you.
  3. Scope creep often hides as a small incidental change. Move 1 (checking the shape of the diff — file count, line count, anything unexpected) catches it without any reading. Skip Move 1 on “small” changes and you’ve given up the cheapest defense you have.

Also worth naming: the five-move habit only works if it is uniform — exceptions metastasize. “Skip on small” turns into “skip on most” within a few weeks, because the stakes always feel low until they don't.

Time-cost framing (bonus, not required): a small amount of review time per change vs. potentially much longer recovery from the bugs it catches. “Too slow” is the wrong denominator.

Scoring: Full credit (1.5) for naming at least three of {silent-deletion risk on small diffs, plausible-wrong failures pass tests, scope creep disguised as incidental, habit only works if uniform}. Partial credit (0.75) for only two of those reasons. No credit for “you just should” without specific failure modes named. Review Lesson 3.3 Block 5 (answering the common objections) if missed.

Q8. A student opened the Code tab and asked the agent to fix a bug. The agent proposed running rm -rf tests/ as part of the plan because the tests “are the problem.” What should the student do, in what order, and what course rule is at stake?

Model answer.

  1. Refuse the command. Destructive file operations are blocked by the Module 3 safety rule regardless of the stated reason.
  2. Ask the agent what it is actually trying to accomplish. The answer is usually some version of “regenerate the tests,” and the correct move for that is to edit the existing tests, not delete them.
  3. If the agent insists the tests need to be replaced, direct a more surgical change. Delete specific test functions, not the whole folder.

The rule at stake: “Never let an agent run a destructive command on your behalf without confirming the command first.” That is the Module 3 operational safety norm, and the second of the two rules frozen at the top of the directed-edit log. The paired rule — “You do not merge what you have not read” — is adjacent, because the same skim-in-trust failure mode is what lets a destructive command slip through.

Scoring: Full credit (1.5) requires the correct order (refuse first, then unblock) AND naming the Module 3 safety rule. Partial credit (0.75) for the correct order without the rule, or the rule without the order. No credit for “approve; the agent knows what it’s doing.” Review the Module 3 README safety norms and Lesson 3.2’s safety-default callouts if missed.

Applied CORE

Two questions, half a page each. Open-workstation — keep your frozen log, checklist, scoping doc, and tripwire catalog in front of you. Up to 3 points each.

Q9. Director's review under time pressure.

Below is a real change an agent made. You see the goal, the agent's plain-English summary, the shape of the diff, and the test output — the same signals you'd see in a real Code-tab session (or Claude Code CLI session, if that's where you work). Run the five-move review and decide: accept, revise, or reject. If revise, write the one-sentence revision prompt. Name any failure mode you identify.

Goal: Fix the crash in load_config when the config file is missing. The function should return an empty dict in that case so the caller can proceed with defaults.

The agent's summary of what it did:

“I added a check for whether the config file exists. If it doesn't, the function returns an empty dict so the caller can use defaults. I also added handling for malformed JSON files — those return an empty dict too and log a warning so you can spot bad config files.”

Shape of the diff: 1 file changed (config.py). 7 lines added. 3 lines removed.

Test output: 8 tests pass (1 new test for the missing-file case).

You don't need to read code. You have the agent's claim about what it did, the shape of the change, and the test output. Run the five moves on those signals.

Show model answer

Model answer. Decision: revise, not reject.

The agent's summary itself names something that wasn't asked for: handling of malformed JSON files. The goal was about missing files; malformed-JSON handling is scope creep — Lesson 3.3 failure mode 2. A director catches this with Move 2 of the five-move review (reading the agent's summary against the prompt). You don't need to read code to spot it — the agent told on itself in plain English. The added behavior is plausibly good, but it wasn't requested and should either be justified or split into a separate change with its own test. “Tests pass” doesn't defend against unrequested scope — the existing tests cover the missing-file case, not the malformed-JSON case.

Revision prompt (one-sentence): “Please give me a version that handles only the missing-file case — return {} when the file doesn't exist and otherwise leave the function unchanged. If you think malformed-JSON handling is also worth doing, propose it as a separate change with its own test.”
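
For reference, the minimal change being asked for might look something like the sketch below. This is a hypothetical load_config, not the actual config.py from the exercise, and it assumes the config is plain JSON read from a single path.

    import json
    import os

    def load_config(path):
        # Missing file: return an empty dict so the caller can proceed with defaults.
        # This is the only behavior the goal asked for.
        if not os.path.exists(path):
            return {}
        with open(path) as f:
            # Malformed JSON still raises here: deliberately left out of scope,
            # to be proposed (if at all) as a separate change with its own test.
            return json.load(f)

The malformed-JSON handling the agent volunteered would wrap that json.load call in a try/except. It may well be worth having, but it deserves its own prompt, its own test, and its own review.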

Scoring: Full credit (3) requires: (i) identifying the scope-creep failure mode by name, (ii) noting that the agent's own summary surfaced the scope creep (so this is a Move 2 catch, not a code-reading catch), (iii) deciding revise rather than reject or accept, and (iv) writing a revision prompt that asks for the minimal missing-file change and defers the malformed-JSON handling to a separate change. Partial credit (1.5) for recognizing scope creep without offering a revision path. No credit for “accept, tests pass” — the pass signal does not cover the unrequested behavior, which is exactly the lesson being tested. Remediation: redo the Diff Review Trainer activity and redo Entry 2 of your capstone log with a fresh, deliberately hard review.

Q10. Scope a real feature.

Pick a real feature you would actually add to a real codebase you have access to — your capstone tidy-starter, a personal project, or a homework codebase. In the space below, produce a scoping doc with the five required sections:

  1. Goal (one sentence, verifiable).
  2. Surface area (three or four bullets naming the files/components this touches).
  3. Slices (three to six, each with a verifiable end state; follow scaffold → path-through → polish unless you can defend a different order).
  4. Verify signals (one per slice, concrete not vague).
  5. Out-of-scope list (at least two items that are genuinely adjacent to the feature).

Target 200–400 words for the scoping doc. Then, in one additional paragraph, predict: which two of the eight tripwires are you most at risk of on this feature, and what will you do differently to neutralize them?

Show model answer

Model answer. Because the feature is yours to pick, there is no single reference answer; grade your scoping doc against the rubric below.

Scoring: Full credit (3) requires:

  • All five scoping-doc sections present and coherent.
  • Slices follow scaffold → path-through → polish (or a defensible alternative) and each has a verifiable end state.
  • Out-of-scope list contains at least two items that are genuinely adjacent — not obviously-unrelated filler.
  • Verify signals are concrete (“I can type apple and see filtered results”), not vague (“it works”).
  • The tripwire-prediction paragraph names two distinct tripwires with a specific counter-move for each.

Partial credit (1.5) for missing one of the five sections, vague verify signals, or a tripwire paragraph that lists tripwires without counter-moves. No credit for a scoping doc that is generic enough to apply to any feature — the question requires specificity, because the point of scoping is that it is specific. Remediation: redo Lesson 3.5’s scoping-doc step with the agent drafting, and compare your revised plan against the first draft.


Parent / instructor scoring summary

Total: 15 points across 10 questions.

  • Multiple choice (Q1–Q6): 1 point each — 6 points.
  • Short answer (Q7–Q8): up to 1.5 points each — 3 points.
  • Applied (Q9–Q10): up to 3 points each — 6 points.

Passing bar: 11.5 of 15 or better, with at least one applied question at full credit. A miss on Q9 sends the student back to Lesson 3.3 and the Diff Review Trainer; a miss on Q10 sends them back to Lesson 3.5 before Module 4.

Weighting suggestions for parents issuing credit:

  • Multiple choice (Q1–Q6): 40% of Module 3 score.
  • Short answer (Q7–Q8): 20%.
  • Applied (Q9–Q10): 40%. Q9 and Q10 are the load-bearing items — they demonstrate applied judgment on a real diff and real planning, not recall.

Evidence to file in the student’s credit portfolio for Module 3:

  1. This completed check (all ten answers written out).
  2. The frozen directed-edit-log-v1.md in the capstone folder (two safety norms, zone map, three entries).
  3. The scoping-doc.md from Lesson 3.5.
  4. The tripwire-catalog-v1.md from Lesson 3.4.
  5. At least one completed per-lesson quiz and reflection from Lessons 3.1–3.5.
  6. A short (2–3 sentence) instructor note on Q9 and Q10 — which failure mode the student identified in Q9, and which two tripwires they flagged in Q10.

Remediation if missed:

  • Q1: Re-read Lesson 3.1 Block 2 and redo the Anatomy of a directed edit activity.
  • Q2: Re-read Lesson 3.3 Block 3 and run the Diff Review Trainer activity a second time.
  • Q3: Re-read Lesson 3.3 Block 2; print the diff-review checklist and tape it next to your editor if you haven’t.
  • Q4: Re-read Lesson 3.4 Block 2 (strong zones) and Block 3 (tripwires).
  • Q5: Re-read Lesson 3.4 tripwire 8 and Lesson 3.5 Block 3 (reset between slices).
  • Q6: Re-read Lesson 3.5 Block 2 and walk through the --dry-run worked example.
  • Q7: Re-read Lesson 3.3 Block 5 (answering the common objections).
  • Q8: Re-read the Module 3 README safety norms and Lesson 3.2's safety-default callouts in both the Code-tab Bug 1 and Bug 2 recipes (and the optional CLI sidebar if you did it).
  • Q9: The diff-review skill is the load-bearing habit of Module 3. Redo the Diff Review Trainer activity and redo Entry 2 of your capstone log with a fresh, deliberately hard review.
  • Q10: Scoping is the skill Module 3 builds toward. If this is weak, redo Lesson 3.5’s scoping-doc step with the agent drafting, and compare your revised plan against the first draft.

If the student passes at 11.5 / 15 or above with at least one applied question at full credit, Module 3 is complete and Module 4 can begin. Below that bar, target remediation to the specific lesson(s) listed above before moving on.

Next up

Module 4 — Research agents.

The directing muscle transfers. Module 4 picks up the same loop — locate, plan, write, verify — on work that is not code: structured research, source triangulation, fact-checking, synthesis, shipping research outputs.

Open Module 4 →

Module 4 opens when the Module 3 portfolio is complete — this check, the frozen log, the scoping doc, and the tripwire catalog.