How to take this check
- Do it in one sitting. 45–60 minutes is enough.
- Close every AI tool and tab. This checks your own internalized model, not the model’s.
- For each multiple-choice item, pick an answer before you reveal the explanation. Guessing and then reading the answer is not the same as knowing.
- For short-answer and applied items, write your answer on paper or in a text file before you reveal the model answer. Compare honestly.
- If you miss a question, the feedback names the lesson(s) to revisit.
Multiple choice CORE
Six questions. Concept recall and diagnosis.
Answer key:
- Q1. Answer: B. Review Lesson 1.1 Block 2 if missed.
- Q2. Answer: C. Review Lesson 1.2 Block 2 if missed.
- Q3. Answer: B. Review Lesson 1.2 Block 3 if missed.
- Q4. Answer: B. Review Lesson 1.3 Block 2 if missed.
- Q5. Answer: B. Review Lesson 1.3 Block 4 and Lesson 1.5 Block 3 if missed.
- Q6. Answer: C. Review Lesson 1.4 Block 2 if missed.
Short answer CORE
Two questions, two to four sentences each. Write your answer before revealing the model answer.
Q7. In your own words, explain why asking a model to “think step by step” often improves its answer on a hard problem. Mention next-token prediction in your answer.
Model answer (for instructor / parent): Each step in a step-by-step solution is itself a common pattern the model has seen many times in training. When you ask the model to produce intermediate steps, you are breaking a hard next-token prediction into a sequence of easier ones, each routed through denser training territory. The model’s own earlier steps become context for its later steps. The mechanism is not that the model is “thinking harder”; it is that smaller predictions are more reliable, and the model can use its own intermediate output as high-quality context.
Scoring: Full credit if the answer mentions (1) intermediate steps as easier predictions and (2) the model using its own earlier output as context. Half credit if only one. Review Lesson 1.2 Block 4 if missed.
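For instructors who want to make the Q7 mechanism concrete, a minimal sketch contrasting the two prompts is below. It assumes the `anthropic` Python package with an API key in the `ANTHROPIC_API_KEY` environment variable; the model name and the example question are illustrative, and nothing here is required to grade the answer.

```python
# A minimal sketch, assuming the anthropic Python package is installed and
# ANTHROPIC_API_KEY is set. The model name is illustrative.
import anthropic

client = anthropic.Anthropic()

def ask(prompt: str) -> str:
    """Send a single-turn prompt and return the model's text reply."""
    response = client.messages.create(
        model="claude-sonnet-4-5",  # assumption: any current Claude model works here
        max_tokens=500,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text

question = ("A train leaves at 3:40 pm and the trip takes 2 hours 35 minutes. "
            "When does it arrive?")

# One hard prediction: the model must jump straight to the final answer.
direct = ask(question)

# Many easier predictions: each intermediate step it writes becomes context
# for the steps that follow.
stepwise = ask(question + "\n\nThink step by step, showing each intermediate "
                          "calculation, then state the final answer on its own line.")
```

The second prompt does not make the model try harder; it only routes the answer through a chain of smaller, more familiar predictions, which is exactly the mechanism the model answer describes.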
Q8. Name two of the four “forgetting modes” an agent can exhibit, and give the directing move that fixes each.
Model answer (any two of):
- Context-window truncation — fix: shorten up-front context, use a larger-window model, or have the agent write a running summary to a scratchpad file.
- New-conversation reset — fix: put needed information in external memory and direct the agent to read it at the start.
- Tool result never added to state — fix: inspect the trace, find the missing result, and rework the task so tool results are small enough to fit or are summarized before being added.
- External state not consulted — fix: be explicit in the task instruction (“first read notes.md, then proceed”) or build a habit into the agent’s standing instructions.
Scoring: Full credit if two are correctly named with a matching fix. Half credit for one. Review Lesson 1.4 Block 4 if missed.
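For the first two forgetting modes, the fix amounts to a scratchpad the agent reads at the start of a task and appends to as it works. A minimal sketch, assuming the agent has plain file read/write tools and that `notes.md` is the scratchpad named in the model answer:

```python
# Minimal scratchpad sketch: external memory that survives context-window
# truncation and new-conversation resets. Assumes the agent has file tools;
# notes.md matches the filename used in the model answer above.
from pathlib import Path

NOTES = Path("notes.md")

def remember(fact: str) -> None:
    """Append one short line; store summaries, not raw tool output."""
    with NOTES.open("a", encoding="utf-8") as f:
        f.write(f"- {fact}\n")

def recall() -> str:
    """Read everything back. The directing move is to start every task with
    'First read notes.md, then proceed.'"""
    return NOTES.read_text(encoding="utf-8") if NOTES.exists() else ""

# Example turn: summarize a large or failed tool result before it is forgotten.
remember("Search tool timed out once; retry before writing any output files.")
print(recall())
```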
Applied CORE
Two questions, half a page each. Write your answer before revealing the model answer.
Q9. A student gave an agent this task: “Find a summer reading list recommended by the American Library Association and save it to reading-list.md.” The agent’s trace shows:
- Turn 1. Reasoning: “I’ll search for ALA summer reading list.” Tool call: web search for “ALA summer reading list.” Result: Error: network timeout.
- Turn 2. Reasoning: “Here is a list of books.” Tool call: write a file called reading-list.md with eight books listed as “Recommended by ALA” (all real titles, but none actually ALA-endorsed). Result: file written.
- Turn 3. Reasoning: “Task complete.” Final answer: “Saved the ALA summer reading list to reading-list.md.”
In 5–8 sentences, (a) name the specific failure pattern you see, using the vocabulary from this module, (b) identify which of the four director questions would most prevent it next time, and (c) propose one concrete change to the task or the tool surface that would catch this failure the next time around.
Model answer (for instructor / parent):
(a) This is a classic ignored tool result combined with hallucination. The tool failed (network timeout on Turn 1), but the model continued to Turn 2 and produced fluent output as if the search had succeeded — completing the pattern “here is the ALA reading list.” The confident tone on Turn 2 is Lesson 1.2’s confidence-≠-correctness mechanism playing out inside the loop.
(b) The most directly relevant director question is “what tools does the agent need, and does it have them?” — specifically, whether the tool surface handles tool failures. But the loop-framing question is also relevant: the task did not require the agent to cite the source of its list, so the agent had no constraint against inventing one.
(c) Concrete change: rewrite the task as “Find a summer reading list recommended by the ALA. Cite the exact ALA URL it came from in the output file. If the search tool fails, stop and report the failure instead of writing the file.” This forces grounding (citation) and specifies an error-handling path, either of which would have caught the silent failure.
Scoring: Full credit if all three parts are addressed with correct vocabulary. Partial credit if the student identifies the failure but cannot name it in the module’s vocabulary. Review Lessons 1.2, 1.3, and 1.5 if missed.
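If the student wants to see the part (c) fix as code rather than as task wording, a minimal sketch is below. The `ToolError` and `web_search` names are hypothetical placeholders, not a real framework’s API; the point is only that a failed tool call ends the run with a report instead of flowing into the next turn.

```python
# Sketch of the error-handling path from part (c). ToolError and web_search
# are hypothetical placeholders; a real agent loop would route this through
# its own tool interface.

class ToolError(Exception):
    """Raised when a tool call fails (e.g. a network timeout)."""

def web_search(query: str) -> str:
    raise ToolError("network timeout")  # reproduces Turn 1 of the trace

def run_task(query: str) -> str:
    try:
        results = web_search(query)
    except ToolError as err:
        # The failure is surfaced instead of silently dropped, so the model
        # never gets to "complete the pattern" with invented books.
        return f"Stopped: search failed ({err}). No file was written."
    # Only reached when the search succeeded and a source URL can be cited.
    return ("Search succeeded; cite the source before writing reading-list.md. "
            f"First result: {results[:80]}")

print(run_task("ALA summer reading list"))
```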
Q10. Re-read the one-page loop design you wrote in Lesson 1.5 (my-first-loop.md). In half a page, answer:
- Which part of your loop design is the most fragile — most likely to be wrong or break — and why.
- Which of the four zones (model / loop / tools / state) is that fragility in.
- What is the cheapest test you could run before Module 5 to see whether your assumption holds.
Scoring: There is no single right answer. Full credit if the student (i) identifies a specific fragility, not a vague one; (ii) classifies it in the correct zone from Lesson 1.5; and (iii) proposes a test that is both concrete and genuinely cheaper than building the whole system.
Example of a good fragility: “I am assuming my school’s email system will let the agent read messages, but I have not checked. This is a tools-zone fragility. The cheapest test is to log into my school email in a browser and see whether it exposes an API or at least IMAP access.”
Example of a weak fragility: “The AI might hallucinate.” (Too vague; send student back to revise.)
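To illustrate how small a “cheapest test” can be, the good-fragility example above could be probed with a few lines of Python before Module 5. The host name below is hypothetical; the script only asks whether an IMAP endpoint answers at all, which is far cheaper than building the whole system.

```python
# Smallest possible probe for the email-access assumption. The host name is a
# placeholder; replace it with the school's real mail server if known.
import imaplib

HOST = "mail.example-school.edu"  # hypothetical host

try:
    conn = imaplib.IMAP4_SSL(HOST, 993)
    print("IMAP endpoint answered:", conn.welcome)
    conn.logout()
except OSError as err:
    print("No IMAP access from this network:", err)
```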
Parent / instructor scoring summary
Total: 10 questions.
Passing bar: 8 of 10 correct, with at least one applied question at full credit.
Weighting suggestions for parents issuing credit:
- Multiple choice (Q1–6): 40% of Module 1 score (roughly 6.7% each).
- Short answer (Q7–8): 30% (15% each).
- Applied (Q9–Q10): 30% (15% each).
Evidence to file in the student’s credit portfolio for Module 1:
- This completed check.
- The student’s my-first-loop.md from Lesson 1.5.
- At least one completed per-lesson quiz and reflection (all five lessons have these).
- A short (2–3 sentence) instructor note on the applied questions — which fragility the student named in Q10, and what test they proposed.
If the student passes at 8 of 10 or above, Module 1 is complete and Module 2 can begin. If below 8, target remediation to the specific lesson(s) listed in the missed questions’ feedback before moving on.
Next up
Module 2 — Your AI workstation.
Stand up the actual machine: install the Claude desktop app and sign in with Anthropic Pro, set up an optional local LLM on your hardware, add dev tools, and choose safe defaults for privacy and cost. This is the first build-heavy module.
Module 2 opens when Module 1 is complete in your portfolio.