How to take this check
- Do it in one sitting.
- For the multiple-choice and short-answer sections, close every AI tool and tab. This checks your internalized model, not the model’s.
- For each multiple-choice item, pick an answer before you reveal the explanation. Guessing and then reading the answer is not the same as knowing.
- For short-answer items, write your response on paper or in a text file before you reveal the model answer. Compare honestly.
- For the applied section, open your frozen /capstone/extension-register-v1.md, the hygiene ritual worksheet, your custom-skill-v1/SKILL.md and CHANGELOG.md, and your custom-plugin-v1/SECURITY.md. Q9 and Q10 are open-workstation on purpose.
- If you miss a question, the feedback names the lesson(s) to revisit.
Multiple choice CORE
Six questions. Concept recall and diagnosis. One point each.
Q1. Answer: B. Review Lesson 7.1 Block 2 if missed. Both persist across sessions, so (A) is wrong. (C) and (D) are implementation fictions. The distinguishing property is what the artifact is for — memory tunes conversational style; a skill adds a discrete named capability.
Q2. Answer: C. Review Lesson 7.3 Block 2 if missed. The skill is firing on cases the student did not intend because the description’s scope is wider than the procedure justifies. The fix is an exclusion block: “Do not use for single-article reads, casual summaries, or when the user has already named a single source.” (B) is the wrong diagnosis — overfit would under-fire. (A) and (D) are body-level issues, not trigger issues.
Q3. Answer: B. Review Lesson 7.2 Block 4 if missed. The plugin is not clean-audit (send capability exists) and not decline-required (a drafts-only mode is available). Option B is the textbook case. Option E is reserved for extensions that cannot be narrowed and are being installed anyway with a specific, time-bound reason — not the situation here.
Q4. Answer: B. Review Lesson 7.5 Block 3 if missed. Use-has-stopped is the first retirement signal. 74 days is past the threshold. The next review date being 20 days out does not change the signal — the signal was already lit. The right move is retire (unless a seasonal tag explains the gap, which the prompt rules out). (A) misreads the review-date semantics; (C) and (D) are reflexive.
Q5. Answer: B. Review Lesson 7.4 Block 3 if missed. S6 makes the convenience-vs-control tradeoff explicit; S7 makes update-time permission changes auditable. (A), (C), and (D) are non-answers — legitimate concerns, not part of the module’s questionnaire.
Q6. Answer: C. Review Lesson 7.5 Block 3 if missed. The five signals are use-stopped, tool-changed, description-no-longer-matches, permission-widened, better-option-matured. An open issue is a maintenance signal but is not by itself a retirement trigger; retire when the issue has actual impact on the work you use the extension for.
Short answer CORE
Two questions, 3–4 sentences each. Up to 2.5 points each. Write your response before revealing the rubric.
Q7 rubric (5 sub-points, up to 2.5 points total):
- (0.5) Names that the agent reads descriptions — not bodies — to decide which skill to fire.
- (0.5) Explains that the description functions as the input to a classifier: “does this job belong to this skill?”
- (0.5) Names the specific failure mode of vagueness — the skill never fires, rather than firing wrongly — so the student does not notice the problem (no visible error); they just do not get the lift they expected.
- (0.5) Contrasts against loud failure modes like a body-level bug (which the student would see in the output).
- (0.5) Correctly identifies the implication: the description deserves the same tuning attention as the body.
A passing short-answer (3–4 sentences) hits at least four of the five bullets.
Model answer. “When I ask the agent for something, it does not inspect every installed skill’s body to decide what to do. It reads the short description field and uses that as the input to a classifier: ‘does this request belong to this skill?’ If the description is vague, the classifier never matches — the skill silently does not fire, and I just get a generic chat response. That is worse than a loud bug, because there is no error to debug; I would conclude ‘skills are mysterious’ and stop building. The implication is that the description deserves as much careful tuning as the body. Both are prompts, for two different decisions.”
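The classifier framing lends itself to a toy sketch. The code below is not the agent’s real routing logic, and the keyword-overlap matcher is a deliberately naive stand-in for the model’s semantic judgment, but it reproduces the failure shape the answer describes: a vague description never matches, so the skill silently never fires.

```python
import string

# Toy sketch only -- not the agent's real routing. The point is the shape
# of the decision: the agent reads each skill's short description, never
# its body, to decide whether the skill fires.

_PUNCT = str.maketrans("", "", string.punctuation)

def matches(request, description):
    """Naive keyword-overlap stand-in for the agent's semantic match."""
    req = set(request.lower().translate(_PUNCT).split())
    desc = set(description.lower().translate(_PUNCT).split())
    return len(req & desc) >= 3  # arbitrary toy threshold

def route(request, skills):
    """skills maps name -> description. Only descriptions are consulted."""
    for name, description in skills.items():
        if matches(request, description):
            return name  # skill fires
    return None          # silent miss: the user gets a generic chat reply

request = "do a research sweep on carbon capture"

# Vague description: never matches, raises no error, leaves nothing to debug.
print(route(request, {"research-sweep": "Helps with research."}))
# -> None

# Tuned description: carries enough signal for the classifier to fire.
print(route(request, {
    "research-sweep": "Runs a multi-source research sweep on a named topic.",
}))
# -> research-sweep
```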
Remediation: re-read Lesson 7.3 Block 2 and re-run the description-tuning drill.
Q8 rubric (5 sub-points, up to 2.5 points total):
- (0.5) Grants mail:read_label (narrowest scope that does the job — read a specific label, nothing more).
- (0.5) Explains least-privilege reasoning: the plugin does not need wider read access, so do not request it.
- (0.5) Notes that draft capability is not needed — drafting happens by writing to a local file, not via the mail provider’s draft API — so mail:read+draft is excluded.
- (0.5) Names a change-condition that would flip the scope — e.g., “if the skill later needs to draft a reply that stays in the user’s Gmail drafts folder, mail:read+draft becomes correct; this would be a minor-version bump with a changelog entry and a fresh audit.”
- (0.5) References the plugin-level security discipline — the decision is documented in SECURITY.md and the register.
Model answer. “mail:read_label. The skill only needs the agent-access label, so a full-mailbox scope (mail:read_all) would be an unnecessary widening; mail:read+draft adds draft capability I do not use because the summary is written to a local file, not to Gmail drafts. This is least-privilege applied: narrowest scope that does the job. The answer would change if a future version of the skill started drafting replies that live in the mail provider’s drafts folder — then mail:read+draft becomes the correct scope, with a minor-version bump, a changelog entry, and a fresh pass through the S4 and S6 questions in SECURITY.md.”
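One way this decision could be recorded, sketched as a SECURITY.md excerpt. The S6 wording and layout here are illustrative, not the questionnaire’s exact text:

```markdown
## S6: convenience vs. control
Scope granted: mail:read_label (the agent-access label only).
Scopes rejected: mail:read_all (unnecessary widening); mail:read+draft
(drafting happens in a local file, not in the provider's drafts folder).
Change condition: if a future version drafts replies into the provider's
drafts folder, widen to mail:read+draft with a minor-version bump, a
changelog entry, and a fresh pass through S4 and S6.
```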
Remediation: re-read Lesson 7.4 Block 4 (least-privilege, applied) and Block 3 (questionnaire).
Applied CORE
Two questions, half a page each, up to 2.5 points each. Open-workstation: keep your frozen register, your custom skill’s CHANGELOG.md, and your custom plugin’s SECURITY.md open. Full credit requires the analysis be grounded in your own artifacts, not a generic response.
Q9 scoring rubric (5 sub-points, up to 2.5 points total):
- (0.5) Description quoted verbatim.
- (0.5) Names the first-draft failure mode and cites changelog evidence (specific test request that failed, with the reason).
- (0.5) Proposes a concrete edit (not a vague “make it better”).
- (0.5) Predicts the failure-mode risk the edit addresses (e.g., “reduces Mode 2 / overfit by adding an exclusion for casual summaries”).
- (0.5) Cites at least one specific trigger-test phrase from the changelog by name.
Full credit requires the analysis be grounded in the student’s own changelog, not a generic response.
Model answer (illustrative — your specifics must come from your own skill).
- Current description (verbatim): “Runs a multi-source research sweep on a named topic. Use when the user asks to triangulate sources, check citations, produce a sources.md, or build a competitive landscape. Inputs: a topic string. Outputs: a sources.md with three independent sources, one fabrication-risk callout, and a one-paragraph synthesis. Do not use for single-article summaries or casual curiosity lookups.”
- First-draft failure mode (vague): my Round 1 description was “Helps with research.” The changelog Round 1 test shows this failed to fire on “do a research sweep on carbon capture” — the agent produced a generic chat reply instead of invoking the skill, which is the silent-fail pattern of Mode 1 / vague. I confirmed this by seeing the skill was not named in the session trace at all.
- Proposed edit & predicted risk: add a sentence to the exclusion block: “Do not use when the user names a single author or source — prefer in-session prompting for single-source reads.” The Round 3 test “read the Stanford NLP paper and summarize it” fired the sweep when I wanted an in-session reply. This edit reduces Mode 2 / overfit risk by narrowing the exclusion to name the “single named source” shape explicitly, which is the surface my Round 3 test exposed.
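An edit like this should land in the changelog the same way Rounds 1–3 did. A hypothetical entry, with the heading and format illustrative:

```markdown
## Round 4 (description tuning)
- Added to the exclusion block: "Do not use when the user names a single
  author or source — prefer in-session prompting for single-source reads."
- Motivating evidence: the Round 3 trigger test "read the Stanford NLP
  paper and summarize it" fired the sweep instead of an in-session reply.
- Predicted effect: reduces Mode 2 / overfit risk on single-named-source
  requests.
```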
Remediation: a miss here sends the student back to Lesson 7.3. Run a fourth tuning round; add a changelog entry. This is a core module capability and must be demonstrated before Module 8.
Q10 scoring rubric (5 sub-points, up to 2.5 points total):
- (0.5) Row correctly identified and summarized (not a generic row).
- (0.5) Correctly cites which of the five ritual steps surfaced the issue.
- (0.5) Reasoning tips cleanly — the student names a concrete factor, not a vibe.
- (0.5) Honest uncertainty acknowledged if present (the rubric rewards honesty; pretending confidence you do not have loses a point).
- (0.5) 90-day prediction is specific and falsifiable (e.g., “if keep was right, use count should be > 5 and no collisions with newly authored skills; if replace was right, the new extension should have its own row with 2 real-use traces”).
A passing Q10 requires that the student demonstrate the ritual is a thinking tool, not a checklist.
Model answer (illustrative — your specifics must come from your own register).
- Row: research-sweep (authored skill). State: frozen v1.0, 6 invocations in the last 60 days, description tight after Round 3 tuning, permission surface unchanged.
- Ritual walkthrough: Step 1 (descriptions aloud) passed. Step 2 (permissions vs audit) passed. Step 3 (review dates) passed — row’s next review is 35 days out. Step 4 (collisions) surfaced the issue: my newly installed quick-summary plugin shares the trigger phrase “summarize this article” with my research-sweep skill’s exclusion block. The agent has picked the plugin correctly on 4 of 5 recent short-article requests, but the collision is not explicit — the routing works only because the exclusion block is doing its job, not because of an affirmative disambiguation.
- Decision: Refactor, not retire. The skill is still earning its keep, but the exclusion block should become a positive reference to the plugin by name — tightening the classifier input, not just ruling out adjacent cases. The one piece of information that would have made this easier: 30 more days of collision data — right now I only have 5 short-article requests in the window.
- 90-day prediction: if refactor was right, the next ritual pass should show (a) the 10+ short-article requests in that 90-day window all routed to quick-summary without the agent reaching for research-sweep; (b) the refactored description hitting a new Round 4 trigger test at 6/6; (c) no new collision surfaced with any additional plugin installed since. If instead I see research-sweep firing on short-article requests anyway, the refactor was the wrong call and I should replace rather than patch.
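For orientation, the row under discussion might look something like this in the register. The column set is illustrative; match your own register’s layout:

```markdown
| Extension      | Kind           | Version | Uses (60d) | Next review | Status   |
|----------------|----------------|---------|------------|-------------|----------|
| research-sweep | authored skill | v1.0    | 6          | 35 days out | refactor |
```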
Remediation: a miss here sends the student back to Lesson 7.5. Re-run the hygiene ritual on at least three rows with the worksheet open; produce a written ritual log alongside the register.
Parent / instructor scoring summary
Total: 15 points across 10 questions (16 raw points, capped at 15 on the effective scale).
- Multiple choice (Q1–Q6): 1 point each — 6 points.
- Short answer (Q7–Q8): up to 2.5 points each — 5 points.
- Applied (Q9–Q10): up to 2.5 points each — 5 points.
Passing bar: 11.5 of 15 or better, with at least one applied question at full credit. A miss on the applied section sends the student back to Lesson 7.3 (if the missed applied is Q9, description-tuning) or Lesson 7.5 (if the missed applied is Q10, hygiene-ritual) before Module 8.
Weighting suggestions for parents issuing credit:
- Multiple choice (Q1–Q6): 40% of Module 7 score.
- Short answer (Q7–Q8): 20%.
- Applied (Q9–Q10): 40%. Q9 and Q10 are the load-bearing items — they demonstrate applied judgment on the student’s own artifacts, not recall.
Evidence to file in the student’s credit portfolio for Module 7. This check alone is not sufficient evidence of Module 7 completion. The full Module 7 portfolio is:
- This completed check (all ten answers written out). File in /ops/credit-docs/module-07/.
- /capstone/extension-register-v1.md — 5–7 rows, all audited, every row with current next-review date and status column filled.
- /capstone/custom-skill-v1/ — SKILL.md, README.md, CHANGELOG.md (3+ entries from Rounds 1, 2, 3 of the tuning loop), traces/ (2 real-use runs).
- /capstone/custom-plugin-v1/ — manifest, skills/, README.md, SECURITY.md (all seven questionnaire items answered), CHANGELOG.md, traces/ (2 real-use runs), and a clean-uninstall confirmation.
- The Module 7 retrospective in my-first-loop.md — 300–500 words answering the four retrospective questions.
- The updated Extension Posture section in my-first-loop.md — with the pre-module version visible as historical record.
- A short (2–3 sentence) instructor note on Q9 and Q10 — which description-tuning failure mode the student diagnosed and which row they chose for the hardest hygiene decision.
Transcript language. If a parent is assembling a transcript, the transcript line for Module 7 can accurately say: “Designed, audited, authored, and retired AI extensions — installed and evaluated community plugins with an outbound decision matrix, built and tuned a custom skill across three trigger-test rounds, wrapped it in a custom Cowork plugin with a seven-question security disclosure, and maintained a frozen extension register using a five-step quarterly hygiene ritual.”
Remediation if missed:
- Q1: Re-read Lesson 7.1 Blocks 2–3. The six-place taxonomy is the navigation map for the whole module.
- Q2: Re-read Lesson 7.3 Block 2. Re-run the description-tuning drill in /activities/module-07/description-tuning-drill.html.
- Q3: Re-read Lesson 7.2 Block 4 (the outbound decision matrix). Re-audit one community plugin using the minimum-viable-audit worksheet.
- Q4: Re-read Lesson 7.5 Block 3. The use-has-stopped signal is the quietest and the strongest.
- Q5: Re-read Lesson 7.4 Block 3 (the seven-question security questionnaire). S6 and S7 are the plugin-specific additions.
- Q6: Re-read Lesson 7.5 Block 3. Print the five signals and keep them near your workstation.
- Q7: Re-read Lesson 7.3 Blocks 1–2. Description-as-classifier is the module’s headline technical insight; if it is not clear, nothing else downstream works.
- Q8: Re-read Lesson 7.4 Block 4 (least-privilege, applied). Re-audit your plugin’s SECURITY.md answers against your actual skill’s needs.
- Q9: The description-tuning muscle is the load-bearing authoring skill of Module 7. Return to Lesson 7.3. Run a fourth tuning round; add a changelog entry. This is a core module capability and must be demonstrated before Module 8.
- Q10: The hygiene ritual is the load-bearing discipline of Module 7. Return to Lesson 7.5. Re-run the ritual on at least three rows with the worksheet open; produce a written ritual log alongside the register.
If the student passes at 11.5 / 15 or above with at least one applied question at full credit, Module 7 is complete and Module 8 can begin. Below that bar, target remediation to the specific lesson(s) listed above before moving on.
Next up
Module 8 — Agent Orchestration.
The skills and plugin you shipped in Module 7 are the natural components Module 8 composes. Module 8 teaches you to design systems where multiple agents hand off work — sequential pipelines, parallel workers, hierarchical supervisors — and to debug the handoff failures that only show up when more than one agent is in the loop. A clean Module 7 register makes Module 8 possible; a messy one makes Module 8 miserable.
Module 8 opens when the Module 7 portfolio is complete — this check, the frozen register, the custom skill, the custom plugin, and the Module 7 retrospective.