Description-Tuning Drill
Module 7, Lesson 7.3 — the description is the classifier. Find out where five candidate descriptions fail (or don't).
How the drill works
- Five skill descriptions follow. Each is shown as it would appear in the skill's frontmatter — exactly the text the agent's classifier would read.
- Each description is paired with five test requests. For every request, predict whether the classifier should fire this skill on that request (YES) or should not (NO).
- After five predictions, pick the description's failure mode:
  - Vague: Description is under-specified. Fires on many requests that have nothing to do with the skill's actual job. High false-positive rate.
  - Overfit: Description is tied to a one-time, hyper-specific context (named class, named teacher, dated assignment). Fails to fire on valid adjacent requests. High false-negative rate.
  - Triggerless: Description has no descriptive clauses, so there is nothing the classifier can match against. Rarely fires, and the reasons are opaque.
  - Well-tuned: D+I+E shape. Descriptive about the job, inclusive of valid invocation phrases, exclusive of what it is not for. Low rate of both kinds of error.
- Submit to see which predictions matched the reference, the correct failure mode, and the one-sentence fix (or, if already well-tuned, the reason nothing needs to change).
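The false-positive and false-negative rates named in the failure modes above can be made concrete with a small tally. The following is a minimal sketch, not the drill's actual scoring code: the `Trial` record, the `error_rates` helper, and the five placeholder requests are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One test request (hypothetical record, not part of the drill)."""
    request: str
    should_fire: bool  # reference answer: YES (True) or NO (False)
    did_fire: bool     # what the classifier actually did on this request


def error_rates(trials):
    """Return (false_positive_rate, false_negative_rate) for one description."""
    negatives = [t for t in trials if not t.should_fire]
    positives = [t for t in trials if t.should_fire]
    # False positive: fired on a request the skill is not for.
    false_pos = sum(t.did_fire for t in negatives)
    # False negative: stayed silent on a request the skill is for.
    false_neg = sum(not t.did_fire for t in positives)
    fp_rate = false_pos / len(negatives) if negatives else 0.0
    fn_rate = false_neg / len(positives) if positives else 0.0
    return fp_rate, fn_rate


# Placeholder data: a description that fires on almost everything.
trials = [
    Trial("request A", should_fire=True, did_fire=True),
    Trial("request B", should_fire=True, did_fire=True),
    Trial("request C", should_fire=False, did_fire=True),
    Trial("request D", should_fire=False, did_fire=True),
    Trial("request E", should_fire=False, did_fire=False),
]

fp_rate, fn_rate = error_rates(trials)
print(f"false-positive rate: {fp_rate:.2f}")  # 0.67
print(f"false-negative rate: {fn_rate:.2f}")  # 0.00
```

In this made-up sample the description never misses a valid request but fires on two of the three invalid ones, which is the pattern the Vague failure mode describes. An Overfit description would show the opposite profile: a low false-positive rate and a high false-negative rate.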
There is no perfect score. The drill is calibration. The descriptions where your predictions disagreed with the reference matter more than the ones you nailed: those are the failure modes most likely to recur in your own 7.3 authoring work.