From “summary” to “research” CORE
You have asked an AI to summarize something before. You typed “explain quantum entanglement” or “give me the arguments for and against a four-day work week” and the model produced a few paragraphs. That output looked like research. It was not.
What you received was the model’s compressed memory of its training data, phrased as a summary. The model did not go read any sources. It did not check whether its training data was accurate. It did not look at anything newer than its cutoff date. It gave you an average of what it had seen, stated in a confident voice. That is a summary, and for many tasks — explaining a term, giving an overview — a summary is a perfectly fine output.
A research agent is a different kind of thing. It goes and retrieves. It runs searches against the live web, opens pages, reads documents you give it, follows links, and brings actual sources back into the conversation. The output is anchored in specific, nameable places — not an average of everything it has ever seen.
The difference matters for three reasons. First, freshness: a research agent can tell you something that happened last week; a summary cannot. Second, verifiability: a research agent’s claims are traceable to sources you can open; a summary’s claims are traceable to nothing. Third, correctability: when a research agent is wrong, you can find out by checking its sources; when a summary is wrong, you typically cannot tell.
This is the same category shift you learned in Module 3 — from suggester to coding agent. There, the jump was tool use: the agent could read files, run commands, verify results. Here, the tool is web retrieval (and sometimes file-reading tools pointed at your local sources). The principle is identical. An agent that can reach outside itself is a different category from one that cannot.
The four moves of a research agent CORE
Every real research task a directed agent performs runs through four moves. Naming them gives you the handles you need to direct the work and check it.
Scope. The agent — or, much better, you — figures out what question is actually being asked. Not “tell me about X” but “I want to know Y about X, for purpose Z, and here is what would count as an answer.” A huge fraction of bad research-agent output is the result of a scope that was left vague. Scoping is almost entirely a human move; the agent can help, but you are the one who knows why you are asking.
Retrieve. The agent fetches candidate sources. It runs web searches, fetches pages, opens documents. A good agent does this across a range of source types — academic, journalistic, primary-document, domain-specialist — rather than grabbing whatever is at the top of a single search. A bad agent grabs the first four Google results and calls it done.
Triangulate. The agent compares what those sources say. Which claims appear in multiple independent sources? Which appear in only one? Which sources disagree, and why? Which are primary and which are secondary? Triangulation is where research stops being a pile of links and starts being an answer that can be defended.
Synthesize. The agent turns the triangulated findings into a structured output — a brief, a memo, a comparison, a paragraph of prose — that a human reader can use. Synthesis is where clarity happens, and also where the biggest honesty test lives: does the output represent what the sources actually say, including their uncertainty, or does it smooth everything into confident-sounding prose?
As the director, you own the first of those moves, check the middle two, and read the fourth critically.
You own scope. The agent does not know why you are asking; you do. If scope is wrong, everything downstream is wrong, no matter how well the agent runs.
You audit retrieve by asking: what sources did it actually fetch? Are they the right kind? Is anything important missing? Is any of it fabricated? You look at the list, not just the summary.
You check triangulate by reading enough of the sources yourself to confirm the agent’s comparisons. This is the single move most likely to go wrong invisibly — the agent can claim two sources “agree” when they are addressing slightly different questions, or claim a lone source is “the consensus” when it is an outlier.
You read synthesize the way you read a diff in Module 3: line by line. If a sentence makes a claim, ask what source supports it. If no source does — if it is a smooth connector sentence that sounds right but is not actually supported — that is the research equivalent of a silent bug. It will slip past every casual reader, and it will still be wrong.
Why fabrication risk is highest here CORE
Of all the directed work in this course, research is the one where the agent is most likely to produce a confident wrong answer — and the one where that wrong answer is most likely to ship without being noticed.
Three things stack.
The model wants to sound helpful. Language models are trained, among other things, to produce answers that look complete. When the real answer is “I could not find a reliable source on this,” the model’s strong pull is to produce a plausible-looking citation instead. That is where hallucinated citations come from — references to papers that do not exist, quotations that were never written, URLs that 404. Research agents with retrieval tools hallucinate less than raw chat, but they still hallucinate. Sometimes the agent retrieves real pages and then invents a quote that is not in any of them.
Research outputs are low-feedback. If a coding agent writes broken code, the test fails and you know. If a research agent writes a confident sentence supported by a fabricated source, no alarm goes off. The only way to catch it is to open the source. Nothing else will catch it. This is the exact reason Module 4 installs “do not cite what you have not opened” as its operational safety rule.
The reader trusts the format. A research brief with a list of footnotes and URLs looks authoritative. It inherits the visual credibility of a genre built by humans who actually checked. A reader — including you, the student who wrote the prompt — will tend to trust that format unless they have a reason not to. That is exactly the trust the fabricated citation is borrowing.
The defense is not cleverness. The defense is a habit: every source in your final output has been opened, skimmed, and confirmed by you. You are the only part of the system that can reliably catch this. That is the habit this module is here to install.
When a research agent is the right tool, and when it is not CORE
Not every information request needs a research agent. Three categories help.
Well-suited to a research agent. Questions where the answer depends on sources outside the model’s head, where you want a range of viewpoints or sources, and where “I need to be able to defend this” matters. Examples: “What do current state-by-state laws say about X?” “How have three major news outlets covered Y this month?” “What are the arguments, pro and con, for switching our small business from accounting software A to accounting software B?” “Summarize what the last three years of published research says about sleep and high-school academic performance.”
Well-suited to a summary (no retrieval). Questions where you want a concept explained, a term defined, an idea mapped out, and the model’s training is plenty. Examples: “Explain what a Fourier transform is.” “Give me the gist of the Cold War in one page.” “What does ‘ontology’ mean?” Using a research agent here is not wrong — it will still work — but you are paying for retrieval you did not need.
Poorly suited to either. Questions where the right move is to read the primary document yourself. Examples: “What does this 2,000-word essay argue?” (read the essay). “What does clause 4B of this contract mean?” (read the contract, possibly with an agent helping you parse, but with your eyes on the text). “What does my grandmother mean in her letter?” (read the letter). The research agent can help you index long reading, but it cannot substitute for you engaging with a single document that matters.
There is also a fourth category worth naming: questions that are dangerous to let an agent answer alone, because the stakes of a confident wrong answer are high and the verification cost is also high. Medical questions about your own body, legal questions about your own case, financial decisions you are about to act on — these are not “research agent says” questions. They are “research agent helps me find the right specialist” questions.
Picking a topic to thread through the module CORE
This is the first module where your homework is not a standardized exercise. It is a real research question you actually need answered. The rest of Module 4’s practice is built on this topic — you will scope it in 4.2, fact-check a claim inside it in 4.3, synthesize it in 4.4, and ship three outputs about it in 4.5. So pick a topic you actually care about, or at least actually need.
Three decent sources for a topic:
A school paper or assignment. If you are writing a paper this term on a real topic, use that. The assignment’s existing scope is a head start, and you end the module with work your teacher can grade.
A decision in your life. A summer job you are considering, a college you are weighing, a product you are about to buy, a software choice for a side project, a skill you are considering investing a year in. Real decisions make good research topics because you actually care whether the answer is right.
A question inside the work a parent does. A homeschool parent running a small tutoring business, a family business, a freelance practice — there is almost always a bounded research question inside that work they would pay to have answered honestly (“which of these three invoicing tools is actually right for us?”). Ask if you can pick theirs.
What makes a good topic: it is bounded (you can describe the answer in a paragraph, not a book), the answer depends on sources outside the model, and you will actually care about the quality of the answer.
What makes a bad topic: it is too broad (“tell me about AI”), it is purely conceptual (“what is consciousness”), it is a summary disguised as research (“explain the French Revolution”), or you picked it because it sounded impressive and you do not actually need the answer.
One topic. Commit to it in the project checkpoint at the end of this lesson. You can change it once, in Lesson 4.2, if scoping reveals it does not work. After 4.2, you are locked in.
Is the Cowork tab’s web research live? RECIPE
| Field | Value |
| --- | --- |
| Tool | Claude desktop app — Cowork tab (primary). Optional advanced: Claude Code CLI. |
| Last verified | 2026-04-17 |
| Next review | 2026-07-17 |
| OSes | macOS, Windows |
Module 4 lives in the Cowork tab — the autonomous mode of the Claude desktop app you met in Lesson 2.4. Cowork is the right home for research because the work is asynchronous (you hand the agent a task, it runs in a cloud environment, you come back to a finished result) and because it has web-research tools available out of the box.
A common early mistake is running a “research” session against an agent that has no retrieval tools at all. What you get back looks like research and is actually a summary — the model answering from training, dressed up with confident-sounding phrasing. Before you start any Module 4 hands-on, confirm that retrieval is actually available.
In the Cowork tab. Before starting a research session, check that the session has a web-research or browsing tool enabled. Look for a web-search, fetch-URL, or “browse” capability in the tools panel or session settings. If the panel shows no retrieval tool, add one (via the MCP registry or built-in tools) or switch to a session that has one. If you are unsure, ask the agent: “List the tools you currently have access to.” A working research agent will name its web-research tool. A summary-only session will list no retrieval tool or will say it does not have one.
Optional advanced — Claude Code CLI. Same engine, terminal interface. Click to expand if you set up the CLI in Lesson 2.4 and want to script research from a shell.
From the terminal where the CLI is running, type /status (or the current status command — check --help if it has changed). The output should include the MCP tools or built-in tools available to the agent. If you see a web-search or web-fetch entry, retrieval is live. If not, you are in a summary-only session. Stop and enable retrieval before continuing.
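If you want a scriptable version of that check, here is a minimal sketch. It assumes the claude binary is on your PATH and that it supports a non-interactive print flag; confirm the exact flag and its current name with claude --help before relying on it, since these change between versions.

```bash
# Illustrative only: assumes the `claude` CLI is installed and that -p / --print
# runs a single prompt non-interactively (verify with `claude --help`).
claude -p "List the tools you currently have access to. Do you have a web-search or fetch-URL tool?"

# A retrieval-enabled session will name a web-search / web-fetch tool in its reply.
# A summary-only session will say it has no browsing tool; stop and enable retrieval first.
```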
Safe default
Run a ten-second sanity check at the start of every research session: ask the agent to find something that happened after its likely training cutoff (“What is the headline on the home page of ft.com right now?”). If it answers accurately, retrieval is live. If it says it cannot browse, or confidently invents a headline, you are in a summary session.
Try it — Anatomy of a research task CORE
worksheet deliverable · open the printable worksheet →
You are given six short information requests. For each, you will:
- Classify it as (a) best served by a research agent with retrieval, (b) fine with a summary / no retrieval, (c) best served by reading a primary document yourself, or (d) too high-stakes to let an agent answer alone.
- For the research-agent-suited ones, write a one-paragraph scoping sketch: what specific question you would ask, what would count as an answer, and what kinds of sources you would want the agent to pull from.
- For those same research-agent-suited tasks, note one specific source type you would want the agent to pull from (a government dataset, a peer-reviewed paper, a news outlet, the product’s own page, etc.) — this is the kind of constraint your scoping brief will name explicitly in Lesson 4.2.
The tasks:
- “What are the three main arguments for a year-round school calendar, and the three main counter-arguments, using sources published since 2023?”
- “Define mitochondrial DNA.”
- “My mom is picking between two laptop models for our family homeschool — a ThinkPad X1 and a MacBook Air M3. Which is the better fit for our setup, and why?”
- “Summarize the plot of Great Expectations.”
- “What are the medication interactions of drug A and drug B, and should I take them at the same time?”
- “This letter is from my grandfather, written in 1974. What is he arguing in it?”
Deliverable. A one-page table: task → category → scoping sketch (if applicable) → preferred source type → one-sentence reason. Keep it. Lesson 4.2 will reuse the structure of task 1 or 3 as the template you build your own scoping brief on.
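If you want a starting skeleton for that table, here is one possible Markdown layout, with the second task filled in as an illustration (its categorization follows this lesson’s own guidance that definitions are summary territory). The remaining rows are yours to fill.

```markdown
| Task | Category | Scoping sketch (if applicable) | Preferred source type | Reason (one sentence) |
| --- | --- | --- | --- | --- |
| "Define mitochondrial DNA." | (b) summary, no retrieval | n/a | n/a | A definition sits comfortably inside the model's training, so retrieval adds cost without adding accuracy. |
| ... | ... | ... | ... | ... |
```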
Done with the hands-on?
When the recipe steps and any activity above are complete, mark this stage to unlock the assessment, reflection, and project checkpoint.
Quick check
Five questions. Tap a question to reveal the answer and the reasoning.
Answer: C. The category difference is retrieval, not model size, price, or length. A tiny local model wired to a web-search tool is a research agent; a frontier model with no retrieval is a summarizer. A and B are confusions with size/cost. D is sometimes true and sometimes not — length is not the defining line.
Answer: D. The agent does not know why you are asking; you do. Getting scope right is almost entirely a human move — you decide what question is actually being asked and what would count as an answer. The agent can assist (offer clarifying questions, sharpen the phrasing), but if scope is wrong, everything downstream is wrong. The other three moves are agent-led with human audit; scope is the reverse.
Answer: B. “Cite only what you have opened” is the operational rule of this module, and the reason is exactly this scenario — fabricated citations look correct, and the only reliable defense is a human opening the source. A is the trap the whole lesson is written to warn against. C is better than A but is not the habit the course installs, because students routinely pick the two to spot-check in a way that misses the fabricated one. D is a well-meaning mistake — a second AI will often confirm the fabrication, because the format is persuasive.
Answer: C. A personal letter from someone you know is a primary document, short enough to read yourself, and the meaning depends on context only you have (who she is, what she usually means by this phrase, what you were talking about last week). The agent can help you parse, but it cannot substitute for your reading. A, B, and D are all well-suited: they depend on sources outside the model, you want a range, and you want to be able to defend the answer.
Answer: B. The rule is uniform and absolute. A is what the lesson identifies as the normal failure mode — students spot-check in a way that routinely misses the fabricated entry. C and D carve out the exact cases where fabrication is most likely to slip through — shorter sources and news URLs are among the most commonly hallucinated kinds. The rule exists because you cannot reliably tell in advance which sources are safe.
Reflection prompt
Summary or research — and did you actually open the sources?
In 6–8 sentences: Think of the last time you used AI to “do research” for you. Using this lesson’s definitions, was it a research agent or a summary? How many of the sources it cited — if it cited any — did you actually open? If zero, what would have been different about the work if you had opened each one? And: what is a question you actually need answered right now that a research agent would be the right tool for, that you would be willing to run for real in this module?
The last half of the prompt is the important half. Module 4’s capstone is built on a topic the student picks here. A half-serious topic produces half-serious capstone work. A real one produces work the student keeps using after the course ends.
Project checkpoint
Add a “Research Surface” section to my-first-loop.md.
Open your my-first-loop.md capstone file. You already have sections for Goal, Model, Tools, Secrets, Cost ceiling, and (from Module 3) Directed Edit Style. Add a new section called Research Surface with this template:
Research Surface. The kind of question I most often want a research agent to answer for me is: [one sentence — what shape of question, in your life].
The human who will most often read the answer is: [me, a parent, a teacher, a client — name them].
If the agent gave me a wrong but confident-sounding answer, the cost would be: [one sentence — what actually gets hurt].
My Module 4 topic is: [one sentence — the real research question you are threading through the rest of this module].
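For concreteness, here is a hypothetical filled-in version of that section, reusing the invoicing-tools example from earlier in this lesson. Every answer below is invented for illustration; yours will differ.

```markdown
Research Surface. The kind of question I most often want a research agent to
answer for me is: head-to-head comparisons between a small number of concrete
options, where the answer depends on current pricing and features.

The human who will most often read the answer is: my mom, who runs the family
tutoring business.

If the agent gave me a wrong but confident-sounding answer, the cost would be:
we commit to the wrong invoicing tool and spend months working around it.

My Module 4 topic is: which of the three invoicing tools we are considering is
actually the right fit for the tutoring business.
```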
Keep it short. The “Module 4 topic” line is the one the rest of the module will reference. If you need one more day to think about it, that is fine — but do not start Lesson 4.2 without a topic committed.
Next in Module 4
Lesson 4.2 — Scoping the research question.
Turn your committed topic into a five-part scoping brief the agent can actually work from. You will write it, run it, and save the first draft to capstone-entry-1-draft.md.