Here is what a single-agent prompt looks like after six months of growth: it is 4,000 words. It handles research, drafting, code review, client communication, and "anything else that comes up." It references internal formats by name without defining them. It uses the phrase "act as a senior developer" in three different sections, each meaning something different. When you ask it to do a straightforward task, it gets confused about which role applies. When you add a new requirement, you break an old one. You spend more time maintaining the prompt than doing actual work.
This is the swiss-army-knife failure mode. One agent, infinite surface area, zero accountability.
The fix is not a smarter prompt. The fix is what you would do with a human team: break the work into discrete roles, hire for each role, and define how they hand off to each other. This article walks through how to do that with Claude Code's agent teams -- shipped in Claude Code v2.1.32, independent of any model version -- what changed with Opus 4.7, and how to avoid the coordination mistakes that cost you money and correctness.
What an Agent Team Is -- and What It Isn't
An agent team is a set of independent agents, each with a defined role, running under a shared project context. Each agent reads its own profile, has its own model assignment, and operates on a bounded scope of work.
What it is not: a single agent with multiple "modes." It is not a chain of prompts duct-taped together with copy-paste. It is not a workflow where one agent does everything and calls itself a team.
The key constraint that makes a team useful is the same constraint that makes human organizations useful: no single person (or agent) has the full picture. The researcher does not write the final document. The writer does not approve the budget. The reviewer does not redesign the system. Specialization forces clarity. Clarity reduces errors.
In Claude Code, this works through the .claude/agents/ directory (project scope) or ~/.claude/agents/ (user scope). Each file in that directory is a profile for one agent. The team lead -- usually the agent you talk to directly -- reads the project CLAUDE.md and delegates to the right specialist. Specialists return results. The lead synthesizes or escalates.
Three things make this different from a prompt chain: agents can run in parallel, each agent's context is scoped to its role, and the task dependency mechanism provides explicit blocking and sequencing. More on all three below.
Designing the Org Chart Before You Write a Single Prompt
The single most common mistake with agent teams is starting with the tool, not the organization. Someone enables the feature flag, creates three agents named Agent1/Agent2/Agent3, and then figures out what they should do. This produces teams that duplicate work, step on each other, and produce inconsistent output.
Do the org chart first. On paper. Ask four questions:
What are the distinct types of work? Research and writing are different. Writing and review are different. Code generation and code review are different. If two things require different expertise, they probably need different agents.
What are the handoff points? Where does one type of work end and another begin? What is the artifact that moves between agents? A research summary, a draft, a diff, a test result?
What decisions require a human? Agent teams are not autonomous by default, and they should not be. Identify the points where a human needs to say yes before work continues. Build escalation into the design.
Which tasks can run in parallel? If two research tasks do not depend on each other, they can run simultaneously. If the writer needs the researcher's output, that is a serial dependency. Get this right up front or you will either block unnecessarily or produce garbage.
Write this down before you open a text editor. A 10-minute whiteboard session prevents three days of debugging why Agent2 is ignoring Agent1's output.
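What the whiteboard output might look like for a small content project -- a sketch of the planning artifact, not a required format:

```
Roles:        researcher, writer, reviewer, lead
Handoffs:     researcher -> research brief -> writer -> draft -> reviewer -> review report -> lead
Human gates:  lead asks the user before publishing; any FAIL review escalates
Parallel:     research subtopics run simultaneously; writing is serial, after all research completes
```

If you cannot fill in all four lines, you are not ready to write profiles yet.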
Technical Setup -- Claude Code Agent Teams in Practice
Set the environment variable:
```
CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
```

Restart Claude Code. The agent team feature is now active.
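If you launch Claude Code from a shell, one way to enable the flag for a single session is to export it before starting the CLI -- a minimal sketch, assuming the CLI is invoked with the `claude` command and a hypothetical project path:

```bash
# Enable the experimental agent teams feature for this shell session only
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1

# Start Claude Code from the project root so it picks up CLAUDE.md and .claude/agents/
cd ~/projects/content-team
claude
```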
Directory structure:
```
project-root/
  CLAUDE.md            # Project-wide instructions, rules, team structure
  .claude/
    agents/
      lead.md          # Team lead profile
      researcher.md    # Researcher profile
      writer.md        # Writer profile
      reviewer.md      # Reviewer profile
```

For agents available across all projects, place profiles in ~/.claude/agents/ instead.
CLAUDE.md is the constitution. It defines the project, the team structure, the escalation policy, and any constraints that apply to all agents. Every agent reads it.
Agent profiles live in .claude/agents/. Each file is a markdown document with YAML frontmatter:
```
---
name: nova
description: Researches technical topics
model: claude-sonnet-4-6
effort: medium
memory: project
---
```

The `model` field sets which Claude model this agent runs on. The `effort` field controls how much compute the agent applies to each task. Valid values for `effort`: `xhigh`, `high`, `medium`, `low`.
The memory field controls where the agent's persistent memory is stored. Valid values:
- `user` → `~/.claude/agent-memory/<agent-name>/`
- `project` → `.claude/agent-memory/<agent-name>/`
- `local` → `.claude/agent-memory-local/<agent-name>/`
When memory is configured, the system automatically injects the first 200 lines of the agent's MEMORY.md into its system prompt at startup. This is not a directory you create manually -- it is managed by the runtime.
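Because only the first 200 lines are injected, keep the memory file dense. A sketch of what an agent's MEMORY.md might accumulate over time -- the structure here is purely illustrative, not mandated by the runtime:

```
# MEMORY.md (rex)

## Project conventions learned
- Drafts use sentence-case headings, not title case.
- "PASS with notes" is not a valid verdict; always PASS or FAIL.

## Recurring issues to check first
- Intro paragraphs tend to exceed the 150-word limit.
```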
The body of the profile file is the agent's system prompt: its role, its responsibilities, its constraints, and its output format.
Opus 4.7 Changed the Rules -- Here's What Broke
If you built agent profiles against Opus 4.6 or earlier and upgraded to claude-opus-4-7, some things stopped working. Here is the breakdown.
Literal instruction following. Opus 4.7 reads your instructions more literally than previous versions. Where 4.6 would infer intent from vague instructions, 4.7 does what you said, not what you meant. If you wrote "check the document for issues," 4.6 might return a structured report. 4.7 might return "yes, there are issues" and stop. Audit your profiles for ambiguous instructions and replace them with specific output requirements.
Default tone. Opus 4.7 runs cooler by default. If your profiles relied on implicit warmth or assumed the model would soften outputs naturally, you will notice the difference. Add explicit tone requirements if your role calls for them. "Write in a direct, collegial tone" is a requirement, not a suggestion to be inferred.
Removed and deprecated parameters. Several parameters have been pulled across recent releases:
- `temperature`, `top_p`, `top_k` -- removed entirely in Opus 4.7. You cannot set them.
- `budget_tokens` -- deprecated in Sonnet 4.6, returns a 400 error in Opus 4.7. Replace with the appropriate `effort` level in your agent profile.
- `prefill` -- removed in Sonnet 4.6, returns a 400 error. Opus 4.7 inherits this -- it is not a new 4.7 breaking change, but if you have not audited for it yet, it will break here too.
If your tooling or API calls set any of these parameters, they will error. Audit your configuration files and remove these fields.
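If your tooling calls the Messages API directly, the audit looks roughly like this -- a sketch assuming the official TypeScript SDK; the commented-out block is the kind of call that now errors, per the parameter changes above:

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

async function run() {
  // Before (breaks): sampling parameters such as temperature / top_p / top_k
  // are rejected by the newer models discussed in this article.
  //
  // const res = await client.messages.create({
  //   model: "claude-opus-4-7",
  //   max_tokens: 1024,
  //   temperature: 0.3,   // no longer accepted
  //   top_p: 0.9,         // no longer accepted
  //   messages: [{ role: "user", content: "Review the draft." }],
  // });

  // After: no sampling knobs in the request. Compute is steered through the
  // agent profile's `effort` field rather than per-call parameters.
  const res = await client.messages.create({
    model: "claude-opus-4-7",
    max_tokens: 1024,
    messages: [{ role: "user", content: "Review the draft." }],
  });
  console.log(res.content);
}

run();
```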
Model ID. The model identifier is claude-opus-4-7. Update any hardcoded references.
The Opus 4.7 changes are not bugs. They reflect a deliberate design choice: explicit over implicit, specified over inferred. Agents built for this model need more precise profiles, but the output is more predictable as a result.
Task Dependencies and the Four Coordination Patterns
Dependencies between agents are managed through a shared task list, not through agent profile frontmatter. The lead agent creates tasks and sets blocking relationships at runtime using TaskCreate and TaskUpdate:
TaskCreate({ subject: "Step 1: Research" })
TaskCreate({ subject: "Step 2: Write" })
TaskUpdate({ taskId: "2", addBlockedBy: ["1"] })A task with a blockedBy reference will not be claimed until all blocking tasks are completed. The system handles unlock automatically -- when a blocking task is marked completed, dependent tasks become available without any manual intervention. Teammates claim tasks from the shared list themselves; the lead agent does not need to push work to them.
Four patterns cover most real use cases.
Pattern 1: Parallel research with synthesis.
Multiple research tasks run simultaneously. A synthesis task is blocked on all of them.
```
ResearcherA --\
ResearcherB ---+--> Synthesizer
ResearcherC --/
```

```
TaskCreate({ subject: "Research: source A" })   // id: 1
TaskCreate({ subject: "Research: source B" })   // id: 2
TaskCreate({ subject: "Research: source C" })   // id: 3
TaskCreate({ subject: "Synthesize findings" })  // id: 4
TaskUpdate({ taskId: "4", addBlockedBy: ["1", "2", "3"] })
```

Use this when research tasks are independent but the final output requires integrating all findings. The parallel phase cuts elapsed time; the synthesis phase ensures coherence.
Pattern 2: Linear pipeline.
Each stage depends on the previous one.
```
Researcher --> Writer --> Reviewer --> Publisher
```

```
TaskCreate({ subject: "Research" })     // id: 1
TaskCreate({ subject: "Write draft" })  // id: 2
TaskCreate({ subject: "Review" })       // id: 3
TaskCreate({ subject: "Publish" })      // id: 4
TaskUpdate({ taskId: "2", addBlockedBy: ["1"] })
TaskUpdate({ taskId: "3", addBlockedBy: ["2"] })
TaskUpdate({ taskId: "4", addBlockedBy: ["3"] })
```

Use this for content production, code generation, or any workflow where each step requires the previous step's output as input. Each agent works on a finished artifact, not a work-in-progress.
Pattern 3: Structured debate.
Three agents take different positions on a question. A decision task is blocked on all three.
```
ProponentA --\
ProponentB --+--> DecisionMaker
SkepticC  --/
```

```
TaskCreate({ subject: "Argue for approach X" })  // id: 1
TaskCreate({ subject: "Argue for approach Y" })  // id: 2
TaskCreate({ subject: "Argue against both" })    // id: 3
TaskCreate({ subject: "Rule on approach" })      // id: 4
TaskUpdate({ taskId: "4", addBlockedBy: ["1", "2", "3"] })
```

Use this for architecture decisions, strategy questions, or any situation where you want multiple perspectives before committing. Assign explicit positions in each agent's profile. "Argue for approach X" produces better analysis than "consider approach X."
Pattern 4: Quality gates.
Work stops at defined checkpoints until a gate passes.
```
Planner --> [Human approval] --> Executor --> Reviewer
```

```
TaskCreate({ subject: "Plan" })            // id: 1
TaskCreate({ subject: "Human approval" })  // id: 2
TaskCreate({ subject: "Execute" })         // id: 3
TaskCreate({ subject: "Review" })          // id: 4
TaskUpdate({ taskId: "2", addBlockedBy: ["1"] })
TaskUpdate({ taskId: "3", addBlockedBy: ["2"] })
TaskUpdate({ taskId: "4", addBlockedBy: ["3"] })
```

Use this when the cost of a wrong decision is high. The plan is cheap to produce and cheap to reject. Execution is expensive. Put the gate before the expensive step.
These patterns compose. A real project might use parallel research feeding into a pipeline, with a quality gate before publication.
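A sketch of that composition, using the same TaskCreate/TaskUpdate calls as above -- the task subjects are illustrative, not prescribed:

```
TaskCreate({ subject: "Research: source A" })   // id: 1
TaskCreate({ subject: "Research: source B" })   // id: 2
TaskCreate({ subject: "Write draft" })          // id: 3
TaskCreate({ subject: "Human approval" })       // id: 4
TaskCreate({ subject: "Publish" })              // id: 5
TaskUpdate({ taskId: "3", addBlockedBy: ["1", "2"] })  // pipeline starts after parallel research
TaskUpdate({ taskId: "4", addBlockedBy: ["3"] })       // quality gate on the draft
TaskUpdate({ taskId: "5", addBlockedBy: ["4"] })       // publication waits for the gate
```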
Token Cost Is a Team Budget -- Spend It Like One
Token cost in a multi-agent system is not a single agent's problem. It is a budget allocation question.
The baseline fact: parallel agents reduce elapsed time but do not reduce total token consumption. Running three research agents simultaneously takes roughly the same tokens as running them sequentially. The benefit is speed. If speed matters, pay for it. If it does not, serialize and save.
Model selection is the biggest lever.
- Haiku for routine tasks: classification, extraction, formatting, straightforward review against a checklist. These tasks do not need deep reasoning. Haiku is fast and cheap. Use it.
- Sonnet for standard work: research synthesis, first-draft writing, code generation, moderate-complexity analysis.
- Opus for tasks that require sustained reasoning, complex judgment, or where errors are expensive to fix.
The worst anti-pattern is using Opus for everything because "it's the best." A Haiku reviewer that checks output against an explicit checklist is faster and cheaper than an Opus reviewer doing the same task, and the checklist-constrained output is often more consistent.
The second worst anti-pattern: agents that duplicate context. If ResearcherA and ResearcherB both load the entire project knowledge base, you pay twice. Scope each agent's context to what it actually needs. The CLAUDE.md provides shared context; agent profiles provide role-specific context. Do not repeat the shared context inside each agent profile.
Practical model allocation for a content team:
| Role | Model | Rationale |
|---|---|---|
| Team lead / coordinator | Sonnet | Coordination and routing, not deep analysis |
| Researcher | Sonnet | Research synthesis requires judgment |
| Writer | Sonnet | Writing requires sustained coherence |
| Reviewer (checklist) | Haiku | Checklist comparison is mechanical |
| Escalation judgment | Opus | High-stakes decisions justify the cost |
Track your token spend by agent role for the first few projects. The distribution will tell you where to optimize.
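To see why model assignment dominates the budget, run the arithmetic once. The ratios and token counts below are placeholders chosen for easy math, not actual prices -- substitute your real per-token rates:

```
Assume (illustrative only): Opus ~ 10x Sonnet per token, Haiku ~ 0.2x Sonnet.
One pipeline run: coordination 20k, research 80k, draft 60k, review 30k tokens.

All Opus:       (20k + 80k + 60k + 30k) x 10          = 1,900k Sonnet-equivalent units
Per the table:  (20k + 80k + 60k) x 1  +  30k x 0.2   =   166k Sonnet-equivalent units
```

Same work, roughly an order-of-magnitude difference, and the only change is which model runs which role.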
Writing Agent Profiles That Actually Work
Most agent profiles fail for one of four reasons.
Loose verbs without criteria. "Assess the document," "interpret the requirements," "determine the appropriate approach." These verbs mean different things depending on context, and Opus 4.7 will not fill in the blanks for you. Replace every loose verb with a specific action and a specific output.
Before: "Assess the document for quality issues." After: "Read the document. List each factual claim that lacks a source citation. List each section that exceeds 300 words without a concrete example. Return both lists as numbered items."
Implicit format references. "Format the output in the standard way," "follow the usual structure," "use our template." The agent has no access to "standard" or "usual" unless you define them in the profile or link to a file that contains them. Write out the format, or reference a specific file path the agent can read.
Soft requirements. "Ideally include a summary," "a table of contents would be helpful," "consider adding examples." Soft language produces inconsistent output. If it matters, it is a requirement. Write it as one.
Before: "A summary at the end would be helpful." After: "End the document with a summary section. The summary contains three to five bullet points. Each bullet states one finding in plain English."
Inoperable metaphors. "Get a feel for the rhythm of the text," "use your judgment about whether this sounds right," "make it feel natural." These are not instructions. An agent cannot operationalize "feel for the rhythm." Replace metaphors with observable criteria.
Before: "Make sure the tone feels right." After: "Read each paragraph aloud (simulate this). If a sentence exceeds 25 words, split it. If a paragraph exceeds 100 words, check that it contains at most one main claim. Flag any paragraph that makes a claim without a supporting example."
The test for a well-written profile: give it to someone who has never seen your project and ask them to do the job manually, following the profile literally. If they produce the output you expect, the profile works. If they have questions, the profile has gaps.
Building a Mini-Team From Scratch -- A Complete Example
Here is a complete content team. Four agents, one pipeline, built to spec.
Team structure:
- Emma (Team Lead): coordinates, escalates to the user, delegates
- Nova (Researcher): researches technical background -- Sonnet
- Sage (Writer): writes based on Nova's output -- Sonnet
- Rex (Reviewer): checks facts and tone -- Haiku
Pipeline:
```
Nova researches
  --> Sage writes (blocked on Nova's task)
  --> Rex reviews (blocked on Sage's task)
  --> Emma delivers (blocked on Rex's task)
```

Emma sets up the task dependencies at the start of each run:
TaskCreate({ subject: "Research: [topic]" }) // id: 1
TaskCreate({ subject: "Write draft" }) // id: 2
TaskCreate({ subject: "Review draft" }) // id: 3
TaskCreate({ subject: "Deliver to user" }) // id: 4
TaskUpdate({ taskId: "2", addBlockedBy: ["1"] })
TaskUpdate({ taskId: "3", addBlockedBy: ["2"] })
TaskUpdate({ taskId: "4", addBlockedBy: ["3"] })Nova, Sage, and Rex claim tasks from the shared list when they become available. The system unlocks each task automatically when its blocker completes.
This team uses artifact files for handoffs rather than per-agent memory. Add memory: project to an agent's profile when it needs to accumulate state across runs -- for example, a reviewer that learns project conventions over time.
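For example, if Rex should remember recurring review findings between runs, its frontmatter gains one line; everything else stays as in the profile below:

```
---
name: rex
description: Reviews drafts against a checklist and returns PASS or FAIL
model: claude-haiku-4-5-20251001
effort: low
memory: project   # persists MEMORY.md under .claude/agent-memory/rex/
---
```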
CLAUDE.md (excerpt):
```
# Content Team

## Team structure
- Emma: team lead, coordinates, escalates only
- Nova: researcher, Sonnet
- Sage: writer, Sonnet
- Rex: reviewer, Haiku

## Escalation policy
Emma escalates to the user when: Rex returns FAIL twice on the same document,
Nova cannot find a primary source for a key claim, Sage cannot complete a draft
due to contradictory requirements.

## Handoff artifacts
Nova delivers: a research brief in `artifacts/research-brief.md`.
Sage delivers: a draft in `artifacts/drafts/[slug]-draft.md`.
Rex delivers: a review report in `artifacts/reviews/[slug]-review.md`.
```

.claude/agents/nova.md:
```
---
name: nova
description: Researches technical topics and delivers research briefs
model: claude-sonnet-4-6
effort: medium
---

You are Nova. You research technical topics and deliver research briefs.

For each research task:
1. Identify the three to five primary sources most relevant to the topic.
2. For each source, write: source name, URL or citation, one-sentence summary of what it contributes.
3. Write a 200-400 word synthesis that answers the specific question you were given.
4. List any claims you could not verify with a primary source, marked [unverified].

Save your output to artifacts/research-brief.md. Do not write anything else.
```

.claude/agents/sage.md:
```
---
name: sage
description: Writes articles based on Nova's research brief
model: claude-sonnet-4-6
effort: medium
---

You are Sage. You write articles based on Nova's research brief.

Before writing:
1. Read artifacts/research-brief.md.
2. Identify the central claim the article will make.
3. Identify three to four supporting points from the research.

Write the article:
- Target length: 800-1200 words.
- One claim per paragraph.
- No paragraph longer than 150 words.
- Every claim that appears in the research brief must cite the source inline: (Source Name).
- Do not introduce claims that are not in the research brief.
- Do not use: "robust", "powerful", "seamless", "cutting-edge", "it is worth noting".

Save your output to artifacts/drafts/[slug]-draft.md.
```

.claude/agents/rex.md:
```
---
name: rex
description: Reviews drafts against a checklist and returns PASS or FAIL
model: claude-haiku-4-5-20251001
effort: low
---

You are Rex. You review drafts against a checklist. You return PASS or FAIL.

Checklist:
1. Every factual claim has a source citation.
2. No paragraph exceeds 150 words.
3. The following words do not appear: robust, powerful, seamless, cutting-edge, it is worth noting, additionally at the start of a sentence.
4. The article has a clear central claim stated in the first 100 words.

For each checklist item:
- PASS: state the item number and "pass".
- FAIL: state the item number, "fail", and quote the specific text that fails.

Final decision: PASS if all items pass. FAIL if any item fails.

Save your review to artifacts/reviews/[slug]-review.md.
```

.claude/agents/emma.md:
```
---
name: emma
description: Coordinates the content team and delivers final output to the user
model: claude-sonnet-4-6
effort: medium
---

You are Emma. You coordinate the content team and deliver final output to the user.

Before delivering:
1. Read artifacts/reviews/[slug]-review.md.
2. If Rex returned PASS, present the draft location to the user and summarize the review result.
3. If Rex returned FAIL, do not deliver the draft. List the specific failures and ask the user whether to revise or escalate.

You do not write, research, or review. You coordinate and report.
```

This is a working team. Each agent has a single job, explicit inputs, explicit outputs, and defined criteria. Rex does not need Opus to run a checklist. Emma does not need to re-read the research -- she reads the review.
To run this on a new topic: give Emma the task. Emma delegates to Nova. Nova researches and writes to artifacts/research-brief.md. Sage picks up the brief and drafts. Rex reviews the draft. Emma delivers or escalates.
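A kickoff message to Emma might read like this -- the wording is illustrative; the point is to name the topic and hand over the pipeline, not to micromanage the steps:

```
Emma: new article on "[topic]".
Create the task list with the standard research -> write -> review -> deliver
dependencies, let Nova start, and report back once Rex's review is in.
```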
What Does Not Work Yet
Agent teams in Claude Code are experimental. The CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 flag is not a stable API. Behavior can change between releases without deprecation warnings.
Persistent state works through the system-managed memory frontmatter field (project, user, or local), which stores each agent's MEMORY.md under .claude/agent-memory/<agent-name>/ -- not through a generic project folder called memory/. Concurrent writes from parallel agents to the same artifact file can produce conflicts; design your pipelines so that writes happen in serial or to separate files.
The blockedBy mechanism does not currently support conditional dependencies. You cannot say "block on A only if A returned FAIL." All gating logic that requires conditions has to be built into the lead agent's profile as decision logic.
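In practice that means the condition lives in prose inside the lead's profile, something like this hypothetical excerpt from emma.md:

```
After Rex's review completes:
- If the review says PASS: deliver the draft and close the remaining tasks.
- If the review says FAIL for the first time on this draft: create a
  "Revise draft" task for Sage and a new "Review draft" task blocked on it.
- If the review says FAIL a second time: stop and escalate to the user.
```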
Cross-session state works if agents use the memory frontmatter field -- the runtime injects their persisted MEMORY.md at startup. It does not work if agents rely on conversational context. Every agent starts with its profile and the files it reads. Nothing else carries over.
Long-running pipelines with many handoff points accumulate context. An agent deep in a pipeline that reads all previous artifacts will have a large context window. Watch for this in complex pipelines. Keep artifacts small and focused.
Next steps
- Set the environment variable and restart: `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1`.
- Map your current single-agent prompt to a list of distinct role types.
- Build the org chart on paper before writing a single profile.
- Start with a three-agent team: lead, worker, reviewer. Add agents when you have a specific job that does not fit.
- Assign models by task type, not by prestige. Haiku for checklists. Sonnet for drafts.
- Write profiles with specific verbs, explicit output formats, and no soft requirements.
- Test the team on a small task before scaling to a full workflow.
- Audit token spend by agent after the first real run. Optimize where cost and value are out of proportion.
The team concept works because it mirrors how real work gets done: divided by expertise, coordinated by explicit handoffs, and checked at the gates that matter. The technology is experimental. The organizational principle is not.