One generalist agent juggling every task is a bottleneck. Multi-agent systems split the work across specialists that collaborate, debate, and verify each other. The result is higher accuracy on complex tasks and a system that scales by adding agents instead of bloating prompts.
When You Actually Need Multi-Agent
Multi-agent is the right call when:
- The task spans distinct domains (research, coding, writing, review)
- Sub-tasks can run in parallel
- You need adversarial verification (planner vs critic)
- Different agents need different tools or models
If a single ReAct loop with five tools handles your workload, do not reach for multi-agent. The orchestration cost is real.
Common Topologies
Supervisor / Worker
A supervisor decomposes the goal and delegates to specialist workers. Workers report back; the supervisor synthesizes.
Pipeline
Agents pass output to the next stage like a Unix pipe: Researcher → Writer → Editor → Publisher.
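The pipeline topology can be sketched as plain function composition. This is an illustrative skeleton, not a real agent framework: each stage here is a stub standing in for an LLM-backed agent, and the stage names mirror the example above.

```python
# Each stage transforms the running document and hands it to the next.
# Stage bodies are placeholders for real agent calls.
def researcher(topic: str) -> str:
    return f"notes on {topic}"

def writer(notes: str) -> str:
    return f"draft based on {notes}"

def editor(draft: str) -> str:
    return draft.replace("draft", "polished article")

def run_pipeline(topic: str, stages) -> str:
    result = topic
    for stage in stages:
        result = stage(result)   # output of one stage is input to the next
    return result

article = run_pipeline("multi-agent systems", [researcher, writer, editor])
```

Because each stage only sees its predecessor's output, stages can be tested, swapped, and scaled independently — the same property that makes Unix pipes composable.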
Debate
Two or more agents argue opposing positions; a judge agent picks the winner. Improves reasoning on hard problems.
Mesh
Agents communicate peer-to-peer via a shared message bus. Powerful but expensive to coordinate — reserve for genuinely emergent workflows.
Building a Supervisor with LangGraph
```python
from typing import TypedDict

from langgraph.graph import StateGraph, START, END
from langgraph.prebuilt import create_react_agent

# Assumed defined elsewhere: model, web_search, run_python, and pick_next.
# pick_next inspects the conversation and returns the name of the next
# node to run ("researcher", "coder", or "writer").

class AgentState(TypedDict):
    messages: list
    next: str

researcher = create_react_agent(model, tools=[web_search])
coder = create_react_agent(model, tools=[run_python])
writer = create_react_agent(model, tools=[])

def supervisor(state: AgentState):
    next_agent = pick_next(state["messages"])
    return {"next": next_agent}

graph = StateGraph(AgentState)
graph.add_node("supervisor", supervisor)
graph.add_node("researcher", researcher)
graph.add_node("coder", coder)
graph.add_node("writer", writer)

graph.add_edge(START, "supervisor")         # every run begins at the supervisor
graph.add_conditional_edges("supervisor", lambda s: s["next"])
graph.add_edge("researcher", "supervisor")  # workers report back
graph.add_edge("coder", "supervisor")
graph.add_edge("writer", END)               # the writer produces the final answer

app = graph.compile()
```
Communication Protocols
Agents need a shared language. Three common formats:
- Free-form text: easy to start, hard to parse reliably
- Structured JSON: machine-readable, schema-validated, our default
- Function calls: native LLM tool-call format, best for typed workflows
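For the structured-JSON option, a small message envelope keeps handoffs parseable. The field names below (`sender`, `recipient`, `intent`, `payload`) are a hypothetical schema, not a standard; the point is that parsing fails fast on a malformed message instead of an agent silently misreading free-form text.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class AgentMessage:
    sender: str
    recipient: str
    intent: str     # e.g. "request", "result", "error"
    payload: dict

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "AgentMessage":
        data = json.loads(raw)
        # KeyError on a missing field surfaces the bad handoff immediately
        return cls(sender=data["sender"], recipient=data["recipient"],
                   intent=data["intent"], payload=data["payload"])

msg = AgentMessage("researcher", "writer", "result", {"notes": ["fact A"]})
roundtrip = AgentMessage.from_json(msg.to_json())
```

In production you would validate against a real schema (e.g. Pydantic or JSON Schema); the round-trip above just shows the contract both sides agree on.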
Shared State Management
Every agent reads from and writes to a shared state object. Treat it like a database transaction:
- Append-only event log for auditability
- Single writer per field to avoid conflicts
- Versioning so agents can detect stale views
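The three rules above can be enforced in one small object. This is a hypothetical in-memory design — real systems would back it with a database — but it shows the mechanics: an append-only log, one declared owner per field, and a version counter readers can use to detect staleness.

```python
class SharedState:
    def __init__(self, owners: dict[str, str]):
        self.owners = owners   # field -> the single agent allowed to write it
        self.fields: dict = {}
        self.log: list = []    # append-only audit trail of every write
        self.version = 0

    def write(self, agent: str, field: str, value) -> None:
        if self.owners.get(field) != agent:
            raise PermissionError(f"{agent} does not own field {field!r}")
        self.version += 1
        self.fields[field] = value
        self.log.append((self.version, agent, field, value))

    def read(self, field):
        # Return the version alongside the value so the caller can later
        # compare versions and detect a stale view
        return self.fields.get(field), self.version

state = SharedState({"notes": "researcher", "draft": "writer"})
state.write("researcher", "notes", "fact A")
```

A write by the wrong agent raises immediately, which turns a silent data race into a visible error at the handoff boundary.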
Cost and Latency
Multi-agent multiplies LLM calls. Mitigations:
- Use smaller models (Claude Haiku, GPT-4o mini) for routine specialists
- Cache deterministic sub-task outputs
- Run independent agents in parallel
- Set hard step limits on every agent
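The parallelism mitigation is the easiest win when specialists don't depend on each other. A sketch with `asyncio`, assuming each agent is an async callable (the agent bodies here are placeholders for real LLM calls):

```python
import asyncio

async def research(topic: str) -> str:
    await asyncio.sleep(0)   # stands in for a slow LLM/tool call
    return f"research:{topic}"

async def code(topic: str) -> str:
    await asyncio.sleep(0)
    return f"code:{topic}"

async def fan_out(topic: str) -> list[str]:
    # Both agents are in flight at once, so end-to-end latency is
    # roughly the max of the two calls, not the sum
    return await asyncio.gather(research(topic), code(topic))

results = asyncio.run(fan_out("agents"))
```

The supervisor then synthesizes from `results` once the whole batch resolves; dependent stages still have to wait their turn.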
Failure Handling
One bad agent should not crash the system. Wrap each agent invocation with timeouts, retries, and fallbacks. Log every handoff. Make the supervisor capable of marking a worker as unhealthy and rerouting.
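The wrapper described above might look like the following sketch. Retry counts, backoff, and the fallback value are illustrative; wall-clock timeouts are elided but belong in any real version.

```python
import time

def call_with_retries(agent_fn, payload, retries=3, fallback=None):
    last_error = None
    for attempt in range(retries):
        try:
            return agent_fn(payload)
        except Exception as err:   # one bad agent must not crash the system
            last_error = err
            time.sleep(0)          # placeholder for exponential backoff
    # Log the failed handoff, then degrade gracefully so the
    # supervisor can reroute to a healthy worker
    print(f"agent failed after {retries} attempts: {last_error}")
    return fallback

calls = []
def flaky_agent(payload):
    calls.append(payload)
    if len(calls) < 2:
        raise RuntimeError("transient failure")
    return "ok"

result = call_with_retries(flaky_agent, "task")
```

Returning a fallback instead of re-raising is a policy choice: it keeps the run alive, at the cost of the supervisor having to notice and handle a degraded result.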
Real-World Applications
- Software engineering: planner, coder, tester, reviewer
- Investment research: analyst, fact-checker, risk reviewer
- Content creation: researcher, writer, editor, SEO optimizer
- Customer support: classifier, retriever, drafter, escalator
Evaluation
Score the system end-to-end, not individual agents. Track success rate on the user-visible goal, total tokens used, and time-to-completion. Trace every handoff to debug regressions.
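A minimal run-trace object capturing the three end-to-end metrics above — goal success, total tokens, and wall-clock time — plus the handoff log for debugging. The shape is hypothetical; most teams would feed this into an existing tracing tool instead.

```python
import time

class RunTrace:
    def __init__(self):
        self.start = time.monotonic()
        self.handoffs = []   # (sender, recipient) pairs, in order
        self.tokens = 0

    def record_handoff(self, sender: str, recipient: str, tokens_used: int):
        self.handoffs.append((sender, recipient))
        self.tokens += tokens_used

    def finish(self, success: bool) -> dict:
        # One summary row per run: score the system, not the agents
        return {
            "success": success,
            "total_tokens": self.tokens,
            "seconds": time.monotonic() - self.start,
            "handoffs": len(self.handoffs),
        }

trace = RunTrace()
trace.record_handoff("supervisor", "researcher", 1200)
trace.record_handoff("researcher", "supervisor", 800)
summary = trace.finish(success=True)
```

When a regression appears in the summary numbers, the ordered `handoffs` list tells you which transition to replay first.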
Conclusion
Multi-agent systems pay off when your problem genuinely decomposes — not before. Pick a topology that fits the workflow, define a strict communication contract, and instrument everything. Done well, multi-agent shifts from a research curiosity to a production multiplier.