
What is Swarm Mode?

Swarm mode deploys multiple persistent, autonomous agents that work in parallel to solve complex tasks. Unlike the standard DAG or Supervisor workflows, where agents execute once and return results, swarm agents run iterative reason-act loops: they check their mailbox, call tools, share findings with teammates, and converge independently. A swarm supervisor monitors execution and can dynamically spawn helper agents when an agent requests assistance.

How It Works

Shannon’s swarm workflow follows a four-phase lifecycle:

Phase 1: Task Decomposition

The supervisor receives your query and breaks it into subtasks using the same decomposition engine as other workflows. Each subtask becomes the assignment for one agent.
Query: "Compare AI chip markets across US, Japan, and South Korea"

Subtasks:
├── Agent 1: "Research US AI chip market landscape"
├── Agent 2: "Research Japan AI chip market landscape"
└── Agent 3: "Research South Korea AI chip market landscape"

Phase 2: Agent Spawning

For each subtask, the supervisor spawns an AgentLoop child workflow. Each agent receives:
  • A unique name (deterministic, based on Japanese station names)
  • Its subtask description
  • A team roster listing all agents and their assignments
  • Access to the shared workspace
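
As an illustration of deterministic naming, the sketch below maps a subtask index to a fixed station name and builds a team roster from it. The station list, the wrap-around suffix rule, and the `agent_name` and `build_roster` helpers are all hypothetical; this page only states that names are deterministic and based on Japanese station names.

```python
# Hypothetical station list: the real list and selection rule are not
# specified in this document.
STATIONS = ["Takao", "Mitaka", "Kichijoji", "Ogikubo", "Nakano"]


def agent_name(index: int) -> str:
    """Map a subtask index to a stable name (same index, same name)."""
    base = STATIONS[index % len(STATIONS)]
    cycle = index // len(STATIONS)
    # Assumed rule: add a numeric suffix once the list has been used up.
    return base if cycle == 0 else f"{base}-{cycle + 1}"


def build_roster(subtasks: list[str]) -> dict[str, str]:
    """Team roster: agent name -> assigned subtask."""
    return {agent_name(i): task for i, task in enumerate(subtasks)}


roster = build_roster([
    "Research US AI chip market landscape",
    "Research Japan AI chip market landscape",
    "Research South Korea AI chip market landscape",
])
```

Because the mapping depends only on the index, replaying the same decomposition would yield the same roster.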

Phase 3: Parallel Execution

All agents work simultaneously. Each runs up to 25 iterations (configurable) of a reason-act cycle:
  1. Check mailbox for messages from other agents
  2. Read shared workspace for findings published by teammates
  3. Call LLM to decide the next action
  4. Execute action (tool call, publish data, send message, request help, or finish)
  5. Check convergence (is the agent stuck or done?)
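
The five steps above can be sketched as a plain loop. This is a simplified illustration, not Shannon's implementation: the callables passed in (`decide`, `check_mailbox`, `read_workspace`, `execute`) and the dictionary shapes are assumptions, and the convergence check is reduced to a comment.

```python
from typing import Callable


def run_agent_loop(
    decide: Callable[[list[str], list[str]], dict],
    check_mailbox: Callable[[], list[str]],
    read_workspace: Callable[[], list[str]],
    execute: Callable[[dict], str],
    max_iterations: int = 25,
) -> dict:
    """Run one agent's reason-act cycle until it finishes or hits the cap."""
    history: list[str] = []
    for iteration in range(1, max_iterations + 1):
        messages = check_mailbox()            # step 1
        findings = read_workspace()           # step 2
        action = decide(messages, findings)   # step 3 (an LLM call in practice)
        if action["type"] == "done":          # step 4: finish
            return {"status": "done", "iterations": iteration,
                    "response": action.get("response", "")}
        history.append(execute(action))       # step 4: any other action
        # step 5: convergence checks would run here (omitted in this sketch)
    # Cap reached without "done": assumed fallback that joins the last 3 results.
    return {"status": "forced", "iterations": max_iterations,
            "response": " ".join(history[-3:])}
```

Passing the collaborators in as functions keeps the loop testable without an LLM or a message store.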

Phase 4: Synthesis

Once all agents complete, the supervisor collects their results and produces a unified response. If there is only one agent result, it is returned directly. For multiple results, an LLM synthesis step merges the findings into a coherent answer.
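
The branching in this phase is small enough to show directly. In the sketch below, `merge_with_llm` is a hypothetical stand-in for the LLM synthesis step, and the error on an empty result list is an assumption, since this page does not describe that case.

```python
from typing import Callable


def synthesize(results: list[str], merge_with_llm: Callable[[list[str]], str]) -> str:
    """Return a single result as-is; merge several results via the LLM step."""
    if not results:
        # Assumed behavior: the empty case is not covered by the docs.
        raise ValueError("no agent results to synthesize")
    if len(results) == 1:
        return results[0]
    return merge_with_llm(results)
```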

Agent Actions

Each iteration, an agent chooses exactly one action:
Action | Description
tool_call | Execute a tool (web search, file read, etc.)
publish_data | Share findings with the team via the workspace
send_message | Send a direct message to a specific teammate
request_help | Ask the supervisor to spawn a new helper agent
done | Return final response and exit
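
One way to enforce the "exactly one action" rule is to validate each LLM decision against a closed set. The sketch below uses the five action names from the table; the `"action"` key and the `parse_action` helper are assumptions about the decision payload, not a documented format.

```python
from enum import Enum


class Action(str, Enum):
    TOOL_CALL = "tool_call"
    PUBLISH_DATA = "publish_data"
    SEND_MESSAGE = "send_message"
    REQUEST_HELP = "request_help"
    DONE = "done"


def parse_action(raw: dict) -> Action:
    """Reject decisions that omit the action or name an unknown one."""
    try:
        return Action(raw["action"])  # "action" key is an assumed field name
    except (KeyError, ValueError) as exc:
        raise ValueError(f"unknown or missing action: {raw!r}") from exc
```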

Inter-Agent Communication

Swarm agents collaborate through two mechanisms:

P2P Messaging

Agents send direct messages to specific teammates through Redis-backed mailboxes. Message types include request, offer, accept, delegation, and info. Before each LLM call, the agent’s mailbox is checked for new messages. Incoming messages appear in the agent’s prompt context.
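
To make the mailbox behavior concrete, here is an in-memory stand-in for the Redis-backed mailboxes. The `Message` fields, the `Mailboxes` class, and the assumption that reading a mailbox also clears it are illustrative; only the five message types come from this page.

```python
from collections import defaultdict, deque
from dataclasses import dataclass

MESSAGE_TYPES = {"request", "offer", "accept", "delegation", "info"}


@dataclass
class Message:
    sender: str
    recipient: str
    type: str
    body: str


class Mailboxes:
    """One FIFO queue per agent (in memory here, Redis in the real system)."""

    def __init__(self) -> None:
        self._queues: dict[str, deque] = defaultdict(deque)

    def send(self, msg: Message) -> None:
        if msg.type not in MESSAGE_TYPES:
            raise ValueError(f"unsupported message type: {msg.type}")
        self._queues[msg.recipient].append(msg)

    def drain(self, agent: str) -> list[Message]:
        """Return pending messages and clear them (assumed read semantics)."""
        queue = self._queues[agent]
        pending = list(queue)
        queue.clear()
        return pending
```

An agent would call `drain` before each LLM call and place the returned messages into its prompt context.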

Shared Workspace

Agents publish findings to topic-based workspace lists. Before each iteration, every agent fetches recent workspace entries from all topics, so the entire team stays aware of collective progress.
Shared Workspace:
├── Topic: "findings"
│   ├── Agent-Takao: "NVIDIA dominates US with 80% market share..."
│   └── Agent-Mitaka: "Japan focuses on edge AI chips..."
└── Topic: "sources"
    └── Agent-Kichijoji: "Samsung foundry plans announced..."
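
A minimal in-memory sketch of the workspace follows. The `Workspace` class and its method names are hypothetical. It assumes the entry limit applies across all topics rather than per topic, and it borrows its defaults (5 entries, 800 characters) from the `swarm.workspace_max_entries` and `swarm.workspace_snippet_chars` settings.

```python
class Workspace:
    """Topic-tagged entries kept in publish order."""

    def __init__(self, max_entries: int = 5, snippet_chars: int = 800) -> None:
        self.max_entries = max_entries
        self.snippet_chars = snippet_chars
        self._entries: list[tuple[str, str, str]] = []  # (topic, agent, text)

    def publish(self, topic: str, agent: str, text: str) -> None:
        self._entries.append((topic, agent, text))

    def topic(self, name: str) -> list[tuple[str, str]]:
        """All (agent, text) pairs published under one topic."""
        return [(a, x) for t, a, x in self._entries if t == name]

    def recent_for_prompt(self) -> list[str]:
        """Most recent entries across all topics, truncated for the prompt."""
        if self.max_entries <= 0:
            return []
        latest = self._entries[-self.max_entries:]
        return [f"[{t}] {a}: {x[:self.snippet_chars]}" for t, a, x in latest]
```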

Dynamic Agent Spawning

When an agent encounters a subtask it cannot handle alone, it can request help from the supervisor:
  1. Agent sends a request_help action with a description and required skills
  2. Supervisor receives the request through its mailbox (polled every 3 seconds)
  3. Supervisor spawns a new AgentLoop with the helper task
  4. Supervisor notifies the requesting agent with the new agent’s ID
Safety limits: Each agent can spawn at most one helper, and total agents are capped at the configured maximum (default: 10).
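
The two safety limits can be expressed as a small guard the supervisor consults before spawning. The `SpawnGuard` class below is a hypothetical sketch of that check, not the supervisor's actual code.

```python
class SpawnGuard:
    """Tracks the two limits: one helper per agent, and a global agent cap."""

    def __init__(self, initial_agents: int, max_agents: int = 10) -> None:
        self.total_agents = initial_agents
        self.max_agents = max_agents
        self._already_helped: set[str] = set()

    def try_spawn(self, requester: str) -> bool:
        """Return True and record the spawn if both limits allow it."""
        if requester in self._already_helped:
            return False  # this agent has already spawned its one helper
        if self.total_agents >= self.max_agents:
            return False  # global cap (initial + dynamic agents) reached
        self._already_helped.add(requester)
        self.total_agents += 1
        return True
```

With three initial agents and a cap of four, only the first help request succeeds; a second request from the same agent and a request from any other agent are both refused.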

Convergence Detection

Three mechanisms prevent agents from running indefinitely:
  • If an agent goes 3 consecutive iterations without using any tools (only messaging or publishing), it is considered converged and returns partial findings.
  • If 3 consecutive permanent tool errors occur (as opposed to transient errors such as rate limits), the agent aborts and reports the failure.
  • On the last iteration, if the agent has not called done, the workflow forces completion and builds a summary from the most recent iterations.
Transient errors (rate limits, timeouts, 503s) trigger automatic retry with escalating backoff (5s increments, max 30s) and do not count toward the abort threshold.
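
The streak counters and the backoff schedule can be sketched as follows. The outcome labels, the `ConvergenceTracker` class, and the reset rules (a successful tool call clears the error streak, while transient errors leave it unchanged) are assumptions. The thresholds and the 5s/30s backoff values are the ones stated above.

```python
def backoff_seconds(attempt: int) -> int:
    """Retry delay for transient errors: 5s, 10s, ... capped at 30s."""
    return min(5 * attempt, 30)


class ConvergenceTracker:
    OUTCOMES = {"tool_ok", "no_tool", "permanent_error", "transient_error"}

    def __init__(self, limit: int = 3) -> None:
        self.limit = limit
        self.idle_streak = 0   # consecutive iterations without a tool call
        self.error_streak = 0  # consecutive permanent tool errors

    def record(self, outcome: str) -> str:
        """Update the streaks and return 'continue', 'converged', or 'abort'."""
        if outcome not in self.OUTCOMES:
            raise ValueError(f"unknown outcome: {outcome}")
        self.idle_streak = self.idle_streak + 1 if outcome == "no_tool" else 0
        if outcome == "permanent_error":
            self.error_streak += 1
        elif outcome == "tool_ok":
            self.error_streak = 0  # assumed: a success clears the streak
        if self.error_streak >= self.limit:
            return "abort"
        if self.idle_streak >= self.limit:
            return "converged"
        return "continue"
```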

When to Use Swarm vs Other Workflows

Scenario | Recommended Workflow
Simple Q&A, single-step tasks | Simple / DAG
Multi-step research with citations | Research Workflow
Tasks requiring real-time agent collaboration | Swarm
Tasks where agents need to share intermediate findings | Swarm
Long-running exploration with dynamic subtask discovery | Swarm
Tasks needing tool iteration (search, analyze, refine) | Swarm
Swarm mode uses more tokens than standard workflows because each agent runs multiple LLM iterations. Use it for tasks that genuinely benefit from persistent, collaborative multi-agent execution.

Configuration

Swarm behavior is controlled via config/features.yaml:
Parameter | Default | Description
swarm.enabled | true | Enable/disable swarm workflow
swarm.max_agents | 10 | Maximum total agents (initial + dynamic)
swarm.max_iterations_per_agent | 25 | Max reason-act loops per agent
swarm.agent_timeout_seconds | 600 | Per-agent timeout (10 minutes)
swarm.max_messages_per_agent | 20 | P2P message cap per agent
swarm.workspace_snippet_chars | 800 | Max chars per workspace entry in prompt
swarm.workspace_max_entries | 5 | Max recent entries shown to each agent
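
To get a feel for what the defaults imply, the sketch below mirrors the table in a dataclass and derives two rough upper bounds. The `SwarmConfig` class and its helper methods are illustrative only. The call bound assumes one LLM decision call per iteration and ignores retries and the final synthesis step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwarmConfig:
    """Defaults copied from the parameter table above."""
    enabled: bool = True
    max_agents: int = 10
    max_iterations_per_agent: int = 25
    agent_timeout_seconds: int = 600
    max_messages_per_agent: int = 20
    workspace_snippet_chars: int = 800
    workspace_max_entries: int = 5

    def max_agent_llm_calls(self) -> int:
        """Worst case: every agent uses every iteration."""
        return self.max_agents * self.max_iterations_per_agent

    def max_workspace_prompt_chars(self) -> int:
        """Upper bound on workspace text added to one prompt."""
        return self.workspace_max_entries * self.workspace_snippet_chars


defaults = SwarmConfig()
```

With the defaults, that is at most 250 agent decision calls per run and at most 4,000 characters of workspace text per prompt, which is why lowering `max_agents` or `max_iterations_per_agent` is the most direct way to limit token use.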

Streaming Events

Swarm workflows emit SSE events for real-time monitoring:
Event Type | Agent ID | When
WORKFLOW_STARTED | swarm-supervisor | Workflow begins
PROGRESS | swarm-supervisor | Planning, spawning, monitoring, synthesizing
AGENT_STARTED | Agent name | Agent begins first iteration
AGENT_COMPLETED | Agent name | Agent finishes
WORKFLOW_COMPLETED | swarm-supervisor | Final synthesis complete
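
A client can follow these events by parsing the SSE stream. The sketch below handles the standard `event:` and `data:` framing on already-received text. The JSON payload shape with an `agent_id` field and the sample stream are assumptions, since this page does not document the wire format.

```python
import json


def parse_sse(stream_text: str) -> list[dict]:
    """Split SSE text into events; a blank line ends each event."""
    events: list[dict] = []
    event_type, data_lines = None, []
    for line in stream_text.splitlines() + [""]:
        if line == "":
            if data_lines:  # events without data are skipped
                events.append({"type": event_type or "message",
                               "data": json.loads("\n".join(data_lines))})
            event_type, data_lines = None, []
        elif line.startswith(":"):
            continue  # SSE comment / keep-alive line
        elif line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].lstrip())
    return events


# Hypothetical sample stream; the payload fields are assumed.
SAMPLE = (
    'event: WORKFLOW_STARTED\ndata: {"agent_id": "swarm-supervisor"}\n\n'
    ': keep-alive\n'
    'event: AGENT_STARTED\ndata: {"agent_id": "Takao"}\n\n'
)
parsed = parse_sse(SAMPLE)
```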

Next Steps