Deep Research Agent
This tutorial shows how to use Shannon’s ResearchWorkflow for comprehensive, citation-backed research reports.
What You’ll Learn
- Multi-stage workflow: Memory retrieval → Refine → Decompose → Execute → Citations → Gap Filling → Synthesis → Verification
- Adaptive patterns: React, Parallel, and Hybrid execution based on complexity
- Advanced features: Entity filtering, gap-filling, context compression, react-per-task mode
- Quality controls: Citation requirements, coverage enforcement, claim verification
- Language matching: Automatically responds in the user’s query language
Prerequisites
- Shannon stack running (Docker Compose)
- Gateway reachable at http://localhost:8080
- Auth defaults:
  - Docker Compose: authentication is disabled by default (GATEWAY_SKIP_AUTH=1).
  - Local builds: authentication is enabled by default. Set GATEWAY_SKIP_AUTH=1 to disable auth, or include an API key header: -H "X-API-Key: $API_KEY".
Workflow Architecture
The ResearchWorkflow combines multiple stages to produce high-quality, cited research.
Key Features:
- Memory-aware: Uses conversation history for contextual research
- Entity-focused: Filters results when entity is detected (e.g., specific company/product)
- Self-correcting: Gap filling ensures comprehensive coverage
- Quality-enforced: Per-area coverage validation (≥600 chars, ≥2 citations per section)
Quick Start (HTTP)
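A minimal submit-and-poll sketch against the Gateway HTTP API, assuming the Docker Compose defaults (gateway on localhost:8080). The endpoints, the X-API-Key header, the top-level research_strategy field, and context.force_research are documented on this page; other field names (such as task_id and the status values) are assumptions, so check them against your Gateway responses.

```python
import time
import requests

BASE = "http://localhost:8080"
HEADERS = {"X-API-Key": "your-api-key"}  # omit when GATEWAY_SKIP_AUTH=1

# Submit a research task. research_strategy is accepted at the top level;
# context.force_research ensures routing to the ResearchWorkflow.
resp = requests.post(
    f"{BASE}/api/v1/tasks",
    headers=HEADERS,
    json={
        "query": "Compare 2024 EU and US regulatory approaches to AI safety",
        "research_strategy": "standard",
        "context": {"force_research": True},
    },
)
resp.raise_for_status()
task_id = resp.json()["task_id"]  # field name assumed; verify in your deployment

# Poll status until the workflow finishes (status values illustrative).
while True:
    status = requests.get(f"{BASE}/api/v1/tasks/{task_id}", headers=HEADERS).json()
    if status.get("status") in ("COMPLETED", "FAILED"):
        break
    time.sleep(5)

print(status.get("result", ""))  # final synthesized markdown/text
```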
Stream Events (SSE)
- LLM_OUTPUT: final synthesis text
- DATA_PROCESSING: progress/usage
- WORKFLOW_COMPLETED: completion
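A hedged streaming sketch, assuming /api/v1/tasks/stream accepts the same JSON payload as /api/v1/tasks and returns standard Server-Sent Events (event:/data: lines) carrying the event types listed above; the exact wire format may differ in your version.

```python
import requests

payload = {
    "query": "State of solid-state battery research in 2024",
    "research_strategy": "deep",
    "context": {"force_research": True},
}

# stream=True keeps the connection open so SSE lines can be read as they arrive.
with requests.post(
    "http://localhost:8080/api/v1/tasks/stream",
    json=payload,
    stream=True,
) as resp:
    resp.raise_for_status()
    event_type = None
    for raw in resp.iter_lines(decode_unicode=True):
        if not raw:
            continue
        if raw.startswith("event:"):
            event_type = raw.split(":", 1)[1].strip()
        elif raw.startswith("data:"):
            data = raw.split(":", 1)[1].strip()
            if event_type == "LLM_OUTPUT":
                print(data, end="")   # final synthesis text chunks
            elif event_type == "DATA_PROCESSING":
                pass                  # progress/usage updates
            elif event_type == "WORKFLOW_COMPLETED":
                break
```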
Execution Patterns
Shannon selects the optimal execution pattern based on task complexity and dependencies:
React Pattern (complexity < 0.5)
Iterative reason→act→observe loop for simpler research.
Parallel Pattern (no dependencies)
Concurrent subtask execution for multi-faceted research.
Hybrid Pattern (has dependencies)
Topological sort + fan-in/fan-out for sequential research with dependencies.
React Per Task (deep research mode)
Mini ReAct loops per subtask for high-complexity research.
Trigger conditions:
- Manual: context.react_per_task = true
- Auto-enable: complexity > 0.7 AND strategy in {deep, academic}
Strategy Presets
Presets provide opinionated defaults for different research depths. The Gateway validates and maps these to workflow context.
Available presets: quick | standard | deep | academic
| Strategy | react_max_iterations | max_concurrent_agents | verification | gap_filling |
|---|---|---|---|---|
| quick | 2 | 3 | ✗ | ✗ |
| standard | 3 | 5 | ✓ | ✓ (max_gaps: 3, max_iterations: 2) |
| deep | 4 | 6 | ✓ | ✓ (max_gaps: 2, max_iterations: 2) |
| academic | 5 | 8 | ✓ | ✓ (max_gaps: 3, max_iterations: 2) |
- max_concurrent_agents (1–20): Controls maximum concurrency for parallel subtasks (only applies when multiple subtasks are executed)
- react_max_iterations (2–8, default 5): Controls ReAct loop depth per task (independent parameter, no prerequisites)
- react_per_task (bool): Enable mini ReAct loops per subtask (auto-enabled for complexity > 0.7 + deep/academic)
- enable_verification (bool): Enable claim verification against citations
- budget_agent_max (int): Optional per-agent token budget with enforcement
- gap_filling_enabled (bool): Enable/disable gap filling
- gap_filling_max_gaps (1–20): Maximum number of gaps to detect
- gap_filling_max_iterations (1–5): Maximum retry iterations per gap
- gap_filling_check_citations (bool): Check citation density as gap indicator
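As an example, the flags above can be combined in the request context like this (a sketch; the values are illustrative, and a preset passed via research_strategy only seeds flags you leave unset):

```python
context = {
    "force_research": True,           # route to ResearchWorkflow
    "react_per_task": True,           # mini ReAct loop per subtask
    "react_max_iterations": 4,        # 2–8: ReAct loop depth per task
    "max_concurrent_agents": 6,       # 1–20: parallel subtask concurrency
    "enable_verification": True,      # verify claims against citations
    "budget_agent_max": 20000,        # optional per-agent token budget
    "gap_filling_enabled": True,
    "gap_filling_max_gaps": 3,        # 1–20
    "gap_filling_max_iterations": 2,  # 1–5
    "gap_filling_check_citations": True,
}

request_body = {
    "query": "Market landscape for edge AI inference chips",
    "research_strategy": "deep",      # preset fills in anything left unset above
    "context": context,
}
```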
Default preset: standard (if not specified). Presets seed flags only if not already set in context. The Gateway accepts top-level research_strategy on both /api/v1/tasks and /api/v1/tasks/stream and maps it into context. Preset defaults are loaded from config/research_strategies.yaml.
Python SDK
CLI
Programmatic
Response Format
Typical status payloads returned by GET /api/v1/tasks/{id} include the synthesized result, metadata (citations, verification), model/provider, and timestamps.
- result is the final synthesized markdown/text.
- metadata.citations lists collected sources (all are included in the Sources section of the result).
- metadata.verification appears when verification is enabled and completed.
- In Gateway status responses, created_at/updated_at reflect response time; authoritative run timing and totals are persisted in the database (events/timeline).
- usage summarizes token usage and may include cost. Gateway status may expose estimated_cost; workflow metadata may include cost_usd.
- model_used and provider reflect the model/provider chosen during synthesis.
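A sketch of reading these fields from the status response. The field names follow the bullets above; anything else (the exact nesting, the placeholder task id) is illustrative and may vary by version.

```python
import requests

task_id = "your-task-id"
status = requests.get(f"http://localhost:8080/api/v1/tasks/{task_id}").json()

result_text = status.get("result", "")        # final synthesized markdown/text
metadata = status.get("metadata", {})
citations = metadata.get("citations", [])     # collected sources
verification = metadata.get("verification")   # present when verification ran
usage = status.get("usage", {})               # token usage; may include cost
model = status.get("model_used")
provider = status.get("provider")

print(f"{len(citations)} citations, model={model} ({provider})")
print(result_text[:500])
```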
Advanced Features
Memory Retrieval
Hierarchical Memory (priority):
- Combines recent (last 5 messages) + semantic (top 5 relevant, similarity ≥ 0.75)
- Enables coherent multi-turn research with conversation context
- Fallback: Session memory (last 20 messages)
Context Compression:
- Auto-triggers for sessions >20 messages
- LLM summarizes conversation (target: 37.5% of window)
- Prevents context window overflow in long conversations
Entity Filtering
When a specific entity is detected (e.g., company “Acme Analytics”):
Query Refinement:
- Detects canonical name, exact search queries, official domains, disambiguation terms
- Example: canonical_name: "Acme Analytics", official_domains: ["acme.com"]
Citation Filtering (see the sketch below):
- Scoring: domain match +0.6, alias in URL +0.4, text match +0.4
- Threshold: 0.3 (any single match passes)
- Safety floor: minKeep=8 (backfilled by quality × credibility)
- Official domains always preserved (bypass threshold)
Result Filtering:
- Filters off-entity tool results (e.g., removes “Mind Inc” when searching for “Acme Mind”)
- Keeps reasoning-only outputs; prunes off-entity tool-driven results
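A simplified illustration of the citation scoring and safety floor described above. This is not the actual implementation: the citation field names (url, title, snippet, quality, credibility) and the backfill ordering are assumptions; only the weights, threshold, and minKeep=8 come from this page.

```python
from urllib.parse import urlparse

def entity_score(citation: dict, official_domains: list[str], aliases: list[str]) -> float:
    """Score how strongly a citation matches the detected entity."""
    url = citation.get("url", "")
    text = (citation.get("title", "") + " " + citation.get("snippet", "")).lower()
    domain = urlparse(url).netloc.lower()
    score = 0.0
    if any(domain.endswith(d) for d in official_domains):
        score += 0.6                       # domain match
    if any(a.lower() in url.lower() for a in aliases):
        score += 0.4                       # alias in URL
    if any(a.lower() in text for a in aliases):
        score += 0.4                       # text match
    return score

def filter_citations(citations, official_domains, aliases, threshold=0.3, min_keep=8):
    # Official domains always pass; everything else must clear the threshold.
    kept = [
        c for c in citations
        if entity_score(c, official_domains, aliases) >= threshold
        or urlparse(c.get("url", "")).netloc.lower().endswith(tuple(official_domains))
    ]
    if len(kept) < min_keep:
        # Backfill by quality x credibility up to the safety floor.
        rest = sorted(
            (c for c in citations if c not in kept),
            key=lambda c: c.get("quality", 0) * c.get("credibility", 0),
            reverse=True,
        )
        kept += rest[: min_keep - len(kept)]
    return kept
```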
Gap Filling
Auto-detection (max 2 iterations):
- Missing section headings (### Area Name)
- Gap indicator phrases (“limited information”, “insufficient data”)
- Low citation density (< 2 inline citations per section)
Fill process:
- Build targeted queries: "Find detailed information about: <area>"
- Execute focused ReAct loops (max 3 iterations per gap)
- Re-collect citations (global deduplication with original citations)
- Re-synthesize with combined evidence using large tier
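A rough sketch of the auto-detection step, assuming the draft report uses ### headings per research area and [n]-style inline citations (the marker style and helper names are assumptions, not the workflow's actual code):

```python
import re

GAP_PHRASES = ("limited information", "insufficient data")

def detect_gaps(report: str, areas: list[str], min_citations: int = 2) -> list[str]:
    """Return research areas that look undercovered in the draft report."""
    gaps = []
    for area in areas:
        # 1. Missing section heading for the area.
        match = re.search(
            rf"^###\s+{re.escape(area)}\s*$(.*?)(?=^###\s|\Z)",
            report, re.MULTILINE | re.DOTALL,
        )
        if not match:
            gaps.append(area)
            continue
        section = match.group(1).lower()
        # 2. Gap indicator phrases.
        if any(p in section for p in GAP_PHRASES):
            gaps.append(area)
            continue
        # 3. Low inline citation density (fewer than min_citations [n] markers).
        if len(re.findall(r"\[\d+\]", section)) < min_citations:
            gaps.append(area)
    return gaps

# Each detected gap then gets a targeted query, e.g.
# f"Find detailed information about: {area}", and a focused ReAct loop.
```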
Example Gap Scenario
Citations
Collection:
- Extracted from web_search and web_fetch tool outputs
- URL/DOI normalization and deduplication
- Scoring: quality (recency + completeness) × credibility (domain reputation)
- Diversity enforcement (max 3 per domain)
- Minimum 6 inline citations per report (clamped by available, floor 3)
- Per-area requirement: ≥2 inline citations per section
- Sources section lists all collected citations:
  - “Used inline” (cited in text)
  - “Additional source” (collected but not cited)
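A minimal sketch of deduplication and the per-domain diversity cap. The real URL/DOI normalization and quality × credibility scoring are more involved; the input ordering and field names here are assumptions.

```python
from urllib.parse import urlparse, urlunparse

def normalize_url(url: str) -> str:
    """Drop query strings, fragments, and trailing slashes for dedup purposes."""
    p = urlparse(url)
    return urlunparse((p.scheme, p.netloc.lower(), p.path.rstrip("/"), "", "", ""))

def dedupe_and_diversify(citations, max_per_domain=3):
    # Assumes citations are already sorted by quality x credibility, best first.
    seen, per_domain, kept = set(), {}, []
    for c in citations:
        key = c.get("doi") or normalize_url(c.get("url", ""))
        domain = urlparse(c.get("url", "")).netloc.lower()
        if key in seen or per_domain.get(domain, 0) >= max_per_domain:
            continue
        seen.add(key)
        per_domain[domain] = per_domain.get(domain, 0) + 1
        kept.append(c)
    return kept
```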
Synthesis Continuation
Trigger: Model stops early with incomplete output
Detection (looksComplete validation):
- Must end with sentence punctuation (., !, ?, 。)
- No dangling conjunctions (and, but, however, therefore...)
- Every research area has subsection with ≥600 chars and ≥2 citations
Continuation behavior:
- Adaptive margin: min(25% of effective_max_completion, 300 tokens)
- Prompt: “Continue from last sentence; maintain headings and citation style”
- Uses large tier for quality (gpt-4.1/opus)
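A simplified version of the completeness check, reusing the ### heading and [n]-citation assumptions from the gap-detection sketch; the actual looksComplete validation lives in the workflow and its conjunction list is longer than shown here.

```python
import re

DANGLING = ("and", "but", "however", "therefore")  # abbreviated list
SENTENCE_END = (".", "!", "?", "。")

def looks_complete(report: str, areas: list[str]) -> bool:
    text = report.rstrip()
    if not text:
        return False
    if not text.endswith(SENTENCE_END):
        return False                                # cut off mid-sentence
    last_word = re.sub(r"[^\w]", "", text.split()[-1]).lower()
    if last_word in DANGLING:
        return False                                # dangling conjunction
    for area in areas:
        m = re.search(
            rf"^###\s+{re.escape(area)}\s*$(.*?)(?=^###\s|\Z)",
            report, re.MULTILINE | re.DOTALL,
        )
        if not m:
            return False                            # missing area subsection
        section = m.group(1)
        if len(section.strip()) < 600:              # per-area length floor
            return False
        if len(re.findall(r"\[\d+\]", section)) < 2:
            return False                            # per-area citation floor
    return True
```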
Verification
Claim Extraction:
- Identifies factual assertions from synthesis
- Cross-references against collected citations
- Weighted by citation credibility
- Flags conflicts and unsupported claims
- Per-claim details: supporting/conflicting citations
Enable with enable_verification: true in context.
Language Matching
- Detects user query language (heuristic)
- Synthesis responds in the same language
- Supports English, Chinese, Japanese, Korean, Arabic, Russian, Spanish, French, German (heuristic‑based)
- Generic instructions ensure robustness across languages
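A toy illustration of script-based detection (not the actual heuristic; real detection also has to distinguish Latin-script languages such as Spanish, French, and German, which this sketch simply lumps into English):

```python
import re

def detect_language(query: str) -> str:
    """Very rough script-based guess; defaults to English."""
    if re.search(r"[\u3040-\u30ff]", query):   # hiragana/katakana
        return "Japanese"
    if re.search(r"[\uac00-\ud7af]", query):   # hangul
        return "Korean"
    if re.search(r"[\u4e00-\u9fff]", query):   # CJK ideographs
        return "Chinese"
    if re.search(r"[\u0600-\u06ff]", query):   # Arabic script
        return "Arabic"
    if re.search(r"[\u0400-\u04ff]", query):   # Cyrillic
        return "Russian"
    return "English"
```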
Behavior Guarantees
| Guarantee | Behavior |
|---|---|
| Memory | Hierarchical memory (recent + semantic) injected when available |
| Compression | Automatic for sessions over 20 messages to prevent context overflow |
| Coverage | Each research area has a dedicated subsection with minimum 600 chars and 2 citations |
| Gap Filling | Auto-detects and re-searches undercovered areas (max 2 iterations) |
| Language | Response matches user query language |
| Cost | Token usage and cost are aggregated and persisted; per-agent budgets enforced when configured |
| Continuation | Triggers only when synthesis is incomplete and capacity nearly exhausted |
| Entity Focus | When entity detected, filters citations and prunes off-entity results |
Tips & Best Practices
- Getting Started
- Optimization
- Multi-turn Research
- Set context.force_research=true to ensure routing to ResearchWorkflow
- Start with the standard preset, adjust based on results
- Monitor SSE for progress and token usage
Troubleshooting
Validation:
- Invalid web_search search_type: Sanitized (falls back to auto)
- Duplicate citations: Handled via URL/DOI normalization
- Entity detection: Performed during refinement when present in the query
- Memory retrieval: memory_retrieval_v1
- Session memory fallback: session_memory_v1
- Context compression: context_compress_v1
- Gap filling: gap_filling_v1
Next Steps
- API Reference: REST API Documentation
- Custom Tools: Adding Custom Tools
- Vendor Adapters: Vendor-specific Integrations
- GitHub: https://github.com/Kocoro-lab/Shannon
- Discord: https://discord.gg/NB7C2fMcQR
- Company: https://kocoro.ai