Quick Diagnostics
Before diving into specific issues, run these quick checks:
# Check all services are running
docker compose ps
# View recent logs from all services
docker compose logs --tail=50
# Check specific service health
curl http://localhost:8080/health
curl http://localhost:8000/health # LLM Service
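The health checks above can also be scripted. A minimal sketch in Python (standard library only; the ports are assumed to match the defaults used on this page):

```python
import urllib.request
import urllib.error

def check(url: str, timeout: float = 3.0) -> bool:
    """Return True if the endpoint answers with HTTP 2xx."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or HTTP error status
        return False

if __name__ == "__main__":
    for name, url in [
        ("gateway", "http://localhost:8080/health"),
        ("llm-service", "http://localhost:8000/health"),
    ]:
        print(f"{name:12s} {'OK' if check(url) else 'DOWN'}")
```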
Installation & Setup Issues
Docker Compose Fails to Start
Symptoms:
Services won’t start
Exit code errors
Container crashes immediately
Common Causes:
1. Docker daemon not running
Check:
docker info
Solution:
# macOS
open -a Docker
# Linux
sudo systemctl start docker
# Verify
docker info
2. Port conflicts
Check which ports are in use:
# Check all Shannon ports
lsof -i :8080 # Gateway
lsof -i :50051 # Agent Core
lsof -i :50052 # Orchestrator
lsof -i :8000 # LLM Service
lsof -i :5432 # PostgreSQL
lsof -i :6379 # Redis
lsof -i :6333 # Qdrant
lsof -i :7233 # Temporal
Solution - Kill conflicting processes:
# Find process using port
lsof -ti :8080
# Kill the process (macOS/Linux)
kill -9 $(lsof -ti :8080)
Solution - Change Shannon ports :
Edit docker-compose.yml to use different ports:
gateway:
  ports:
    - "8081:8080"  # Use 8081 instead of 8080
3. Insufficient system resources
Check Docker resources:
docker system df
docker stats
Solution - Increase Docker resources:
macOS: Docker Desktop → Preferences → Resources
RAM: Minimum 8GB (16GB recommended)
CPUs: Minimum 4 cores
Disk: Minimum 20GB free
Linux: Edit the Docker daemon config
sudo nano /etc/docker/daemon.json
{
  "default-ulimits": {
    "nofile": {
      "Name": "nofile",
      "Hard": 64000,
      "Soft": 64000
    }
  }
}
4. Missing API keys
Error: WARNING: The OPENAI_API_KEY variable is not set
Solution:
# Create .env from template
make setup
# Or manually
cp .env.example .env
# Add your API keys
echo "OPENAI_API_KEY=sk-..." >> .env
echo "ANTHROPIC_API_KEY=sk-ant-..." >> .env
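A quick way to confirm the keys actually landed in .env is to parse it and report anything missing or empty. A sketch (key names taken from the commands above):

```python
def missing_keys(env_text: str,
                 required=("OPENAI_API_KEY", "ANTHROPIC_API_KEY")) -> list:
    """Return required keys that are absent or empty in .env-style text."""
    present = {}
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        present[key.strip()] = value.strip().strip('"')
    return [k for k in required if not present.get(k)]

# Example: missing_keys(open(".env").read()) should return []
```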
5. Python WASI interpreter missing
Error: python_wasi/bin/python3.11: No such file or directory
Solution:
# Download and set up the Python WASI interpreter (20MB)
./scripts/setup_python_wasi.sh
# Verify installation
ls -lh python_wasi/bin/python3.11
API & Connection Issues
401 Unauthorized
Symptoms:
HTTP 401 responses
“Unauthorized” error messages
Diagnosis:
# Check if auth is enabled
docker compose exec orchestrator env | grep GATEWAY_SKIP_AUTH
Solution 1: Disable authentication (development)
Edit .env:
GATEWAY_SKIP_AUTH=1  # 1 = auth disabled, 0 = auth enabled
Restart:
docker compose restart gateway
Test:
curl http://localhost:8080/api/v1/tasks
# Should work without X-API-Key header
Solution 2: Provide valid API key (production)
Request with API key:
curl -H "X-API-Key: sk_test_123456" \
http://localhost:8080/api/v1/tasks
Python SDK:
from shannon import ShannonClient

client = ShannonClient(
    base_url="http://localhost:8080",
    api_key="sk_test_123456",
)
Connection Refused / Service Unavailable
Symptoms:
connection refused
dial tcp: connect: connection refused
Services not responding
Diagnosis:
# Check service status
docker compose ps
# Check specific service logs
docker compose logs orchestrator --tail=50
docker compose logs agent-core --tail=50
docker compose logs llm-service --tail=50
# Test endpoints
curl http://localhost:8080/health
curl http://localhost:50052 # Should fail - gRPC doesn't support HTTP GET
Solution 1: Services not ready
Wait for all services to initialize:
# Watch logs until services are ready
docker compose logs -f
# Look for these messages:
# orchestrator: "gRPC server listening on :50052"
# agent-core: "Server started on :50051"
# llm-service: "Uvicorn running on http://0.0.0.0:8000"
# gateway: "Gateway listening on :8080"
Typical startup time: 30-60 seconds
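Instead of watching the logs by hand, you can poll the gateway's /health endpoint until it answers. A standard-library sketch (the 90-second cap is an assumption based on the typical startup time above):

```python
import time
import urllib.request
import urllib.error

def wait_for(url: str, timeout: float = 90.0, interval: float = 2.0) -> bool:
    """Poll url until it returns HTTP 2xx, or give up after timeout seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as resp:
                if 200 <= resp.status < 300:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not up yet; retry after a short pause
        time.sleep(interval)
    return False

# Example: after `docker compose up -d`
# wait_for("http://localhost:8080/health")
```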
Solution 2: Service crashed
Check for crash errors:
docker compose logs orchestrator | grep -i error
docker compose logs orchestrator | grep -i fatal
Restart crashed service:
docker compose restart orchestrator
docker compose restart agent-core
docker compose restart llm-service
Full reset if needed:
docker compose down
docker compose up -d
Solution 3: Database connection failed
Check PostgreSQL:
docker compose logs postgres --tail=20
# Test connection
docker compose exec postgres psql -U shannon -d shannon -c "SELECT 1;"
Solution:
# Restart database
docker compose restart postgres
# Wait for it to be ready
docker compose exec postgres pg_isready -U shannon
Task Stuck in RUNNING or QUEUED State
Symptoms:
Task never completes
Status remains RUNNING for hours
No progress updates
Diagnosis:
# Check Temporal workflows
docker compose logs temporal --tail=100
# Check orchestrator worker
docker compose logs orchestrator | grep -i workflow
# View task in Temporal UI
open http://localhost:8088
Solution 1: LLM API key invalid or quota exceeded
Check LLM service logs:
docker compose logs llm-service | grep -i "api key\|unauthorized\|quota"
Solution:
# Verify API keys in .env
grep -E "OPENAI_API_KEY|ANTHROPIC_API_KEY" .env
# Test API key
curl https://api.openai.com/v1/models \
-H "Authorization: Bearer $OPENAI_API_KEY"
# Update .env with valid key
nano .env
# Restart LLM service
docker compose restart llm-service
Solution 2: Temporal worker deadlock
Restart Temporal workers:
docker compose restart orchestrator
# Check workflow in Temporal UI
open http://localhost:8088
# Navigate to Workflows → Find your workflow → View execution history
Force workflow termination (last resort):
# In Temporal UI: Workflows → Select workflow → Terminate
Solution 3: Circuit breaker open
Check circuit breaker status:
docker compose logs orchestrator | grep -i "circuit"
Circuit breakers protect against cascading failures:
LLM Service circuit breaker
Database circuit breaker
Redis circuit breaker
Solution - Wait for automatic recovery (30-60 seconds)
Or restart services:
docker compose restart orchestrator agent-core llm-service
Budget & Cost Issues
Budget Exceeded Errors
Symptoms:
budget exceeded error
Tasks fail with cost limit errors
HTTP 429 (Too Many Requests) or 402 (Payment Required) responses
Diagnosis:
# Check budget configuration
docker compose exec orchestrator env | grep BUDGET
docker compose exec orchestrator env | grep MAX_COST
Solution 1: Increase budget limits
Edit .env:
MAX_COST_PER_REQUEST=1.00     # Increase from 0.50
MAX_TOKENS_PER_REQUEST=20000  # Increase from 10000
Restart:
docker compose restart orchestrator llm-service
Budgets are configured server-side via environment variables. The SDK does not accept per-request budget parameters.
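Conceptually, the server-side enforcement is a pre-flight guard: an estimate that exceeds the configured limits fails before any LLM call is made. A simplified illustration, not Shannon's actual code (the defaults mirror the values shown above):

```python
import os

class BudgetExceeded(Exception):
    """Raised when a request's estimate exceeds the configured limits."""

def enforce_budget(estimated_cost: float, estimated_tokens: int) -> None:
    """Reject a request whose estimate exceeds the server-side limits."""
    max_cost = float(os.environ.get("MAX_COST_PER_REQUEST", "0.50"))
    max_tokens = int(os.environ.get("MAX_TOKENS_PER_REQUEST", "10000"))
    if estimated_cost > max_cost:
        raise BudgetExceeded(f"cost {estimated_cost:.2f} > limit {max_cost:.2f}")
    if estimated_tokens > max_tokens:
        raise BudgetExceeded(f"tokens {estimated_tokens} > limit {max_tokens}")
```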
Solution 2: Use simpler execution mode
# Instead of forcing advanced mode, let the mode be auto-selected
client.submit_task(query="...")  # Mode auto-selected
# Advanced → Standard → Simple (cheapest)
Cost comparison:
Simple: 1 LLM call, $0.01-0.05
Standard: 3-5 LLM calls, $0.05-0.20
Advanced: 10+ LLM calls, $0.20-1.00+
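For rough capacity planning, the ranges above can be turned into a back-of-envelope estimator (illustrative numbers only; real costs vary with model and prompt size):

```python
# Per-task cost ranges from the comparison above (illustrative only).
COST_RANGES = {
    "simple": (0.01, 0.05),    # 1 LLM call
    "standard": (0.05, 0.20),  # 3-5 LLM calls
    "advanced": (0.20, 1.00),  # 10+ LLM calls; can exceed the upper bound
}

def estimate(mode: str, tasks: int):
    """Back-of-envelope (low, high) cost in dollars for a batch of tasks."""
    lo, hi = COST_RANGES[mode]
    return (round(lo * tasks, 2), round(hi * tasks, 2))
```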
Solution 3: Disable budget enforcement (development only)
⚠️ Warning: Only for development/testing
Edit .env:
LLM_DISABLE_BUDGETS=1  # Disable budget checks
Restart:
docker compose restart orchestrator llm-service
Slow Response Times
Symptoms:
Tasks take 2-3x longer than expected
High latency
Timeouts
Diagnosis:
# Check resource usage
docker stats
# Check for slow queries
docker compose logs postgres | grep "duration:"
# Check Redis latency
docker compose exec redis redis-cli --latency
# Check Qdrant performance
curl http://localhost:6333/metrics
Solution 1: Insufficient CPU/Memory
Check resources:
docker stats
# Look for CPU > 80% or Memory near limit
Increase Docker resources :
macOS: Docker Desktop → Resources → increase RAM to 16GB, CPUs to 6
Linux: More powerful machine or reduce concurrent workflows
Tune worker concurrency in .env:
WORKER_ACT_CRITICAL=5  # Reduce from 10
WORKER_WF_CRITICAL=3   # Reduce from 5
TOOL_PARALLELISM=2     # Reduce from 5
Solution 2: Cold start / cache misses
The first request is always slower (10-30s). Subsequent requests use caching:
LLM response cache (Redis)
Session context cache
Tool result cache
Solution: Warm up with a test request
curl -X POST http://localhost:8080/api/v1/tasks \
-H "Content-Type: application/json" \
-d '{"query": "Hello"}'
Solution 3: Database connection pool exhausted
Increase pool size in .env:
DB_MAX_OPEN_CONNS=50  # Increase from 25
DB_MAX_IDLE_CONNS=10  # Increase from 5
Restart:
docker compose restart orchestrator
Tokens > 0 but empty result
Symptoms:
Database or logs show non‑zero completion tokens, but the final result text is empty.
Complex prompts return nothing while simple prompts work.
Cause:
Some GPT‑5 chat responses return content as structured parts instead of a plain string. Older parsing could miss the text. This is fixed by routing GPT‑5 models via the Responses API and defensively normalizing content for chat responses.
Fix (Shannon ≥ 2025‑11‑05):
LLM Service routes GPT‑5 models to the Responses API and prefers output_text when available.
Chat providers normalize content by joining text parts when a list is returned.
If you upgraded from an older build, restart the LLM Service to clear cached empty responses.
Verify:
Re‑run a long, multi‑paragraph prompt. The result length should be > 0 and the session history should include the assistant message.
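The normalization described above, joining text parts when content arrives as a list instead of a plain string, can be sketched like this (hypothetical response shapes, not Shannon's actual types):

```python
def normalize_content(content) -> str:
    """Return plain text whether content is a string or a list of parts."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts = []
        for part in content:
            if isinstance(part, str):
                parts.append(part)
            elif isinstance(part, dict) and isinstance(part.get("text"), str):
                # Structured part, e.g. {"type": "text", "text": "..."}
                parts.append(part["text"])
        return "".join(parts)
    return ""  # None or unknown shape: treat as empty rather than crash
```

Parsing that only handles the plain-string case returns an empty result for the list shape even though tokens were billed, which matches the symptom above.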
High Memory Usage
Symptoms:
OOM (Out of Memory) errors
Container restarts
Swap usage high
Diagnosis:
docker stats
# Check session cache size
docker compose logs orchestrator | grep "session.*cache"
Solution: Reduce cache sizes
Edit config/shannon.yaml or set env vars:
# Reduce session cache
SESSION_CACHE_SIZE=5000   # From 10000
# Reduce history
SESSION_MAX_HISTORY=250   # From 500
# Reduce LRU caches
TOOL_CACHE_SIZE=1000      # From 5000
Restart:
docker compose restart orchestrator agent-core
Data & State Issues
Sessions Not Persisting
Symptoms:
Session context lost between requests
Agent doesn’t remember previous tasks
Diagnosis:
# Check Redis connectivity
docker compose exec orchestrator nc -zv redis 6379
# Check session data
docker compose exec redis redis-cli KEYS "session:*"
Solution 1: Redis connection failed
Check Redis status:
docker compose ps redis
docker compose logs redis --tail=20
Restart Redis:
docker compose restart redis
Test connection:
docker compose exec redis redis-cli ping
# Should return "PONG"
Solution 2: Session expired (TTL)
Sessions expire after 30 days by default. Increase the TTL in .env:
REDIS_TTL_SECONDS=7776000  # 90 days
Check session expiry:
docker compose exec redis redis-cli TTL "session:YOUR_SESSION_ID"
# Returns seconds until expiry, or -1 for no expiry
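Redis's TTL command returns -2 for a missing key and -1 for a key with no expiry, so a small helper can translate the raw value into something readable:

```python
def describe_ttl(ttl_seconds: int) -> str:
    """Interpret the integer returned by Redis TTL for a session key."""
    if ttl_seconds == -2:
        return "key does not exist (session expired or never created)"
    if ttl_seconds == -1:
        return "no expiry set"
    days, rem = divmod(ttl_seconds, 86400)
    hours = rem // 3600
    return f"expires in {days}d {hours}h"

# Example: describe_ttl(7776000) for the 90-day TTL shown above
```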
Solution 3: Use consistent session IDs
Provide a stable session_id explicitly:
session_id = "user-123-conversation"
handle1 = client.submit_task("Load data", session_id=session_id)
handle2 = client.submit_task("Analyze data", session_id=session_id)
Database Migration Errors
Symptoms:
Table doesn’t exist errors
Column not found errors
Schema version mismatch
Solution:
# Run migrations
docker compose exec orchestrator make migrate
# Or reset database (⚠️ DESTRUCTIVE)
docker compose down -v # Remove volumes
docker compose up -d
Viewing Logs
# All services
docker compose logs -f
# Specific service
docker compose logs -f orchestrator
docker compose logs -f agent-core
docker compose logs -f llm-service
# Last N lines
docker compose logs --tail=100 orchestrator
# Search logs
docker compose logs orchestrator | grep -i error
docker compose logs orchestrator | grep "task_id=YOUR_TASK_ID"
Temporal UI
Access: http://localhost:8088
Features:
View all workflows
See execution history
Replay failed workflows
Terminate stuck workflows
Time-travel debugging
Usage:
Navigate to Workflows
Search by workflow ID (task ID)
View execution history to see where it failed
Check Activity logs for detailed errors
Prometheus Metrics
# Orchestrator metrics
curl http://localhost:2112/metrics
# Agent Core metrics
curl http://localhost:2113/metrics
# LLM Service metrics
curl http://localhost:8000/metrics
Key metrics:
tasks_submitted_total
tasks_completed_total
tasks_failed_total
llm_requests_total
circuit_breaker_state
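The metrics endpoints return Prometheus text exposition format. A minimal parser for pulling out the counters listed above (simplified: it sums values across label sets and skips comment lines):

```python
def parse_metrics(text: str, names) -> dict:
    """Extract simple counter/gauge values from Prometheus text format."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and # HELP / # TYPE comments
        metric, _, value = line.rpartition(" ")
        base = metric.split("{", 1)[0]  # drop the label set if present
        if base in names:
            values[base] = values.get(base, 0.0) + float(value)
    return values

# Example: parse_metrics(body_of("http://localhost:2112/metrics"),
#                        {"tasks_submitted_total", "tasks_failed_total"})
```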
Real-time Monitoring
For real-time views of task execution:
Use the Shannon Desktop App (Runs view and Run Details) for live event streams
Use Prometheus/Grafana for metrics once configured (see Monitoring concepts)
Getting Help
Installation Guide Detailed setup instructions
API Documentation Complete API reference
GitHub Issues Report bugs or request features
Quick Reference Commands
# Health checks
curl http://localhost:8080/health
curl http://localhost:8000/health
# Service status
docker compose ps
docker stats
# Restart services
docker compose restart orchestrator
docker compose restart agent-core
docker compose restart llm-service
# View logs
docker compose logs -f orchestrator
# Full reset
docker compose down -v
docker compose up -d
# Database access
docker compose exec postgres psql -U shannon -d shannon
# Redis CLI
docker compose exec redis redis-cli
# Check environment
docker compose exec orchestrator env | grep -E "OPENAI|ANTHROPIC"