Documentation Index
Fetch the complete documentation index at: https://docs.shannon.run/llms.txt
Use this file to discover all available pages before exploring further.
Overview
Shannon’s memory system provides intelligent context retention and retrieval across user sessions, enabling agents to maintain conversational continuity and leverage historical interactions for improved responses.
Architecture
Storage Layers
PostgreSQL
- Session Context: Session-level state and metadata
- Execution Persistence: Agent and tool execution history
- Task Tracking: High-level task and workflow metadata
Redis
- Session Cache: Fast access to active session data (TTL: 3600s)
- Token Budgets: Real-time token usage tracking
- Compression State: Tracks context compression status
Qdrant (Vector Store)
- Semantic Memory: High-performance vector similarity search
- Collection Organization: task_embeddings, summaries, tool_results, document_chunks
- Hybrid Search: Combines recency and semantic relevance
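As a sketch of the hybrid-search idea (the weighting scheme, decay shape, and parameter values here are illustrative assumptions, not Shannon's actual implementation), a hybrid score can blend cosine similarity from the vector search with an exponential recency decay:

```python
def hybrid_score(similarity: float, age_seconds: float,
                 weight: float = 0.7, half_life: float = 3600.0) -> float:
    """Blend semantic relevance with recency.

    similarity: cosine similarity in [0, 1] from the vector search.
    age_seconds: how old the stored memory is.
    weight / half_life: illustrative values, not Shannon's defaults.
    """
    recency = 0.5 ** (age_seconds / half_life)  # halves every half_life
    return weight * similarity + (1.0 - weight) * recency
```

With this shape, a slightly less similar but much fresher memory can outrank a stale near-match, which is the trade-off hybrid search is after.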
Memory Types
Hierarchical Memory (Default)
Combines multiple retrieval strategies:
- Recent Memory: Last N interactions from current session
- Semantic Memory: Contextually relevant based on query similarity
- Compressed Summaries: Condensed representations of older conversations
Session Memory
Chronological retrieval of recent interactions within a session.
Agent Memory
Individual agent execution records including:
- Input queries and generated responses
- Token usage and model information
- Tool executions and results
Supervisor Memory
Strategic memory for intelligent task decomposition:
- Decomposition Patterns: Successful task breakdowns for reuse
- Strategy Performance: Aggregated metrics per strategy type
- Failure Patterns: Known failures with mitigation strategies
Configuration
Environment Variables
| Variable | Default | Description |
|---|---|---|
| QDRANT_HOST | qdrant | Qdrant server hostname |
| QDRANT_PORT | 6333 | Qdrant server port |
| REDIS_TTL_SECONDS | 3600 | Session cache TTL in seconds |
Embedding Requirements
- Default Model: text-embedding-3-small (1536 dimensions)
- Fallback Behavior: If the OpenAI API key is not configured, memory operations silently degrade and workflows continue without historical context
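The fallback behavior can be sketched as a guard around the embedding call. This is a minimal illustration of the degrade-silently pattern described above; the function name and shape are assumptions, not Shannon's actual API:

```python
import os

def embed_or_skip(texts, embed_fn):
    """Return embeddings, or None when no OpenAI key is configured.

    embed_fn stands in for a real OpenAI embeddings call; a None
    result means memory reads/writes become no-ops and the workflow
    proceeds without historical context.
    """
    if not os.environ.get("OPENAI_API_KEY"):
        return None  # silent degradation: no error raised
    return embed_fn(texts)
```

Callers treat a None result as "no memory available" rather than an error, so the surrounding workflow never fails just because embeddings are unavailable.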
Key Features
Intelligent Chunking
- Splits long answers (>2000 tokens) into manageable chunks
- 200-token overlap for context preservation
- Batch embeddings for efficiency
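The chunking rules above can be sketched as follows. Token counting itself is left to the caller (Shannon's actual tokenizer is not specified here); the thresholds mirror the documented 2000-token chunk size and 200-token overlap:

```python
def chunk_tokens(tokens, max_tokens=2000, overlap=200):
    """Split a token sequence into overlapping chunks.

    Answers at or under max_tokens are left whole; longer ones are
    split with `overlap` tokens repeated at each boundary so that
    context spanning a cut is preserved in both neighbours.
    """
    if len(tokens) <= max_tokens:
        return [tokens]
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break
    return chunks
```

Each chunk is then embedded, and batching those embedding calls is what gives the efficiency win mentioned above.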
MMR (Maximal Marginal Relevance)
- Diversity-aware reranking balances relevance with information diversity
- Default lambda=0.7 optimizes for relevant yet diverse context
- Fetches 3x requested items, then reranks for diversity
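The standard greedy MMR algorithm behind this looks like the sketch below (a textbook implementation, not Shannon's actual code): each pick maximizes `lambda * relevance - (1 - lambda) * redundancy`, where redundancy is the maximum similarity to anything already selected.

```python
import math

def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def mmr_rerank(query, candidates, k, lam=0.7):
    """Greedily select k diverse-but-relevant items.

    candidates: list of (id, vector) pairs; typically 3 * k items
    are fetched so the reranker has room to trade off diversity.
    """
    selected, remaining = [], list(range(len(candidates)))
    while remaining and len(selected) < k:
        best, best_score = None, float("-inf")
        for i in remaining:
            rel = _cosine(query, candidates[i][1])
            red = max((_cosine(candidates[i][1], candidates[j][1])
                       for j in selected), default=0.0)
            score = lam * rel - (1.0 - lam) * red
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        remaining.remove(best)
    return [candidates[i][0] for i in selected]
```

With lambda = 0.7, a near-duplicate of an already-selected item is penalized enough that a slightly less similar but more novel candidate wins instead.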
Context Compression
- Automatic triggers based on message count and token estimates
- Rate limiting prevents excessive compression
- Model-aware thresholds for different tiers
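A trigger check combining these rules might look like the sketch below. All thresholds are illustrative placeholders, not Shannon's actual model-aware values:

```python
def should_compress(msg_count, est_tokens, seconds_since_last,
                    max_messages=50, max_tokens=8000, min_interval=300):
    """Decide whether to compress context now.

    Rate limiting is checked first so repeated triggers within
    min_interval seconds are suppressed; otherwise compression fires
    when either the message count or the token estimate is exceeded.
    """
    if seconds_since_last < min_interval:
        return False  # rate limit: compressed too recently
    return msg_count >= max_messages or est_tokens >= max_tokens
```

In practice the thresholds would vary per model tier, since a larger context window tolerates more uncompressed history.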
Memory Retrieval Flow
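As a rough sketch of the flow (the priority order and budget handling are assumptions inferred from the memory types described above, not Shannon's actual code), hierarchical retrieval can be viewed as merging the three tiers under a token budget:

```python
def hierarchical_retrieve(recent, semantic, summaries, token_budget):
    """Merge memory tiers in priority order under a token budget.

    Each item is a dict with "text" and "tokens". Recent turns are
    taken first, then semantically similar memories, then compressed
    summaries of older conversation; items that would exceed the
    budget are skipped.
    """
    context, used = [], 0
    for item in recent + semantic + summaries:
        if used + item["tokens"] > token_budget:
            continue  # too large for the remaining budget
        context.append(item)
        used += item["tokens"]
    return context
```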
Privacy & Data Governance
PII Protection
- Data minimization: Store only essential fields
- Anonymization: UUIDs instead of real identities
- Automatic PII detection and redaction
Data Retention
- Conversation History: 30-day default retention
- Decomposition Patterns: 90-day retention
- User Preferences: Session-based, 24-hour expiry
Performance Optimizations
- Batch Processing: Single API call for multiple chunks (5x faster)
- Smart Caching: LRU (2048 entries) + Redis
- Payload Indexes: 50-90% faster filtering on session_id, tenant_id, user_id
- Optimized HNSW: m=16, ef_construct=100 for fast similarity search
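These settings map onto Qdrant's collections API. The request bodies below show where each parameter lives; the field names follow Qdrant's REST schema, while the endpoint paths in the comments are for orientation:

```python
# Body for PUT /collections/{name}: vector size plus HNSW tuning.
collection_body = {
    "vectors": {"size": 1536, "distance": "Cosine"},
    "hnsw_config": {"m": 16, "ef_construct": 100},
}

# Body for PUT /collections/{name}/index: one payload index per
# filterable field (repeat for session_id, tenant_id, user_id).
index_body = {"field_name": "session_id", "field_schema": "keyword"}
```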
Limitations
- Memory retrieval adds latency (mitigated by caching)
- Vector similarity may miss exact keyword matches
- Compression is lossy (preserves key points only)
- Cross-session memory requires explicit session linking
Enabling Semantic Memory
Follow the steps below to enable Shannon’s semantic memory system backed by Qdrant.
Prerequisites
Before proceeding, ensure the following are in place:
- Qdrant is running (included by default in Shannon’s docker-compose.yaml)
- OPENAI_API_KEY is set in your environment (required for the text-embedding-3-small embedding model)
Step-by-Step Setup
Enable vector memory in shannon.yaml
Add or update the vector block in your shannon.yaml configuration:
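As an illustrative sketch only (the key names below are assumptions modeled on the QDRANT_HOST and QDRANT_PORT environment variables, not Shannon's documented schema, so check your shannon.yaml reference), the block might look like:

```yaml
# Hypothetical shape of the vector block; key names are assumptions.
vector:
  enabled: true
  host: qdrant   # matches the QDRANT_HOST default
  port: 6333     # matches the QDRANT_PORT default
```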
Verify Qdrant collections
Shannon automatically creates 5 collections, all using 1536-dimensional vectors from text-embedding-3-small:
| Collection | Purpose |
|---|---|
| task_embeddings | Task result embeddings for semantic search |
| tool_results | Tool execution result embeddings |
| cases | Case library for pattern matching |
| document_chunks | Document chunk embeddings for RAG |
| summaries | Summary embeddings |
You can verify that the collections were created by querying the Qdrant REST API.
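For example, Qdrant's standard collections-listing endpoint can be queried with curl, using the default host and port from the configuration table above:

```shell
# Lists all collections; the result.collections array should contain
# the five collection names from the table above.
curl http://localhost:6333/collections
```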
Configure MMR diversity reranking
MMR (Maximal Marginal Relevance) balances relevance and diversity in retrieval results. When enabled, Shannon fetches a larger candidate pool and reranks to reduce redundancy while preserving relevance.
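A sketch of the corresponding shannon.yaml settings, with the caveat that these key names are assumptions rather than Shannon's documented schema:

```yaml
# Hypothetical MMR settings; key names are assumptions.
vector:
  mmr_enabled: true
  mmr_lambda: 0.7   # 1.0 = pure relevance, 0.0 = pure diversity
```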
A mmr_lambda of 0.7 is a good default: it strongly favours relevance while still filtering out near-duplicate results.
Next Steps
- Architecture Overview: System architecture
- Sessions API: Session management