Monitoring documentation is under development. Core concepts are outlined below.
Overview
Shannon provides comprehensive monitoring and observability features to track task execution, system performance, and resource usage in production environments.
Monitoring Capabilities
Task Monitoring
Track individual task execution:
- Execution status and progress
- Resource consumption
- Error rates and types
- Latency metrics
- Cost tracking
System Monitoring
Monitor Shannon infrastructure:
- Service health status
- API endpoint latency
- Queue depths
- Agent availability
- LLM provider status
Metrics
Task Metrics
| Metric | Description | Unit |
|---|---|---|
| task.latency | End-to-end task completion time | ms |
| task.cost | Total cost per task | USD |
| task.tokens.input | Input tokens consumed | count |
| task.tokens.output | Output tokens generated | count |
| task.iterations | Number of agent iterations | count |
| task.tools.invocations | Tool usage count | count |
System Metrics
| Metric | Description | Unit |
|---|---|---|
| api.latency | API response time | ms |
| api.requests | Request rate | req/s |
| api.errors | Error rate | errors/s |
| queue.depth | Tasks waiting | count |
| agents.active | Active agent count | count |
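
Rate-style metrics such as api.requests (req/s) and api.errors (errors/s) are commonly derived from cumulative counters sampled at two points in time. The following is a minimal sketch of that arithmetic, not Shannon's implementation; the `CounterSnapshot` type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CounterSnapshot:
    """Hypothetical point-in-time reading of cumulative API counters."""
    timestamp_s: float  # seconds since an arbitrary reference point
    requests: int       # total requests served so far
    errors: int         # total error responses so far


def rates(prev: CounterSnapshot, curr: CounterSnapshot) -> dict[str, float]:
    """Derive per-second rates from two snapshots taken in time order."""
    elapsed = curr.timestamp_s - prev.timestamp_s
    if elapsed <= 0:
        raise ValueError("snapshots must be in increasing time order")
    return {
        "api.requests": (curr.requests - prev.requests) / elapsed,
        "api.errors": (curr.errors - prev.errors) / elapsed,
    }


if __name__ == "__main__":
    before = CounterSnapshot(timestamp_s=0.0, requests=1000, errors=10)
    after = CounterSnapshot(timestamp_s=60.0, requests=1600, errors=13)
    # 600 requests and 3 errors over 60 seconds
    print(rates(before, after))
```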
Health Checks
API Health
Component Health
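The health endpoints and their response format are not shown in this section. As an illustration only, one common way to roll per-component states up into a single status is to report the worst state observed. The `Health` enum and the component names below are assumptions made for this sketch, not Shannon identifiers.

```python
from enum import IntEnum


class Health(IntEnum):
    """Hypothetical health states, ordered so a larger value is worse."""
    HEALTHY = 0
    DEGRADED = 1
    UNHEALTHY = 2


def overall_health(components: dict[str, Health]) -> Health:
    """Return the worst state across all components.

    An empty mapping is treated as unhealthy, since nothing reported in.
    """
    if not components:
        return Health.UNHEALTHY
    return max(components.values())


if __name__ == "__main__":
    status = {
        "api": Health.HEALTHY,
        "queue": Health.HEALTHY,
        "agents": Health.DEGRADED,
        "llm_provider": Health.HEALTHY,
    }
    print(overall_health(status).name)  # prints DEGRADED
```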
Logging
Log Levels
Shannon uses structured logging with the following levels:
- DEBUG: Detailed diagnostic information
- INFO: General operational messages
- WARN: Warning conditions
- ERROR: Error conditions
- FATAL: Critical failures
Log Format
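The exact fields Shannon emits are not listed in this section. To show what structured logging with these levels can look like, here is a sketch of a one-JSON-object-per-line formatter built on Python's standard `logging` module. The field names (`timestamp`, `level`, `logger`, `message`, `task_id`) are assumptions, and the mapping of Python's WARNING and CRITICAL to WARN and FATAL is done only to match the level names used above.

```python
import json
import logging
from datetime import datetime, timezone

# Python names these levels WARNING and CRITICAL; rename them for output.
LEVEL_ALIASES = {"WARNING": "WARN", "CRITICAL": "FATAL"}


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object (field names are assumed)."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": LEVEL_ALIASES.get(record.levelname, record.levelname),
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Context passed as extra={"task_id": ...} becomes a record attribute.
        task_id = getattr(record, "task_id", None)
        if task_id is not None:
            entry["task_id"] = task_id
        return json.dumps(entry)


def make_logger(name: str, level: int = logging.INFO) -> logging.Logger:
    """Build a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = make_logger("example")
    log.warning("queue depth %d", 7, extra={"task_id": "task-123"})
```

Keeping one JSON object per line lets log shippers parse entries without multi-line handling.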
Dashboards
Task Dashboard
Monitor task execution in real-time:
- Active tasks
- Completion rate
- Average latency
- Error rate
- Cost per hour
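
Most of the dashboard figures above are simple aggregates over the tasks that finished in a reporting window. The sketch below shows the arithmetic under that assumption; `TaskRecord` and the keys of the returned dictionary are hypothetical names, and active tasks are omitted because they are a live count rather than an aggregate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRecord:
    """Hypothetical record of one task that finished in the window."""
    latency_ms: float
    cost_usd: float
    succeeded: bool


def summarize(tasks: list[TaskRecord], window_hours: float) -> dict[str, float]:
    """Compute completion rate, error rate, mean latency, and hourly cost."""
    if not tasks:
        raise ValueError("no tasks in window")
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    total = len(tasks)
    succeeded = sum(1 for t in tasks if t.succeeded)
    return {
        "completion_rate": succeeded / total,
        "error_rate": (total - succeeded) / total,
        "avg_latency_ms": sum(t.latency_ms for t in tasks) / total,
        "cost_per_hour_usd": sum(t.cost_usd for t in tasks) / window_hours,
    }


if __name__ == "__main__":
    window = [
        TaskRecord(latency_ms=1000.0, cost_usd=0.25, succeeded=True),
        TaskRecord(latency_ms=2000.0, cost_usd=0.50, succeeded=True),
        TaskRecord(latency_ms=3000.0, cost_usd=0.25, succeeded=False),
        TaskRecord(latency_ms=2000.0, cost_usd=0.50, succeeded=True),
    ]
    print(summarize(window, window_hours=0.5))
```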
System Dashboard
Track system health:
- Service status
- Resource utilization
- Queue lengths
- Provider availability
Alerting
Alert Types
Configure alerts for:
- Task failures
- Budget exceeded
- High latency
- Service degradation
- Rate limiting
Alert Configuration
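Shannon's alert configuration syntax is not shown in this section. Conceptually, alerts such as high latency or budget exceeded compare a metric against a threshold. The sketch below illustrates that idea only; `AlertRule`, the rule names, and every threshold value are invented for the example and are not defaults.

```python
import operator
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AlertRule:
    """Hypothetical threshold rule: fire when compare(value, threshold) holds."""
    name: str
    metric: str
    compare: Callable[[float, float], bool]
    threshold: float


def firing(rules: list[AlertRule], metrics: dict[str, float]) -> list[str]:
    """Return the names of rules whose condition holds for current metrics.

    Rules that refer to a metric with no current value are skipped.
    """
    return [
        rule.name
        for rule in rules
        if rule.metric in metrics
        and rule.compare(metrics[rule.metric], rule.threshold)
    ]


# Illustrative thresholds only; choose values that fit your workload.
RULES = [
    AlertRule("high_latency", "task.latency", operator.gt, 30_000.0),
    AlertRule("budget_exceeded", "task.cost", operator.gt, 5.0),
    AlertRule("queue_backlog", "queue.depth", operator.ge, 100.0),
]

if __name__ == "__main__":
    current = {"task.latency": 45_000.0, "task.cost": 1.2, "queue.depth": 100.0}
    print(firing(RULES, current))  # ['high_latency', 'queue_backlog']
```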
Prometheus Integration
Export metrics to Prometheus (example scrape targets for local dev):
Available Metrics
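The metric names Shannon actually exports are not listed in this section. For orientation, a Prometheus scrape returns plain text in the exposition format, and Prometheus metric names cannot contain dots. The sketch below renders sample values in that format, assuming dotted names are mapped to underscores and every metric is a gauge; both choices are assumptions for illustration, not Shannon's naming scheme.

```python
import re


def prom_name(metric: str) -> str:
    """Map a dotted metric name to a valid Prometheus metric name."""
    name = re.sub(r"[^a-zA-Z0-9_:]", "_", metric)
    if name and name[0].isdigit():
        name = "_" + name  # names must not start with a digit
    return name


def render(samples: dict[str, tuple[str, float]]) -> str:
    """Render samples as Prometheus text exposition format.

    `samples` maps a metric name to (help text, current value).
    """
    lines = []
    for metric, (help_text, value) in samples.items():
        name = prom_name(metric)
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render({
        "queue.depth": ("Tasks waiting", 12.0),
        "agents.active": ("Active agent count", 4.0),
    }), end="")
```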
Grafana Dashboards
Pre-built Grafana dashboards for:
- Task analytics
- Cost tracking
- Performance monitoring
- Error analysis
OpenTelemetry
Shannon supports OpenTelemetry for distributed tracing:
Best Practices
- Set up alerts for critical metrics
- Monitor costs to prevent budget overruns
- Track error patterns to identify issues
- Use distributed tracing for debugging
- Archive logs for compliance
- Create custom dashboards for your use case
- Implement SLOs for reliability
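
As a worked example of the SLO practice above: an availability target implies an error budget, and tracking how much of it remains shows when reliability work should take priority. This is a generic sketch of the arithmetic; the function name and the 99% target are illustrative assumptions, not values prescribed by Shannon.

```python
def error_budget_remaining(slo_target: float, total: int, failed: int) -> float:
    """Return the unspent fraction of the error budget.

    1.0 means no budget used, 0.0 means fully spent, and a negative
    value means the SLO has been violated for this window.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1, exclusive")
    if total <= 0:
        raise ValueError("total must be positive")
    allowed_failures = (1 - slo_target) * total
    return 1 - failed / allowed_failures


if __name__ == "__main__":
    # With a 99% target over 10,000 tasks, about 100 failures are allowed.
    remaining = error_budget_remaining(slo_target=0.99, total=10_000, failed=25)
    print(round(remaining, 4))  # 0.75
```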
Debugging
Enable Debug Logging
Trace Requests
Use distributed tracing via OpenTelemetry, or increase logging verbosity in the relevant services. For exporter setup, refer to the configuration of your observability stack (for example, Jaeger or Tempo).
Next Steps
- Troubleshooting: Common issues and solutions
- Cost Control: Manage and optimize costs