
Overview

Shannon exposes an OpenAI-compatible API layer that lets you use existing OpenAI SDKs, tools, and integrations to interact with Shannon’s agent orchestration platform. The compatibility layer translates OpenAI chat completion requests into Shannon tasks and streams the results back in OpenAI format. This means you can point the OpenAI Python or Node.js SDK at Shannon and get access to multi-agent research, tool use, and deep analysis — all through a familiar interface.
The OpenAI-compatible API is designed as a drop-in layer for existing tooling. For full Shannon features (skills, session workspaces, research strategies, task control), use the native /api/v1/tasks endpoints.

Endpoints

| Method | Path | Description |
|---|---|---|
| POST | /v1/chat/completions | Create a chat completion (streaming and non-streaming) |
| GET | /v1/models | List available models |
| GET | /v1/models/{model} | Get model details |
Base URL: http://localhost:8080 (development)

Authentication

The OpenAI-compatible endpoints use the same authentication as other Shannon APIs.
# Bearer token (OpenAI SDK default)
Authorization: Bearer sk_your_api_key

# Or X-API-Key header
X-API-Key: sk_your_api_key
Development Default: Authentication is disabled when GATEWAY_SKIP_AUTH=1 is set. Enable authentication for production deployments.
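If you prefer the X-API-Key header when using the OpenAI Python SDK, you can attach it with the SDK's default_headers option; a minimal sketch:

from openai import OpenAI

# The OpenAI SDK sends "Authorization: Bearer <api_key>" by default. To use
# the X-API-Key header instead, attach it via default_headers (a standard
# OpenAI SDK option) and pass a placeholder api_key to satisfy the client.
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="not-needed",
    default_headers={"X-API-Key": "sk_your_api_key"},
)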

Available Models

Shannon maps model names to different workflow modes and strategies. Select a model to control how your request is processed.
| Model | Workflow Mode | Description | Default Max Tokens |
|---|---|---|---|
| shannon-chat | Simple | General chat completion (default) | 4096 |
| shannon-standard-research | Research | Balanced research with moderate depth | 4096 |
| shannon-deep-research | Research | Deep research with iterative refinement | 8192 |
| shannon-quick-research | Research | Fast research for simple queries | 4096 |
| shannon-complex | Supervisor | Multi-agent orchestration for complex tasks | 8192 |
If no model is specified, shannon-chat is used.
Models can be customized via config/openai_models.yaml. See the Shannon configuration documentation for details on adding custom models.

Chat Completions

POST /v1/chat/completions

Request Body

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | No | Model name (defaults to shannon-chat) |
| messages | array | Yes | Array of message objects |
| stream | boolean | No | Enable streaming (default: false) |
| max_tokens | integer | No | Maximum tokens for the response (capped at 16384) |
| temperature | number | No | Sampling temperature, 0-2 (default: 0.7) |
| top_p | number | No | Nucleus sampling parameter |
| n | integer | No | Number of completions (only 1 is supported) |
| stop | array | No | Stop sequences |
| presence_penalty | number | No | Presence penalty, -2.0 to 2.0 |
| frequency_penalty | number | No | Frequency penalty, -2.0 to 2.0 |
| user | string | No | End-user identifier for tracking and session derivation |
| stream_options | object | No | Streaming options (see below) |
Message Object:
| Field | Type | Required | Description |
|---|---|---|---|
| role | string | Yes | system, user, or assistant |
| content | string | Yes | Message content (text only) |
| name | string | No | Optional name for the participant |
Stream Options:
| Field | Type | Description |
|---|---|---|
| include_usage | boolean | Include token usage in the final streaming chunk |

How Messages Are Processed

Shannon translates the OpenAI messages array into a Shannon task:
  • Last user message becomes the task query
  • First system message becomes the system prompt
  • All other messages (excluding system and last user) become conversation history
  • The model name determines the workflow mode and research strategy
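A rough Python sketch may make this mapping concrete (the query, system_prompt, and history field names on the Shannon side are illustrative, not the actual internal schema):

def to_shannon_task(messages: list[dict]) -> dict:
    # The last user message becomes the task query
    last_user = max(i for i, m in enumerate(messages) if m["role"] == "user")
    # The first system message (if any) becomes the system prompt
    system_prompt = next(
        (m["content"] for m in messages if m["role"] == "system"), None
    )
    # Everything else, excluding system messages and the last user message,
    # becomes conversation history
    history = [
        m for i, m in enumerate(messages)
        if m["role"] != "system" and i != last_user
    ]
    return {
        "query": messages[last_user]["content"],
        "system_prompt": system_prompt,
        "history": history,
    }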

Non-Streaming Response

{
  "id": "chatcmpl-20250120100000a1b2c3d4",
  "object": "chat.completion",
  "created": 1737367200,
  "model": "shannon-chat",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The response text from Shannon..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}
Non-streaming requests have a 35-minute timeout to accommodate deep research and long-running workflows. For very long tasks, prefer streaming mode.
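If you do issue long non-streaming requests, make sure your client timeout is at least as generous as the server's; a sketch with the OpenAI Python SDK, where 2100 seconds mirrors the 35-minute server-side limit:

from openai import OpenAI

# Allow up to 35 minutes (2100 s) for non-streaming deep-research requests,
# matching the server-side timeout.
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk_your_api_key",
    timeout=2100,
)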

Streaming Response

When stream: true, the response is delivered as Server-Sent Events.

First chunk (includes the role):
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1737367200,"model":"shannon-chat","choices":[{"index":0,"delta":{"role":"assistant","content":"The"},"finish_reason":null}]}
Content chunks:
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1737367200,"model":"shannon-chat","choices":[{"index":0,"delta":{"content":" response text"},"finish_reason":null}]}
Final chunk (with finish reason):
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1737367200,"model":"shannon-chat","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":{"prompt_tokens":25,"completion_tokens":150,"total_tokens":175}}
Stream terminator:
data: [DONE]
Usage data in the final chunk is only included when stream_options.include_usage is set to true.

Shannon Extensions

shannon_events Field

During streaming, Shannon extends the standard OpenAI chunk format with a shannon_events field. This field carries agent lifecycle events that provide visibility into what Shannon’s agents are doing behind the scenes.
{
  "id": "chatcmpl-...",
  "object": "chat.completion.chunk",
  "created": 1737367200,
  "model": "shannon-deep-research",
  "choices": [
    {
      "index": 0,
      "delta": {}
    }
  ],
  "shannon_events": [
    {
      "type": "AGENT_STARTED",
      "agent_id": "researcher_1",
      "message": "Starting research on query...",
      "timestamp": 1737367201,
      "payload": {}
    }
  ]
}
ShannonEvent fields:
| Field | Type | Description |
|---|---|---|
| type | string | Event type (see list below) |
| agent_id | string | Agent identifier |
| message | string | Human-readable description |
| timestamp | integer | Unix timestamp |
| payload | object | Additional event-specific data |
Forwarded event types:
| Category | Events |
|---|---|
| Workflow | WORKFLOW_STARTED, WORKFLOW_PAUSING, WORKFLOW_PAUSED, WORKFLOW_RESUMED, WORKFLOW_CANCELLING, WORKFLOW_CANCELLED |
| Agent | AGENT_STARTED, AGENT_COMPLETED, AGENT_THINKING |
| Tool | TOOL_INVOKED, TOOL_OBSERVATION |
| Progress | PROGRESS, DATA_PROCESSING, WAITING, ERROR_RECOVERY |
| Team | TEAM_RECRUITED, TEAM_RETIRED, TEAM_STATUS, ROLE_ASSIGNED, DELEGATION, DEPENDENCY_SATISFIED |
| Budget & Approval | BUDGET_THRESHOLD, APPROVAL_REQUESTED, APPROVAL_DECISION |
Standard OpenAI clients ignore unknown fields, so the shannon_events field is safe to use with any OpenAI-compatible tooling. Parse it when you want richer progress information.

X-Session-ID Header

Shannon supports multi-turn conversations via the X-Session-ID request header. When provided, Shannon maintains conversation context across requests.
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk_your_key" \
  -H "X-Session-ID: my-conversation-1" \
  -H "Content-Type: application/json" \
  -d '{"model": "shannon-chat", "messages": [{"role": "user", "content": "Hello"}]}'
If no X-Session-ID is provided, Shannon derives a session ID from the conversation content (hash of system message + first user message) or from the user field. The response includes X-Session-ID and X-Shannon-Session-ID headers when a new session is created or a collision is detected.
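To capture the assigned session ID from Python, the OpenAI SDK's with_raw_response accessor exposes response headers, and extra_headers lets you send the ID back on later turns; a minimal sketch:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk_your_api_key")

# with_raw_response exposes HTTP headers alongside the parsed completion
raw = client.chat.completions.with_raw_response.create(
    model="shannon-chat",
    messages=[{"role": "user", "content": "Hello"}],
)
session_id = raw.headers.get("X-Shannon-Session-ID")
completion = raw.parse()  # the usual ChatCompletion object

# Reuse the session on follow-up turns via extra_headers
followup = client.chat.completions.create(
    model="shannon-chat",
    messages=[{"role": "user", "content": "What did I just say?"}],
    extra_headers={"X-Session-ID": session_id},
)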

Rate Limiting

Rate limits are enforced per API key, per model. The default limits are:
  • 60 requests per minute per model
  • 200,000 tokens per minute per model
Rate limit headers included in every response:
| Header | Description |
|---|---|
| X-RateLimit-Limit-Requests | Maximum requests per minute |
| X-RateLimit-Remaining-Requests | Remaining requests in the current window |
| X-RateLimit-Limit-Tokens | Maximum tokens per minute |
| X-RateLimit-Remaining-Tokens | Remaining tokens in the current window |
| X-RateLimit-Reset-Requests | Time until the request limit resets |
| Retry-After | Seconds to wait before retrying (on 429) |
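A simple way to honor these limits is to back off on 429 responses using the Retry-After header; a sketch with httpx (the endpoint and header names are as documented above):

import time
import httpx

def post_with_backoff(payload: dict, max_retries: int = 5) -> httpx.Response:
    for attempt in range(max_retries):
        resp = httpx.post(
            "http://localhost:8080/v1/chat/completions",
            headers={"Authorization": "Bearer sk_your_api_key"},
            json=payload,
            timeout=120,
        )
        if resp.status_code != 429:
            return resp
        # Retry-After is set on 429 responses; fall back to 1 s if absent
        wait = float(resp.headers.get("Retry-After", 1))
        time.sleep(wait)
    raise RuntimeError("still rate limited after retries")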

Error Handling

Errors follow the OpenAI error response format:
{
  "error": {
    "message": "Model 'invalid-model' not found. Use GET /v1/models to list available models.",
    "type": "invalid_request_error",
    "code": "model_not_found"
  }
}
Error types:
| HTTP Status | Type | Code | Description |
|---|---|---|---|
| 400 | invalid_request_error | invalid_request | Malformed request or missing required fields |
| 401 | authentication_error | invalid_api_key | Invalid or missing API key |
| 403 | permission_error | invalid_request | Insufficient permissions |
| 404 | invalid_request_error | model_not_found | Model does not exist |
| 429 | rate_limit_error | rate_limit_exceeded | Rate limit exceeded |
| 500 | server_error | internal_error | Internal server error |
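Because the format matches OpenAI's, the OpenAI Python SDK surfaces these as its standard exception types; for example:

import openai

client = openai.OpenAI(base_url="http://localhost:8080/v1", api_key="sk_your_api_key")

try:
    client.chat.completions.create(
        model="invalid-model",
        messages=[{"role": "user", "content": "Hello"}],
    )
except openai.NotFoundError as e:        # 404: model_not_found
    print(e.message)
except openai.RateLimitError:            # 429: rate_limit_exceeded
    pass                                 # back off and retry
except openai.AuthenticationError:       # 401: invalid_api_key
    raise
except openai.APIStatusError as e:       # any other non-2xx response
    print(e.status_code, e.message)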

List Models

GET /v1/models

Returns all available Shannon models.
curl http://localhost:8080/v1/models \
  -H "Authorization: Bearer sk_your_key"
Response:
{
  "object": "list",
  "data": [
    {
      "id": "shannon-chat",
      "object": "model",
      "created": 1737367200,
      "owned_by": "shannon"
    },
    {
      "id": "shannon-deep-research",
      "object": "model",
      "created": 1737367200,
      "owned_by": "shannon"
    }
  ]
}
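With the OpenAI Python SDK, the equivalent call is client.models.list():

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk_your_api_key")

# Iterates over the "data" array of the list response
for model in client.models.list():
    print(model.id)  # e.g. shannon-chat, shannon-deep-research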

GET /v1/models/{model}

Returns details for a specific model. The model description is included in the X-Model-Description response header.
# -i prints the response headers, including X-Model-Description
curl -i http://localhost:8080/v1/models/shannon-deep-research \
  -H "Authorization: Bearer sk_your_key"

Usage with OpenAI SDKs

Python

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk_your_api_key",  # or "not-needed" if auth is disabled
)

# Non-streaming
response = client.chat.completions.create(
    model="shannon-chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is Shannon?"}
    ],
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="shannon-deep-research",
    messages=[
        {"role": "user", "content": "Analyze the impact of AI on healthcare"}
    ],
    stream=True,
    stream_options={"include_usage": True},
)

for chunk in stream:
    # The final usage chunk (from include_usage) has an empty choices list
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

    # Access Shannon-specific events (if available)
    if hasattr(chunk, "shannon_events") and chunk.shannon_events:
        for event in chunk.shannon_events:
            print(f"\n[{event['type']}] {event.get('message', '')}")

Node.js / TypeScript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:8080/v1",
  apiKey: "sk_your_api_key",
});

// Non-streaming
const response = await client.chat.completions.create({
  model: "shannon-chat",
  messages: [
    { role: "user", content: "What is Shannon?" }
  ],
});
console.log(response.choices[0].message.content);

// Streaming
const stream = await client.chat.completions.create({
  model: "shannon-deep-research",
  messages: [
    { role: "user", content: "Analyze the impact of AI on healthcare" }
  ],
  stream: true,
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) {
    process.stdout.write(content);
  }
}

curl

# Non-streaming
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk_your_api_key" \
  -d '{
    "model": "shannon-chat",
    "messages": [
      {"role": "user", "content": "What is Shannon?"}
    ]
  }'

# Streaming
curl -N -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk_your_api_key" \
  -d '{
    "model": "shannon-deep-research",
    "messages": [
      {"role": "user", "content": "Analyze the impact of AI on healthcare"}
    ],
    "stream": true,
    "stream_options": {"include_usage": true}
  }'

# With session ID for multi-turn
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk_your_api_key" \
  -H "X-Session-ID: my-session-1" \
  -d '{
    "model": "shannon-chat",
    "messages": [
      {"role": "system", "content": "You are a data analyst."},
      {"role": "user", "content": "Summarize Q4 revenue trends"}
    ]
  }'

Streaming with Shannon Events

To build rich UIs that show agent progress, parse the shannon_events field from streaming chunks:
import json
import httpx

def stream_with_events(query: str, model: str = "shannon-deep-research"):
    # Use httpx.stream() so the body arrives incrementally; a plain
    # httpx.post() would buffer the entire response before iter_lines()
    # yielded anything.
    with httpx.stream(
        "POST",
        "http://localhost:8080/v1/chat/completions",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer sk_your_api_key",
        },
        json={
            "model": model,
            "messages": [{"role": "user", "content": query}],
            "stream": True,
        },
        timeout=None,
    ) as response:
        for line in response.iter_lines():
            # Only "data: " lines carry JSON; this also skips SSE comments
            # such as ": keepalive"
            if not line.startswith("data: "):
                continue
            data = line[6:]
            if data == "[DONE]":
                break

            chunk = json.loads(data)

            # Print content deltas
            choices = chunk.get("choices") or []
            delta = choices[0].get("delta", {}) if choices else {}
            if delta.get("content"):
                print(delta["content"], end="", flush=True)

            # Print Shannon agent events
            for event in chunk.get("shannon_events", []):
                print(f"\n  [{event['type']}] {event.get('message', '')}")

stream_with_events("Research the latest developments in quantum computing")

Heartbeat and Keepalive

During streaming, Shannon sends SSE comment lines (: keepalive) every 30 seconds to keep the connection alive. Conforming SSE clients ignore these automatically. This prevents load balancers and proxies from closing idle connections during long-running research tasks.
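If you hand-roll an SSE parser rather than accepting only data: lines (as the stream_with_events example above does), skip comment lines explicitly before JSON decoding; a small helper sketch:

def iter_sse_data(lines):
    """Yield the payload of each SSE data line, ignoring comments and blanks."""
    for line in lines:
        if not line or line.startswith(":"):
            continue  # blank line or SSE comment such as ": keepalive"
        if line.startswith("data: "):
            yield line[6:]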

Limitations

The following OpenAI API features are not supported:
| Feature | Status |
|---|---|
| Function calling / tools | Not supported |
| Vision / image inputs | Not supported (text content only) |
| Audio inputs/outputs | Not supported |
| Embeddings API (/v1/embeddings) | Not available |
| Fine-tuning API | Not available |
| response_format (JSON mode) | Not supported |
| logprobs | Not supported |
| seed | Not supported |
| n > 1 (multiple completions) | Not supported |
The messages[].content field only accepts plain text strings. Multipart content (arrays with image_url objects) is not supported.
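Concretely, only the plain-string form is accepted; the multipart array form used for OpenAI vision requests is not:

# Accepted: plain text content
ok = {"role": "user", "content": "Describe the dataset"}

# Not accepted: multipart content array (OpenAI vision-style input)
not_ok = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image"},
        {"type": "image_url", "image_url": {"url": "https://example.com/img.png"}},
    ],
}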

Differences from Standard OpenAI API

| Aspect | OpenAI API | Shannon OpenAI-Compatible API |
|---|---|---|
| Models | GPT-4, GPT-3.5, etc. | shannon-chat, shannon-deep-research, etc. |
| Processing | Single LLM call | Multi-agent orchestration, tool use, research |
| Latency | Seconds | Seconds to minutes (depending on model/strategy) |
| Streaming events | Content only | Content + shannon_events agent lifecycle |
| Session management | Not built-in | X-Session-ID header with server-side context |
| Rate limits | Per-organization | Per API key, per model |
| Finish reasons | stop, length, tool_calls, content_filter | stop only |