Chat Completions
Generate agent responses with Stratus planning and LLM execution.

Endpoint

POST https://api.stratus.run/v1/chat/completions

Request Format

Stratus follows the OpenAI Chat Completions API format:
interface ChatCompletionRequest {
  model: string;              // Required: Stratus model name
  messages: Message[];        // Required: Conversation history
  temperature?: number;       // Optional: 0-2, default 1.0
  max_tokens?: number;        // Optional: Max completion tokens
  top_p?: number;            // Optional: Nucleus sampling
  stream?: boolean;          // Optional: Stream response
  stop?: string[];           // Optional: Stop sequences
}

// Optional inline LLM provider key headers (take priority over vault keys and Formation pool)
// X-OpenAI-Key: sk-proj-...
// X-Anthropic-Key: sk-ant-...
// X-Google-Key: AIza...
// X-OpenRouter-Key: sk-or-...

interface Message {
  role: 'system' | 'user' | 'assistant' | 'tool' | 'developer';
  content: string | Array<object> | null;
  tool_calls?: Array<object>;
  tool_call_id?: string;
}
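
The inline provider-key headers above can be attached with the OpenAI SDK's `defaultHeaders` client option. A minimal sketch (the environment variable names are placeholders):

```typescript
import OpenAI from 'openai';

// Sketch: send your own LLM provider key inline on every request.
// Inline keys take priority over vault keys and the Formation pool.
const client = new OpenAI({
  baseURL: 'https://api.stratus.run/v1',
  apiKey: process.env.STRATUS_API_KEY,
  defaultHeaders: {
    'X-OpenAI-Key': process.env.OPENAI_API_KEY ?? ''
  }
});
```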

Example Request

TypeScript

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.stratus.run/v1',
  apiKey: process.env.STRATUS_API_KEY
});

const response = await client.chat.completions.create({
  model: 'stratus-x1ac-small-gpt-4o',
  messages: [
    {
      role: 'system',
      content: 'Current page: Google homepage with search box visible.'
    },
    {
      role: 'user',
      content: 'Find hotels in NYC for December 15-18, 2024'
    }
  ],
  temperature: 0.7,
  max_tokens: 1000
});

console.log(response.choices[0].message.content);

Python

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.stratus.run/v1",
    api_key=os.environ["STRATUS_API_KEY"]
)

response = client.chat.completions.create(
    model="stratus-x1ac-small-gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "Current page: Google homepage with search box visible."
        },
        {
            "role": "user",
            "content": "Find hotels in NYC for December 15-18, 2024"
        }
    ],
    temperature=0.7,
    max_tokens=1000
)

print(response.choices[0].message.content)

cURL

curl https://api.stratus.run/v1/chat/completions \
  -H "Authorization: Bearer stratus_sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "stratus-x1ac-small-gpt-4o",
    "messages": [
      {
        "role": "system",
        "content": "Current page: Google homepage with search box visible."
      },
      {
        "role": "user",
        "content": "Find hotels in NYC for December 15-18, 2024"
      }
    ],
    "temperature": 0.7,
    "max_tokens": 1000
  }'

Response Format

interface ChatCompletionResponse {
  id: string;                    // Completion ID
  object: 'chat.completion';     // Object type
  created: number;               // Unix timestamp
  model: string;                 // Model used
  choices: Choice[];             // Generated completions
  usage: Usage;                  // Token usage
  stratus?: StratusMetadata;     // Stratus-specific metadata
}

interface Choice {
  index: number;
  message: Message;
  finish_reason: 'stop' | 'length' | 'tool_calls' | 'content_filter';
}

interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

interface StratusMetadata {
  stratus_model: string;                    // e.g., "x1ac-base"
  execution_llm: string;                    // e.g., "gpt-4o"
  action_sequence?: string[];               // Planned action names
  predicted_state_changes?: number[];       // Per-step embedding norm magnitude
  confidence_labels?: string[];             // "High" | "Medium" | "Low" per step
  overall_confidence?: number;              // Top-1 action softmax probability (0-1)
  steps_to_goal?: number;                   // Length of action_sequence
  planning_time_ms?: number;                // Time spent in world model
  execution_time_ms?: number;               // Time spent in LLM
  total_steps_executed?: number;            // Number of LLM calls in agentic loop
  execution_trace?: Array<{                 // Per-step trace
    step: number;
    action: string;
    response_summary: string;
  }>;
  brain_signal?: {                          // Policy head recommendation
    action_type: string;                    // Best action (e.g., "close_deal")
    confidence: number;                     // Softmax probability within available set
    plan_ahead: string[];                   // Predicted next 1-2 actions (lookahead)
    simulation_confirmed: boolean;          // World model validated this action
    goal_proximity?: number;                // Cosine sim of current state vs goal
  };
  confidence?: number;                      // Legacy — use overall_confidence
  key_source?: 'user' | 'formation';        // Where the LLM key came from
  formation_markup_applied?: number | null;  // 0.25 when Formation pool used; null for BYOK
}

Example Response

{
  "id": "stratus-chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1702500000,
  "model": "stratus-x1ac-small-gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I'll help you find hotels in NYC. Let me search for that.\n\n**Action:** Type 'NYC hotels December 15-18 2024' in the search box\n**Next:** Click the search button"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 85,
    "completion_tokens": 42,
    "total_tokens": 127
  },
  "stratus": {
    "stratus_model": "x1ac-base",
    "execution_llm": "gpt-4o",
    "action_sequence": ["type", "click", "scroll", "click"],
    "predicted_state_changes": [0.42, 0.38, 0.15, 0.51],
    "confidence_labels": ["High", "High", "Medium", "High"],
    "overall_confidence": 0.92,
    "steps_to_goal": 4,
    "planning_time_ms": 45,
    "execution_time_ms": 890,
    "brain_signal": {
      "action_type": "type",
      "confidence": 0.94,
      "plan_ahead": ["click", "scroll"],
      "simulation_confirmed": true,
      "goal_proximity": 0.73
    },
    "key_source": "formation",
    "formation_markup_applied": 0.25
  }
}
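
Because the `stratus` block and most of its fields are optional, read them defensively. A small sketch (the helper name is ours, not part of any SDK):

```typescript
// Sketch: safely summarize optional Stratus planning metadata.
interface StratusLite {
  action_sequence?: string[];
  overall_confidence?: number;
}

function summarizePlan(response: { stratus?: StratusLite }): string {
  const s = response.stratus;
  if (!s?.action_sequence) return 'no plan metadata';
  const conf = s.overall_confidence?.toFixed(2) ?? 'n/a';
  return `${s.action_sequence.length} steps (confidence ${conf})`;
}

// With the example response above this prints "4 steps (confidence 0.92)".
console.log(summarizePlan({
  stratus: {
    action_sequence: ['type', 'click', 'scroll', 'click'],
    overall_confidence: 0.92
  }
}));
```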

State and Goal Convention

Stratus works best when you structure messages as follows.

System message = Current State. Describe where the agent currently is:
{
  "role": "system",
  "content": "Current page: Google homepage. Search box visible and focused."
}

User message = Goal. Describe what the agent should achieve:
{
  "role": "user",
  "content": "Find hotels in NYC for December 15-18"
}
This convention helps Stratus:
  • Encode current state accurately
  • Plan toward the goal
  • Predict state transitions
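
The convention can be wrapped in a tiny helper so every request follows it. A sketch (the helper is ours, not part of any SDK):

```typescript
// Hypothetical helper that applies the state/goal convention
// when assembling the messages array for a request.
type ChatMessage = { role: 'system' | 'user'; content: string };

function buildStateGoalMessages(currentState: string, goal: string): ChatMessage[] {
  return [
    { role: 'system', content: currentState }, // current state description
    { role: 'user', content: goal }            // goal description
  ];
}

const messages = buildStateGoalMessages(
  'Current page: Google homepage. Search box visible and focused.',
  'Find hotels in NYC for December 15-18'
);
```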

Parameters

model (required)

Combined Stratus + LLM model name.

Native format: stratus-x1ac-{size}-{llm}
OpenRouter format: stratus-x1ac-{size}-{or-provider}/{or-model}
// Native models
model: 'stratus-x1ac-small-gpt-4o'
model: 'stratus-x1ac-base-claude-sonnet-4-20250514'

// OpenRouter models
model: 'stratus-x1ac-base-deepseek/deepseek-r1'
model: 'stratus-x1ac-base-meta-llama/llama-3.3-70b-instruct'
model: 'stratus-x1ac-base-google/gemini-2.5-pro'
See Models for the complete list of 2,050+ combinations.

messages (required)

Array of conversation messages.
messages: [
  { role: 'system', content: 'Current state description' },
  { role: 'user', content: 'Goal description' }
]
Best Practice:
  • Use system for state/context
  • Use user for goals/instructions
  • Include assistant for multi-turn conversations

temperature (optional)

Controls randomness in LLM execution (not planning).
  • Range: 0-2
  • Default: 1.0
  • Lower = more deterministic
  • Higher = more creative
temperature: 1.0
Note: Stratus planning is deterministic. Temperature only affects LLM execution.

max_tokens (optional)

Maximum tokens in completion.
max_tokens: 1000

top_p (optional)

Nucleus sampling parameter.
  • Range: 0-1
  • Default: 1.0
top_p: 0.9

stream (optional)

Stream response chunks (SSE format).
stream: true
Example:
const stream = await client.chat.completions.create({
  model: 'stratus-x1ac-small-gpt-4o',
  messages: [...],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

stop (optional)

Stop sequences to end generation.
stop: ['\n\n', 'END']

n (optional)

Number of completions to generate for each prompt.
  • Range: 1 or higher
  • Default: 1
n: 1
Generating multiple completions (n > 1) increases token usage and latency proportionally. Use sparingly.

presence_penalty (optional)

Penalizes tokens based on whether they appear in the text so far.
  • Range: -2.0 to 2.0
  • Default: 0.0
  • Positive values reduce repetition
  • Negative values encourage repetition
presence_penalty: 0.0

frequency_penalty (optional)

Penalizes tokens based on their frequency in the text so far.
  • Range: -2.0 to 2.0
  • Default: 0.0
  • Positive values reduce repetition of common phrases
  • Negative values encourage reuse of phrases
frequency_penalty: 0.0

user (optional)

Unique identifier for the end-user, used for tracking and monitoring.
user: 'user-12345'
The user parameter helps with abuse detection and per-user analytics. Recommended for production applications.

tools (optional)

Array of tool (function) definitions for function calling. Maximum 100 tools per request.
tools: [
  {
    type: 'function',
    function: {
      name: 'get_current_weather',
      description: 'Get the current weather in a given location',
      parameters: {
        type: 'object',
        properties: {
          location: {
            type: 'string',
            description: 'The city and state, e.g. San Francisco, CA'
          },
          unit: {
            type: 'string',
            enum: ['celsius', 'fahrenheit']
          }
        },
        required: ['location']
      }
    }
  }
]
See Tools & Function Calling for complete documentation.

tool_choice (optional)

Controls which tool (if any) is called by the model.
  • 'none' - Don’t call any tools
  • 'auto' - Model decides (default)
  • 'required' - Model must call a tool
  • {type: 'function', function: {name: 'tool_name'}} - Force specific tool
tool_choice: 'auto'
See Tools & Function Calling for examples.

Tools & Function Calling

Stratus supports OpenAI-compatible function calling, allowing the model to request tool execution during planning and execution.

Overview

Function calling enables:
  • Structured outputs - Get JSON responses in defined schemas
  • External data access - Fetch real-time information (weather, stock prices, etc.)
  • Action execution - Trigger operations (send email, create ticket, etc.)
  • Multi-step workflows - Chain tool calls to accomplish complex tasks

Defining Tools

Tools are defined using JSON Schema:
const tools = [
  {
    type: 'function',
    function: {
      name: 'search_web',
      description: 'Search the web for information about a query',
      parameters: {
        type: 'object',
        properties: {
          query: {
            type: 'string',
            description: 'The search query'
          },
          max_results: {
            type: 'number',
            description: 'Maximum number of results to return',
            default: 10
          }
        },
        required: ['query']
      }
    }
  },
  {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get current weather for a location',
      parameters: {
        type: 'object',
        properties: {
          location: {
            type: 'string',
            description: 'City name or coordinates'
          },
          unit: {
            type: 'string',
            enum: ['celsius', 'fahrenheit'],
            default: 'celsius'
          }
        },
        required: ['location']
      }
    }
  }
];

Tool Choice Strategies

Control tool selection with the tool_choice parameter:

Auto (Default)

Model decides whether to call a tool:
const response = await client.chat.completions.create({
  model: 'stratus-x1ac-small-gpt-4o',
  messages: [{ role: 'user', content: 'What is the weather in Paris?' }],
  tools: tools,
  tool_choice: 'auto' // or omit - this is default
});

None

Prevent tool calls:
tool_choice: 'none'

Required

Force the model to call at least one tool:
tool_choice: 'required'
Use 'required' when you always want a structured tool call response, ensuring the model doesn’t just respond with text.
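
For example, 'required' can turn the model into a structured extractor. A sketch of such a request body (the extract_booking tool is hypothetical, not part of the API):

```typescript
// Sketch: force a structured JSON reply by requiring a tool call.
// 'extract_booking' is a hypothetical tool defined for this example.
const request = {
  model: 'stratus-x1ac-small-gpt-4o',
  messages: [{ role: 'user', content: 'Book the Hilton, Dec 15-18, 2 guests' }],
  tools: [{
    type: 'function',
    function: {
      name: 'extract_booking',
      description: 'Extract structured booking details from the request',
      parameters: {
        type: 'object',
        properties: {
          hotel: { type: 'string' },
          checkin: { type: 'string', description: 'YYYY-MM-DD format' },
          checkout: { type: 'string', description: 'YYYY-MM-DD format' },
          guests: { type: 'number' }
        },
        required: ['hotel']
      }
    }
  }],
  tool_choice: 'required' as const
};
```

With 'required' set, the response's finish_reason will be 'tool_calls' rather than plain text.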

Specific Tool

Force a specific tool to be called:
tool_choice: {
  type: 'function',
  function: { name: 'get_weather' }
}

Complete Example

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.stratus.run/v1',
  apiKey: process.env.STRATUS_API_KEY
});

// 1. Define tools
const tools = [
  {
    type: 'function',
    function: {
      name: 'get_current_weather',
      description: 'Get the current weather in a given location',
      parameters: {
        type: 'object',
        properties: {
          location: {
            type: 'string',
            description: 'The city and state, e.g. San Francisco, CA'
          },
          unit: {
            type: 'string',
            enum: ['celsius', 'fahrenheit']
          }
        },
        required: ['location']
      }
    }
  }
];

// 2. Make request with tools
const response = await client.chat.completions.create({
  model: 'stratus-x1ac-small-gpt-4o',
  messages: [
    { role: 'user', content: 'What is the weather like in Boston?' }
  ],
  tools: tools,
  tool_choice: 'auto'
});

const message = response.choices[0].message;

// 3. Check if tool was called
if (message.tool_calls && message.tool_calls.length > 0) {
  const toolCall = message.tool_calls[0];

  console.log('Tool called:', toolCall.function.name);
  console.log('Arguments:', toolCall.function.arguments);

  // 4. Execute the tool
  const functionArgs = JSON.parse(toolCall.function.arguments);
  const weatherData = await getCurrentWeather(
    functionArgs.location,
    functionArgs.unit
  );

  // 5. Send tool result back to model
  const followUp = await client.chat.completions.create({
    model: 'stratus-x1ac-small-gpt-4o',
    messages: [
      { role: 'user', content: 'What is the weather like in Boston?' },
      message, // Assistant's response with tool call
      {
        role: 'tool',
        tool_call_id: toolCall.id,
        content: JSON.stringify(weatherData)
      }
    ],
    tools: tools
  });

  console.log('Final response:', followUp.choices[0].message.content);
} else {
  console.log('Direct response:', message.content);
}

Tool Call Response Format

When a tool is called, the response includes tool_calls:
{
  "id": "stratus-chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1702500000,
  "model": "stratus-x1ac-small-gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_current_weather",
              "arguments": "{\"location\": \"Boston, MA\", \"unit\": \"celsius\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 82,
    "completion_tokens": 18,
    "total_tokens": 100
  }
}

Handling Tool Results

After executing the tool, send the result back:
// Tool was called
const toolCall = response.choices[0].message.tool_calls[0];

// Execute your function
const result = await myFunction(JSON.parse(toolCall.function.arguments));

// Send result back
const messages = [
  ...previousMessages,
  response.choices[0].message, // Assistant's message with tool_calls
  {
    role: 'tool',
    tool_call_id: toolCall.id,
    content: JSON.stringify(result)
  }
];

const finalResponse = await client.chat.completions.create({
  model: 'stratus-x1ac-small-gpt-4o',
  messages: messages,
  tools: tools
});

Multiple Tools Example

const tools = [
  {
    type: 'function',
    function: {
      name: 'search_hotels',
      description: 'Search for hotels in a location',
      parameters: {
        type: 'object',
        properties: {
          location: { type: 'string' },
          checkin: { type: 'string', description: 'YYYY-MM-DD format' },
          checkout: { type: 'string', description: 'YYYY-MM-DD format' },
          guests: { type: 'number' }
        },
        required: ['location', 'checkin', 'checkout']
      }
    }
  },
  {
    type: 'function',
    function: {
      name: 'get_hotel_details',
      description: 'Get detailed information about a specific hotel',
      parameters: {
        type: 'object',
        properties: {
          hotel_id: { type: 'string' }
        },
        required: ['hotel_id']
      }
    }
  },
  {
    type: 'function',
    function: {
      name: 'book_hotel',
      description: 'Book a hotel room',
      parameters: {
        type: 'object',
        properties: {
          hotel_id: { type: 'string' },
          checkin: { type: 'string' },
          checkout: { type: 'string' },
          guests: { type: 'number' },
          room_type: { type: 'string' }
        },
        required: ['hotel_id', 'checkin', 'checkout', 'guests']
      }
    }
  }
];

const response = await client.chat.completions.create({
  model: 'stratus-x1ac-small-gpt-4o',
  messages: [
    {
      role: 'system',
      content: 'You are a hotel booking assistant. Help users find and book hotels.'
    },
    {
      role: 'user',
      content: 'Find me a hotel in San Francisco for next weekend, 2 guests'
    }
  ],
  tools: tools,
  tool_choice: 'auto'
});

// Model will likely call 'search_hotels' first

Parallel Tool Calls

The model can call multiple tools in parallel:
{
  "message": {
    "role": "assistant",
    "tool_calls": [
      {
        "id": "call_1",
        "function": {
          "name": "get_weather",
          "arguments": "{\"location\": \"New York\"}"
        }
      },
      {
        "id": "call_2",
        "function": {
          "name": "get_weather",
          "arguments": "{\"location\": \"Los Angeles\"}"
        }
      }
    ]
  }
}
Handle all tool calls and return results:
const toolCalls = message.tool_calls;
const toolMessages = [];

// Execute all tools
for (const toolCall of toolCalls) {
  const args = JSON.parse(toolCall.function.arguments);
  const result = await executeFunction(toolCall.function.name, args);

  toolMessages.push({
    role: 'tool',
    tool_call_id: toolCall.id,
    content: JSON.stringify(result)
  });
}

// Send all results back
const finalResponse = await client.chat.completions.create({
  model: 'stratus-x1ac-small-gpt-4o',
  messages: [
    ...previousMessages,
    message,
    ...toolMessages
  ],
  tools: tools
});

Best Practices

1. Clear Tool Descriptions

// ❌ Vague
description: 'Gets data'

// ✅ Specific
description: 'Get current weather conditions including temperature, humidity, and forecast for a given location'

2. Validate Tool Arguments

const args = JSON.parse(toolCall.function.arguments);

// Validate required fields
if (!args.location) {
  throw new Error('Missing required parameter: location');
}

// Validate formats
if (args.date && !isValidDate(args.date)) {
  throw new Error('Invalid date format. Use YYYY-MM-DD');
}

3. Handle Tool Errors Gracefully

try {
  const result = await executeFunction(toolCall.function.name, args);
  content = JSON.stringify(result);
} catch (error) {
  content = JSON.stringify({
    error: true,
    message: error.message
  });
}

// Send error back to model
{
  role: 'tool',
  tool_call_id: toolCall.id,
  content: content
}

4. Limit Tool Count

Maximum 100 tools per request. For better performance, limit to 10-20 most relevant tools.
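
One way to stay under that budget is a simple keyword pre-filter over your tool catalog before each request. A sketch (this is our own heuristic, not a Stratus feature):

```typescript
// Hypothetical pre-filter: keep only tools whose name or description
// mentions a significant word from the user's goal, capped at `limit`.
interface Tool {
  type: 'function';
  function: { name: string; description: string };
}

function selectRelevantTools(tools: Tool[], goal: string, limit = 20): Tool[] {
  const words = goal.toLowerCase().split(/\W+/).filter((w) => w.length > 3);
  return tools
    .filter((t) => {
      const text = `${t.function.name} ${t.function.description}`.toLowerCase();
      return words.some((w) => text.includes(w));
    })
    .slice(0, limit);
}

const catalog: Tool[] = [
  { type: 'function', function: { name: 'search_hotels', description: 'Search for hotels in a location' } },
  { type: 'function', function: { name: 'get_weather', description: 'Get current weather for a location' } }
];

// Picks only the hotel tool for a hotel-related goal.
console.log(selectRelevantTools(catalog, 'Find a hotel in NYC').map((t) => t.function.name));
```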

5. Use Enums for Constrained Values

parameters: {
  type: 'object',
  properties: {
    priority: {
      type: 'string',
      enum: ['low', 'medium', 'high', 'urgent'],
      description: 'Task priority level'
    }
  }
}

Streaming with Tools

Tool calls work with streaming:
const stream = await client.chat.completions.create({
  model: 'stratus-x1ac-small-gpt-4o',
  messages: [...],
  tools: tools,
  stream: true
});

let toolCalls = [];
let currentToolCall = null;

for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta;

  if (delta?.tool_calls) {
    for (const toolCallDelta of delta.tool_calls) {
      if (toolCallDelta.index !== undefined) {
        if (!toolCalls[toolCallDelta.index]) {
          toolCalls[toolCallDelta.index] = {
            id: toolCallDelta.id,
            type: 'function',
            function: { name: '', arguments: '' }
          };
        }

        const toolCall = toolCalls[toolCallDelta.index];

        if (toolCallDelta.function?.name) {
          toolCall.function.name += toolCallDelta.function.name;
        }
        if (toolCallDelta.function?.arguments) {
          toolCall.function.arguments += toolCallDelta.function.arguments;
        }
      }
    }
  }
}

// Process accumulated tool calls
console.log('Tool calls:', toolCalls);
See also:
  • Messages API - Anthropic-format alternative with tools support
  • Models - Available models for function calling

Integration with Agent Frameworks

LangChain

import { ChatOpenAI } from '@langchain/openai';

const model = new ChatOpenAI({
  openAIApiKey: process.env.STRATUS_API_KEY,
  modelName: 'stratus-x1ac-small-gpt-4o',
  configuration: {
    baseURL: 'https://api.stratus.run/v1'
  }
});

const response = await model.invoke([
  { role: 'system', content: 'Current state...' },
  { role: 'user', content: 'Goal...' }
]);

AutoGPT / CrewAI

import os
from openai import OpenAI

# Initialize with Stratus endpoint
llm_provider = OpenAI(
    base_url="https://api.stratus.run/v1",
    api_key=os.environ["STRATUS_API_KEY"]
)

# Use in your agent
agent = Agent(
    llm=llm_provider,
    model="stratus-x1ac-small-gpt-4o",
    ...
)

Error Handling

try {
  const response = await client.chat.completions.create({...});
} catch (error) {
  if (error.status === 400) {
    console.error('Invalid request:', error.message);
  } else if (error.status === 401) {
    console.error('Invalid API key');
  } else if (error.status === 429) {
    console.error('Rate limit exceeded');
  } else if (error.status === 503) {
    console.error('Model not available');
  }
}
See Error Reference for details.
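
Rate-limit errors (429) are usually transient, so a small retry wrapper can smooth them over. A minimal sketch (ours, not an SDK feature):

```typescript
// Sketch: retry on HTTP 429 with exponential backoff;
// rethrow any other error immediately.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error: any) {
      if (error?.status !== 429 || attempt >= maxRetries) throw error;
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
}

// Usage (sketch):
// const response = await withRetry(() => client.chat.completions.create({ ... }));
```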

Best Practices

1. Structure State and Goal Clearly

Good:
messages: [
  {
    role: 'system',
    content: 'Current page: Amazon product page. Product: "Laptop Backpack". Price: $49.99. "Add to Cart" button visible.'
  },
  {
    role: 'user',
    content: 'Add this product to cart'
  }
]
Not Ideal:
messages: [
  {
    role: 'user',
    content: 'I want to buy this laptop backpack'
  }
]

2. Use Appropriate Model Size

  • small: Simple navigation, prototyping
  • base: Start here - good balance
  • large: Complex multi-step tasks
  • xl/huge: Research, specialized domains

3. Monitor Planning Confidence

if (response.stratus.overall_confidence < 0.7) {
  // Low confidence - consider fallback or user confirmation
  console.warn('Low planning confidence:', response.stratus.overall_confidence);
}

4. Leverage Action Sequence

console.log('Planned actions:', response.stratus.action_sequence);
// ["type", "click", "wait", "scroll", "click"]

// Use for debugging, logging, or verification

Next Steps