Chat Completions
Generate agent responses with Stratus planning and LLM execution.

Endpoint

POST https://api.stratus.run/v1/chat/completions

Request Format

Stratus follows the OpenAI Chat Completions API format:
interface ChatCompletionRequest {
  model: string;              // Required: Stratus model name
  messages: Message[];        // Required: Conversation history
  temperature?: number;       // Optional: 0-2, default 1.0
  max_tokens?: number;        // Optional: Max completion tokens
  top_p?: number;            // Optional: Nucleus sampling
  stream?: boolean;          // Optional: Stream response
  stop?: string[];           // Optional: Stop sequences
}

// Optional inline LLM provider key headers (take priority over vault keys and Formation pool)
// X-OpenAI-Key: sk-proj-...
// X-Anthropic-Key: sk-ant-...
// X-Google-Key: AIza...
// X-OpenRouter-Key: sk-or-...

interface Message {
  role: 'system' | 'user' | 'assistant' | 'tool' | 'developer';
  content: string | Array<object> | null;
  tool_calls?: Array<object>;
  tool_call_id?: string;
}
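
The inline provider-key headers above can be attached with the OpenAI SDK's `defaultHeaders` client option. A minimal sketch (the environment variable names are placeholders):

```typescript
import OpenAI from 'openai';

// Sketch: send your own LLM provider key inline on every request.
// Inline keys take priority over vault keys and the Formation pool.
const client = new OpenAI({
  baseURL: 'https://api.stratus.run/v1',
  apiKey: process.env.STRATUS_API_KEY,
  defaultHeaders: {
    'X-OpenAI-Key': process.env.OPENAI_API_KEY ?? ''
  }
});
```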

Example Request

TypeScript

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.stratus.run/v1',
  apiKey: process.env.STRATUS_API_KEY
});

const response = await client.chat.completions.create({
  model: 'stratus-x1ac-small-gpt-4o',
  messages: [
    {
      role: 'system',
      content: 'Current page: Google homepage with search box visible.'
    },
    {
      role: 'user',
      content: 'Find hotels in NYC for December 15-18, 2024'
    }
  ],
  temperature: 0.7,
  max_tokens: 1000
});

console.log(response.choices[0].message.content);

Python

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.stratus.run/v1",
    api_key=os.environ["STRATUS_API_KEY"]
)

response = client.chat.completions.create(
    model="stratus-x1ac-small-gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "Current page: Google homepage with search box visible."
        },
        {
            "role": "user",
            "content": "Find hotels in NYC for December 15-18, 2024"
        }
    ],
    temperature=0.7,
    max_tokens=1000
)

print(response.choices[0].message.content)

cURL

curl https://api.stratus.run/v1/chat/completions \
  -H "Authorization: Bearer stratus_sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "stratus-x1ac-small-gpt-4o",
    "messages": [
      {
        "role": "system",
        "content": "Current page: Google homepage with search box visible."
      },
      {
        "role": "user",
        "content": "Find hotels in NYC for December 15-18, 2024"
      }
    ],
    "temperature": 0.7,
    "max_tokens": 1000
  }'

Response Format

interface ChatCompletionResponse {
  id: string;                    // Completion ID
  object: 'chat.completion';     // Object type
  created: number;               // Unix timestamp
  model: string;                 // Model used
  choices: Choice[];             // Generated completions
  usage: Usage;                  // Token usage
  stratus?: StratusMetadata;     // Stratus-specific metadata
}

interface Choice {
  index: number;
  message: Message;
  finish_reason: 'stop' | 'length' | 'tool_calls' | 'content_filter';
}

interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

interface StratusMetadata {
  stratus_model: string;                    // e.g., "x1ac-base"
  execution_llm: string;                    // e.g., "gpt-4o"
  action_sequence?: string[];               // Planned action names
  predicted_state_changes?: number[];       // Per-step embedding norm magnitude
  confidence_labels?: string[];             // "High" | "Medium" | "Low" per step
  overall_confidence?: number;              // Top-1 action softmax probability (0-1)
  steps_to_goal?: number;                   // Length of action_sequence
  planning_time_ms?: number;                // Time spent in world model
  execution_time_ms?: number;               // Time spent in LLM
  total_steps_executed?: number;            // Number of LLM calls in agentic loop
  execution_trace?: Array<{                 // Per-step trace
    step: number;
    action: string;
    response_summary: string;
  }>;
  brain_signal?: {                          // Policy head recommendation
    action_type: string;                    // Best action (e.g., "close_deal")
    confidence: number;                     // Softmax probability within available set
    plan_ahead: string[];                   // Predicted next 1-2 actions (lookahead)
    simulation_confirmed: boolean;          // World model validated this action
    goal_proximity?: number;                // Cosine sim of current state vs goal
  };
  confidence?: number;                      // Legacy — use overall_confidence
  key_source?: 'user' | 'formation';        // Where the LLM key came from
  formation_markup_applied?: number | null;  // 0.25 when Formation pool used; null for BYOK
}

Example Response

{
  "id": "stratus-chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1702500000,
  "model": "stratus-x1ac-small-gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I'll help you find hotels in NYC. Let me search for that.\n\n**Action:** Type 'NYC hotels December 15-18 2024' in the search box\n**Next:** Click the search button"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 85,
    "completion_tokens": 42,
    "total_tokens": 127
  },
  "stratus": {
    "stratus_model": "x1ac-base",
    "execution_llm": "gpt-4o",
    "action_sequence": ["type", "click", "scroll", "click"],
    "predicted_state_changes": [0.42, 0.38, 0.15, 0.51],
    "confidence_labels": ["High", "High", "Medium", "High"],
    "overall_confidence": 0.92,
    "steps_to_goal": 4,
    "planning_time_ms": 45,
    "execution_time_ms": 890,
    "brain_signal": {
      "action_type": "type",
      "confidence": 0.94,
      "plan_ahead": ["click", "scroll"],
      "simulation_confirmed": true,
      "goal_proximity": 0.73
    },
    "key_source": "formation",
    "formation_markup_applied": 0.25
  }
}
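
Because the `stratus` block and most of its fields are optional, read them defensively. A small sketch (the helper name is ours, not part of any SDK):

```typescript
// Sketch: safely summarize optional Stratus planning metadata.
interface StratusLite {
  action_sequence?: string[];
  overall_confidence?: number;
}

function summarizePlan(response: { stratus?: StratusLite }): string {
  const s = response.stratus;
  if (!s?.action_sequence) return 'no plan metadata';
  const conf = s.overall_confidence?.toFixed(2) ?? 'n/a';
  return `${s.action_sequence.length} steps (confidence ${conf})`;
}

// With the example response above this prints "4 steps (confidence 0.92)".
console.log(summarizePlan({
  stratus: {
    action_sequence: ['type', 'click', 'scroll', 'click'],
    overall_confidence: 0.92
  }
}));
```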

State and Goal Convention

Stratus works best when you structure messages as follows.

System message = Current State. Describe where the agent currently is:
{
  "role": "system",
  "content": "Current page: Google homepage. Search box visible and focused."
}

User message = Goal. Describe what the agent should achieve:
{
  "role": "user",
  "content": "Find hotels in NYC for December 15-18"
}
This convention helps Stratus:
  • Encode current state accurately
  • Plan toward the goal
  • Predict state transitions
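
The convention can be wrapped in a tiny helper so every request follows it. A sketch (the helper is ours, not part of any SDK):

```typescript
// Hypothetical helper that applies the state/goal convention
// when assembling the messages array for a request.
type ChatMessage = { role: 'system' | 'user'; content: string };

function buildStateGoalMessages(currentState: string, goal: string): ChatMessage[] {
  return [
    { role: 'system', content: currentState }, // current state description
    { role: 'user', content: goal }            // goal description
  ];
}

const messages = buildStateGoalMessages(
  'Current page: Google homepage. Search box visible and focused.',
  'Find hotels in NYC for December 15-18'
);
```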

Parameters

model (required)

Combined Stratus + LLM model name.

Native format: stratus-x1ac-{size}-{llm}
OpenRouter format: stratus-x1ac-{size}-{or-provider}/{or-model}
// Native models
model: 'stratus-x1ac-small-gpt-4o'
model: 'stratus-x1ac-base-claude-sonnet-4-20250514'

// OpenRouter models
model: 'stratus-x1ac-base-deepseek/deepseek-r1'
model: 'stratus-x1ac-base-meta-llama/llama-3.3-70b-instruct'
model: 'stratus-x1ac-base-google/gemini-2.5-pro'
See Models for the complete list of 2,050+ combinations.

messages (required)

Array of conversation messages.
messages: [
  { role: 'system', content: 'Current state description' },
  { role: 'user', content: 'Goal description' }
]
Best Practice:
  • Use system for state/context
  • Use user for goals/instructions
  • Include assistant for multi-turn conversations

temperature (optional)

Controls randomness in LLM execution (not planning).
  • Range: 0-2
  • Default: 1.0
  • Lower = more deterministic
  • Higher = more creative
temperature: 1.0
Note: Stratus planning is deterministic. Temperature only affects LLM execution.

max_tokens (optional)

Maximum tokens in completion.
max_tokens: 1000

top_p (optional)

Nucleus sampling parameter.
  • Range: 0-1
  • Default: 1.0
top_p: 0.9

stream (optional)

Stream response chunks (SSE format).
stream: true
Example:
const stream = await client.chat.completions.create({
  model: 'stratus-x1ac-small-gpt-4o',
  messages: [...],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

stop (optional)

Stop sequences to end generation.
stop: ['\n\n', 'END']

n (optional)

Number of completions to generate for each prompt.
  • Range: 1 or higher
  • Default: 1
n: 1
Generating multiple completions (n > 1) increases token usage and latency proportionally. Use sparingly.

presence_penalty (optional)

Penalizes tokens based on whether they appear in the text so far.
  • Range: -2.0 to 2.0
  • Default: 0.0
  • Positive values reduce repetition
  • Negative values encourage repetition
presence_penalty: 0.0

frequency_penalty (optional)

Penalizes tokens based on their frequency in the text so far.
  • Range: -2.0 to 2.0
  • Default: 0.0
  • Positive values reduce repetition of common phrases
  • Negative values encourage reuse of phrases
frequency_penalty: 0.0

user (optional)

Unique identifier for the end-user, used for tracking and monitoring.
user: 'user-12345'
The user parameter helps with abuse detection and per-user analytics. Recommended for production applications.

tools (optional)

Array of tool (function) definitions for function calling. Maximum 100 tools per request.
tools: [
  {
    type: 'function',
    function: {
      name: 'get_current_weather',
      description: 'Get the current weather in a given location',
      parameters: {
        type: 'object',
        properties: {
          location: {
            type: 'string',
            description: 'The city and state, e.g. San Francisco, CA'
          },
          unit: {
            type: 'string',
            enum: ['celsius', 'fahrenheit']
          }
        },
        required: ['location']
      }
    }
  }
]
See Tools & Function Calling for complete documentation.

tool_choice (optional)

Controls which tool (if any) is called by the model.
  • 'none' - Don’t call any tools
  • 'auto' - Model decides (default)
  • 'required' - Model must call a tool
  • {type: 'function', function: {name: 'tool_name'}} - Force specific tool
tool_choice: 'auto'
See Tools & Function Calling for examples.

Tools & Function Calling

Stratus supports OpenAI-compatible function calling, allowing the model to request tool execution during planning and execution.

Overview

Function calling enables:
  • Structured outputs - Get JSON responses in defined schemas
  • External data access - Fetch real-time information (weather, stock prices, etc.)
  • Action execution - Trigger operations (send email, create ticket, etc.)
  • Multi-step workflows - Chain tool calls to accomplish complex tasks

Defining Tools

Tools are defined using JSON Schema:
const tools = [
  {
    type: 'function',
    function: {
      name: 'search_web',
      description: 'Search the web for information about a query',
      parameters: {
        type: 'object',
        properties: {
          query: {
            type: 'string',
            description: 'The search query'
          },
          max_results: {
            type: 'number',
            description: 'Maximum number of results to return',
            default: 10
          }
        },
        required: ['query']
      }
    }
  },
  {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get current weather for a location',
      parameters: {
        type: 'object',
        properties: {
          location: {
            type: 'string',
            description: 'City name or coordinates'
          },
          unit: {
            type: 'string',
            enum: ['celsius', 'fahrenheit'],
            default: 'celsius'
          }
        },
        required: ['location']
      }
    }
  }
];

Tool Choice Strategies

Control tool selection with the tool_choice parameter:

Auto (Default)

Model decides whether to call a tool:
const response = await client.chat.completions.create({
  model: 'stratus-x1ac-small-gpt-4o',
  messages: [{ role: 'user', content: 'What is the weather in Paris?' }],
  tools: tools,
  tool_choice: 'auto' // or omit - this is default
});

None

Prevent tool calls:
tool_choice: 'none'

Required

Force the model to call at least one tool:
tool_choice: 'required'
Use 'required' when you always want a structured tool call response, ensuring the model doesn’t just respond with text.
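
For example, 'required' can turn the model into a structured extractor. A sketch of such a request body (the extract_booking tool is hypothetical, not part of the API):

```typescript
// Sketch: force a structured JSON reply by requiring a tool call.
// 'extract_booking' is a hypothetical tool defined for this example.
const request = {
  model: 'stratus-x1ac-small-gpt-4o',
  messages: [{ role: 'user', content: 'Book the Hilton, Dec 15-18, 2 guests' }],
  tools: [{
    type: 'function',
    function: {
      name: 'extract_booking',
      description: 'Extract structured booking details from the request',
      parameters: {
        type: 'object',
        properties: {
          hotel: { type: 'string' },
          checkin: { type: 'string', description: 'YYYY-MM-DD format' },
          checkout: { type: 'string', description: 'YYYY-MM-DD format' },
          guests: { type: 'number' }
        },
        required: ['hotel']
      }
    }
  }],
  tool_choice: 'required' as const
};
```

With 'required' set, the response's finish_reason will be 'tool_calls' rather than plain text.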

Specific Tool

Force a specific tool to be called:
tool_choice: {
  type: 'function',
  function: { name: 'get_weather' }
}

Complete Example

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.stratus.run/v1',
  apiKey: process.env.STRATUS_API_KEY
});

// 1. Define tools
const tools = [
  {
    type: 'function',
    function: {
      name: 'get_current_weather',
      description: 'Get the current weather in a given location',
      parameters: {
        type: 'object',
        properties: {
          location: {
            type: 'string',
            description: 'The city and state, e.g. San Francisco, CA'
          },
          unit: {
            type: 'string',
            enum: ['celsius', 'fahrenheit']
          }
        },
        required: ['location']
      }
    }
  }
];

// 2. Make request with tools
const response = await client.chat.completions.create({
  model: 'stratus-x1ac-small-gpt-4o',
  messages: [
    { role: 'user', content: 'What is the weather like in Boston?' }
  ],
  tools: tools,
  tool_choice: 'auto'
});

const message = response.choices[0].message;

// 3. Check if tool was called
if (message.tool_calls && message.tool_calls.length > 0) {
  const toolCall = message.tool_calls[0];

  console.log('Tool called:', toolCall.function.name);
  console.log('Arguments:', toolCall.function.arguments);

  // 4. Execute the tool
  const functionArgs = JSON.parse(toolCall.function.arguments);
  const weatherData = await getCurrentWeather(
    functionArgs.location,
    functionArgs.unit
  );

  // 5. Send tool result back to model
  const followUp = await client.chat.completions.create({
    model: 'stratus-x1ac-small-gpt-4o',
    messages: [
      { role: 'user', content: 'What is the weather like in Boston?' },
      message, // Assistant's response with tool call
      {
        role: 'tool',
        tool_call_id: toolCall.id,
        content: JSON.stringify(weatherData)
      }
    ],
    tools: tools
  });

  console.log('Final response:', followUp.choices[0].message.content);
} else {
  console.log('Direct response:', message.content);
}

Tool Call Response Format

When a tool is called, the response includes tool_calls:
{
  "id": "stratus-chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1702500000,
  "model": "stratus-x1ac-small-gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_current_weather",
              "arguments": "{\"location\": \"Boston, MA\", \"unit\": \"celsius\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 82,
    "completion_tokens": 18,
    "total_tokens": 100
  }
}

Handling Tool Results

After executing the tool, send the result back:
// Tool was called
const toolCall = response.choices[0].message.tool_calls[0];

// Execute your function
const result = await myFunction(JSON.parse(toolCall.function.arguments));

// Send result back
const messages = [
  ...previousMessages,
  response.choices[0].message, // Assistant's message with tool_calls
  {
    role: 'tool',
    tool_call_id: toolCall.id,
    content: JSON.stringify(result)
  }
];

const finalResponse = await client.chat.completions.create({
  model: 'stratus-x1ac-small-gpt-4o',
  messages: messages,
  tools: tools
});

Multiple Tools Example

const tools = [
  {
    type: 'function',
    function: {
      name: 'search_hotels',
      description: 'Search for hotels in a location',
      parameters: {
        type: 'object',
        properties: {
          location: { type: 'string' },
          checkin: { type: 'string', description: 'YYYY-MM-DD format' },
          checkout: { type: 'string', description: 'YYYY-MM-DD format' },
          guests: { type: 'number' }
        },
        required: ['location', 'checkin', 'checkout']
      }
    }
  },
  {
    type: 'function',
    function: {
      name: 'get_hotel_details',
      description: 'Get detailed information about a specific hotel',
      parameters: {
        type: 'object',
        properties: {
          hotel_id: { type: 'string' }
        },
        required: ['hotel_id']
      }
    }
  },
  {
    type: 'function',
    function: {
      name: 'book_hotel',
      description: 'Book a hotel room',
      parameters: {
        type: 'object',
        properties: {
          hotel_id: { type: 'string' },
          checkin: { type: 'string' },
          checkout: { type: 'string' },
          guests: { type: 'number' },
          room_type: { type: 'string' }
        },
        required: ['hotel_id', 'checkin', 'checkout', 'guests']
      }
    }
  }
];

const response = await client.chat.completions.create({
  model: 'stratus-x1ac-small-gpt-4o',
  messages: [
    {
      role: 'system',
      content: 'You are a hotel booking assistant. Help users find and book hotels.'
    },
    {
      role: 'user',
      content: 'Find me a hotel in San Francisco for next weekend, 2 guests'
    }
  ],
  tools: tools,
  tool_choice: 'auto'
});

// Model will likely call 'search_hotels' first

Parallel Tool Calls

The model can call multiple tools in parallel:
{
  "message": {
    "role": "assistant",
    "tool_calls": [
      {
        "id": "call_1",
        "function": {
          "name": "get_weather",
          "arguments": "{\"location\": \"New York\"}"
        }
      },
      {
        "id": "call_2",
        "function": {
          "name": "get_weather",
          "arguments": "{\"location\": \"Los Angeles\"}"
        }
      }
    ]
  }
}
Handle all tool calls and return results:
const toolCalls = message.tool_calls;
const toolMessages = [];

// Execute all tools
for (const toolCall of toolCalls) {
  const args = JSON.parse(toolCall.function.arguments);
  const result = await executeFunction(toolCall.function.name, args);

  toolMessages.push({
    role: 'tool',
    tool_call_id: toolCall.id,
    content: JSON.stringify(result)
  });
}

// Send all results back
const finalResponse = await client.chat.completions.create({
  model: 'stratus-x1ac-small-gpt-4o',
  messages: [
    ...previousMessages,
    message,
    ...toolMessages
  ],
  tools: tools
});

Best Practices

1. Clear Tool Descriptions

// ❌ Vague
description: 'Gets data'

// ✅ Specific
description: 'Get current weather conditions including temperature, humidity, and forecast for a given location'

2. Validate Tool Arguments

const args = JSON.parse(toolCall.function.arguments);

// Validate required fields
if (!args.location) {
  throw new Error('Missing required parameter: location');
}

// Validate formats
if (args.date && !isValidDate(args.date)) {
  throw new Error('Invalid date format. Use YYYY-MM-DD');
}

3. Handle Tool Errors Gracefully

try {
  const result = await executeFunction(toolCall.function.name, args);
  content = JSON.stringify(result);
} catch (error) {
  content = JSON.stringify({
    error: true,
    message: error.message
  });
}

// Send error back to model
{
  role: 'tool',
  tool_call_id: toolCall.id,
  content: content
}

4. Limit Tool Count

Maximum 100 tools per request. For better performance, limit to 10-20 most relevant tools.
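
One way to stay under that budget is a simple keyword pre-filter over your tool catalog before each request. A sketch (this is our own heuristic, not a Stratus feature):

```typescript
// Hypothetical pre-filter: keep only tools whose name or description
// mentions a significant word from the user's goal, capped at `limit`.
interface Tool {
  type: 'function';
  function: { name: string; description: string };
}

function selectRelevantTools(tools: Tool[], goal: string, limit = 20): Tool[] {
  const words = goal.toLowerCase().split(/\W+/).filter((w) => w.length > 3);
  return tools
    .filter((t) => {
      const text = `${t.function.name} ${t.function.description}`.toLowerCase();
      return words.some((w) => text.includes(w));
    })
    .slice(0, limit);
}

const catalog: Tool[] = [
  { type: 'function', function: { name: 'search_hotels', description: 'Search for hotels in a location' } },
  { type: 'function', function: { name: 'get_weather', description: 'Get current weather for a location' } }
];

// Picks only the hotel tool for a hotel-related goal.
console.log(selectRelevantTools(catalog, 'Find a hotel in NYC').map((t) => t.function.name));
```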

5. Use Enums for Constrained Values

parameters: {
  type: 'object',
  properties: {
    priority: {
      type: 'string',
      enum: ['low', 'medium', 'high', 'urgent'],
      description: 'Task priority level'
    }
  }
}

Streaming with Tools

Tool calls work with streaming:
const stream = await client.chat.completions.create({
  model: 'stratus-x1ac-small-gpt-4o',
  messages: [...],
  tools: tools,
  stream: true
});

let toolCalls = [];
let currentToolCall = null;

for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta;

  if (delta?.tool_calls) {
    for (const toolCallDelta of delta.tool_calls) {
      if (toolCallDelta.index !== undefined) {
        if (!toolCalls[toolCallDelta.index]) {
          toolCalls[toolCallDelta.index] = {
            id: toolCallDelta.id,
            type: 'function',
            function: { name: '', arguments: '' }
          };
        }

        const toolCall = toolCalls[toolCallDelta.index];

        if (toolCallDelta.function?.name) {
          toolCall.function.name += toolCallDelta.function.name;
        }
        if (toolCallDelta.function?.arguments) {
          toolCall.function.arguments += toolCallDelta.function.arguments;
        }
      }
    }
  }
}

// Process accumulated tool calls
console.log('Tool calls:', toolCalls);
See also:
  • Messages API - Anthropic-format alternative with tools support
  • Models - Available models for function calling

Integration with Agent Frameworks

LangChain

import { ChatOpenAI } from '@langchain/openai';

const model = new ChatOpenAI({
  openAIApiKey: process.env.STRATUS_API_KEY,
  modelName: 'stratus-x1ac-small-gpt-4o',
  configuration: {
    baseURL: 'https://api.stratus.run/v1'
  }
});

const response = await model.invoke([
  { role: 'system', content: 'Current state...' },
  { role: 'user', content: 'Goal...' }
]);

AutoGPT / CrewAI

import os
from openai import OpenAI

# Initialize with Stratus endpoint
llm_provider = OpenAI(
    base_url="https://api.stratus.run/v1",
    api_key=os.environ["STRATUS_API_KEY"]
)

# Use in your agent
agent = Agent(
    llm=llm_provider,
    model="stratus-x1ac-small-gpt-4o",
    ...
)

Error Handling

try {
  const response = await client.chat.completions.create({...});
} catch (error) {
  if (error.status === 400) {
    console.error('Invalid request:', error.message);
  } else if (error.status === 401) {
    console.error('Invalid API key');
  } else if (error.status === 429) {
    console.error('Rate limit exceeded');
  } else if (error.status === 503) {
    console.error('Model not available');
  }
}
See Error Reference for details.
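
Rate-limit errors (429) are usually transient, so a small retry wrapper can smooth them over. A minimal sketch (ours, not an SDK feature):

```typescript
// Sketch: retry on HTTP 429 with exponential backoff;
// rethrow any other error immediately.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error: any) {
      if (error?.status !== 429 || attempt >= maxRetries) throw error;
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
}

// Usage (sketch):
// const response = await withRetry(() => client.chat.completions.create({ ... }));
```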

Best Practices

1. Structure State and Goal Clearly

Good:
messages: [
  {
    role: 'system',
    content: 'Current page: Amazon product page. Product: "Laptop Backpack". Price: $49.99. "Add to Cart" button visible.'
  },
  {
    role: 'user',
    content: 'Add this product to cart'
  }
]
Not Ideal:
messages: [
  {
    role: 'user',
    content: 'I want to buy this laptop backpack'
  }
]

2. Use Appropriate Model Size

  • small: Simple navigation, prototyping
  • base: Start here - good balance
  • large: Complex multi-step tasks
  • xl/huge: Research, specialized domains

3. Monitor Planning Confidence

if (response.stratus.overall_confidence < 0.7) {
  // Low confidence - consider fallback or user confirmation
  console.warn('Low planning confidence:', response.stratus.overall_confidence);
}

4. Leverage Action Sequence

console.log('Planned actions:', response.stratus.action_sequence);
// ["type", "click", "wait", "scroll", "click"]

// Use for debugging, logging, or verification

Next Steps