Overview
The /v1/messages endpoint provides an Anthropic-compatible Messages API that seamlessly integrates with the Anthropic SDK while delivering Stratus world model predictions.
For Anthropic SDK users: Drop-in replacement - just change the base URL and API key.
Why Use This Endpoint?
- **Anthropic SDK Compatible**: Use the official Anthropic SDK with Stratus predictions
- **Format Parity**: Full compatibility with Anthropic's Messages API format
- **Streaming Support**: Server-sent events (SSE) streaming fully supported
- **Tools & Function Calling**: Complete tool use and function calling support
How It Works
Stratus internally converts between formats to deliver predictions:
Anthropic Request → OpenAI Format → Stratus World Model → OpenAI Format → Anthropic Response
This conversion is transparent - you don’t need to handle it. Just use the Anthropic SDK normally.
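For intuition, the mapping looks roughly like the sketch below. This is illustrative only — the real conversion happens server-side inside Stratus and is not exposed; the sketch just shows how Anthropic's top-level `system` parameter corresponds to an OpenAI-style system message.

```typescript
// Illustrative sketch of the kind of request mapping Stratus performs
// internally (not the actual implementation).

type AnthropicRequest = {
  model: string;
  max_tokens: number;
  system?: string;
  messages: { role: 'user' | 'assistant'; content: string }[];
};

type OpenAIChatRequest = {
  model: string;
  max_tokens: number;
  messages: { role: 'system' | 'user' | 'assistant'; content: string }[];
};

function toOpenAIFormat(req: AnthropicRequest): OpenAIChatRequest {
  const messages: OpenAIChatRequest['messages'] = [];
  // Anthropic's top-level `system` parameter becomes an OpenAI system message.
  if (req.system) {
    messages.push({ role: 'system', content: req.system });
  }
  messages.push(...req.messages);
  return { model: req.model, max_tokens: req.max_tokens, messages };
}
```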
Authentication
Use your Stratus API key in the x-api-key header (Anthropic SDK convention):
```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  baseURL: 'https://api.stratus.run',
  apiKey: process.env.STRATUS_API_KEY, // stratus_sk_live_*
});
```
Basic Request
```json
{
  "model": "stratus-x1ac-small-gpt-4o",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": "What will happen if I click the login button?"
    }
  ]
}
```
With System Prompt
```json
{
  "model": "stratus-x1ac-small-gpt-4o",
  "max_tokens": 1024,
  "system": "You are analyzing a web application interface.",
  "messages": [
    {
      "role": "user",
      "content": "Describe the current state"
    }
  ]
}
```
Parameters
- `model` (string, required): Stratus model to use. See Models for the full list of 2,050+ combinations. Native examples: `stratus-x1ac-small-gpt-4o`, `stratus-x1ac-base-claude-sonnet-4-20250514`. OpenRouter examples: `stratus-x1ac-base-deepseek/deepseek-r1`, `stratus-x1ac-base-meta-llama/llama-3.3-70b-instruct`.
- `messages` (array, required): Array of message objects with `role` and `content`. Roles: `user` or `assistant`.
- `max_tokens` (integer, required): Maximum tokens to generate (1-4096). Note: required in Anthropic format (unlike OpenAI).
- `system` (string, optional): System prompt (state description for Stratus world model predictions).
- `temperature` (number, optional): Sampling temperature (0.0-2.0). Higher values are more random.
- `top_p` (number, optional): Nucleus sampling (0.0-1.0). Alternative to temperature.
- `top_k` (integer, optional): Top-k sampling. Limits token selection to the top k options.
- `stream` (boolean, optional): Enable streaming responses via server-sent events (SSE).
- `stop_sequences` (array, optional): Stop generation when any listed sequence is encountered.
- `metadata` (object, optional): Optional metadata for request tracking.
- `tool_choice` (object, optional): Tool selection strategy: `auto`, `any`, or a specific tool.
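For instance, a request combining several of the optional parameters might look like this (values are illustrative):

```json
{
  "model": "stratus-x1ac-small-gpt-4o",
  "max_tokens": 512,
  "system": "You are analyzing a web application interface.",
  "messages": [
    { "role": "user", "content": "Predict the next user action" }
  ],
  "temperature": 0.7,
  "top_k": 40,
  "stop_sequences": ["END"],
  "metadata": { "user_id": "example-user-123" }
}
```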
Non-Streaming Response
```json
{
  "id": "msg_01XYZ123",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Clicking the login button will likely navigate you to the authentication page..."
    }
  ],
  "model": "stratus-x1ac-small-gpt-4o",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 24,
    "output_tokens": 156
  }
}
```
Response Fields
- `id`: Unique message identifier (format: `msg_*`)
- `type`: Always `"message"` for complete responses
- `role`: Always `"assistant"` for responses
- `content`: Array of content blocks. Each block has a `type` and its content (e.g., `text`, `tool_use`)
- `model`: The Stratus model that generated the response
- `stop_reason`: Why generation stopped: `end_turn`, `max_tokens`, `stop_sequence`, or `tool_use`
- `usage`: Token usage: `input_tokens` and `output_tokens`
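Because `content` is an array of blocks rather than a single string, pulling out the response text usually means filtering for text blocks. A small illustrative helper (not part of the SDK):

```typescript
// Illustrative helper (not part of the Anthropic SDK): concatenate the
// text from all text-type content blocks, skipping tool_use blocks.

type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; id: string; name: string; input: unknown };

function extractText(content: ContentBlock[]): string {
  return content
    .filter((block): block is Extract<ContentBlock, { type: 'text' }> => block.type === 'text')
    .map((block) => block.text)
    .join('');
}
```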
Examples
Basic Prediction
```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  baseURL: 'https://api.stratus.run',
  apiKey: process.env.STRATUS_API_KEY,
});

const message = await client.messages.create({
  model: 'stratus-x1ac-small-gpt-4o',
  max_tokens: 1024,
  system: 'Current state: User is on the homepage with login button visible',
  messages: [
    { role: 'user', content: 'What happens if I click login?' }
  ],
});

console.log(message.content[0].text);
```
Streaming Response
```typescript
const stream = await client.messages.create({
  model: 'stratus-x1ac-small-gpt-4o',
  max_tokens: 1024,
  stream: true,
  messages: [
    { role: 'user', content: 'Predict the next 3 actions' }
  ],
});

for await (const event of stream) {
  if (event.type === 'content_block_delta') {
    process.stdout.write(event.delta.text);
  }
}
```
Conversation with History
```typescript
const conversation = [
  { role: 'user', content: 'What is the current page?' },
  { role: 'assistant', content: 'You are on the product listing page.' },
  { role: 'user', content: 'If I search for "laptop", what will I see?' }
];

const message = await client.messages.create({
  model: 'stratus-x1ac-small-gpt-4o',
  max_tokens: 1024,
  messages: conversation,
});

console.log(message.content[0].text);
```
Tool Use
The Messages endpoint supports full tool use and function calling.
```typescript
const message = await client.messages.create({
  model: 'stratus-x1ac-small-gpt-4o',
  max_tokens: 1024,
  tools: [
    {
      name: 'get_weather',
      description: 'Get weather for a location',
      input_schema: {
        type: 'object',
        properties: {
          location: {
            type: 'string',
            description: 'City name'
          },
          unit: {
            type: 'string',
            enum: ['celsius', 'fahrenheit'],
            description: 'Temperature unit'
          }
        },
        required: ['location']
      }
    }
  ],
  messages: [
    { role: 'user', content: 'What is the weather in San Francisco?' }
  ],
});

// Check if a tool was used
if (message.stop_reason === 'tool_use') {
  const toolUse = message.content.find(block => block.type === 'tool_use');
  console.log('Tool:', toolUse.name);
  console.log('Input:', toolUse.input);
}
```
```typescript
// Auto - model decides
tool_choice: { type: 'auto' }

// Any - model must use a tool
tool_choice: { type: 'any' }

// Specific tool
tool_choice: { type: 'tool', name: 'get_weather' }
```
```json
{
  "id": "msg_01ABC",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "I'll check the weather for you."
    },
    {
      "type": "tool_use",
      "id": "toolu_01XYZ",
      "name": "get_weather",
      "input": {
        "location": "San Francisco",
        "unit": "celsius"
      }
    }
  ],
  "stop_reason": "tool_use"
}
```
```typescript
// 1. Get tool use from the response
const toolUse = message.content.find(block => block.type === 'tool_use');

// 2. Execute the tool
const weatherData = await getWeather(toolUse.input.location);

// 3. Send the result back
const followUp = await client.messages.create({
  model: 'stratus-x1ac-small-gpt-4o',
  max_tokens: 1024,
  tools: [ ... ], // same tools
  messages: [
    { role: 'user', content: 'What is the weather in San Francisco?' },
    { role: 'assistant', content: message.content },
    {
      role: 'user',
      content: [
        {
          type: 'tool_result',
          tool_use_id: toolUse.id,
          content: JSON.stringify(weatherData)
        }
      ]
    }
  ],
});
```
Streaming Events
When `stream: true` is set, you receive a sequence of events:

| Event Type | Description |
|---|---|
| `message_start` | Stream begins, includes initial message metadata |
| `content_block_start` | New content block starts |
| `content_block_delta` | Incremental content (text or tool input) |
| `content_block_stop` | Content block complete |
| `message_delta` | Message-level updates (e.g., usage) |
| `message_stop` | Stream complete |
Example stream:

```jsonc
// message_start
{ "type": "message_start", "message": { "id": "msg_01", "role": "assistant", ... }}

// content_block_start
{ "type": "content_block_start", "index": 0, "content_block": { "type": "text", "text": "" }}

// content_block_delta (multiple)
{ "type": "content_block_delta", "index": 0, "delta": { "type": "text_delta", "text": "The" }}
{ "type": "content_block_delta", "index": 0, "delta": { "type": "text_delta", "text": " login" }}
{ "type": "content_block_delta", "index": 0, "delta": { "type": "text_delta", "text": " button" }}

// content_block_stop
{ "type": "content_block_stop", "index": 0 }

// message_stop
{ "type": "message_stop" }
```
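To rebuild the full response text client-side, accumulate the `text_delta` payloads as they arrive. A minimal sketch (text deltas only; tool-input deltas and other event types are ignored):

```typescript
// Minimal sketch: reconstruct the response text from a sequence of
// streaming events by appending each text_delta in order.

type StreamEvent =
  | { type: 'message_start' }
  | { type: 'content_block_start'; index: number }
  | { type: 'content_block_delta'; index: number; delta: { type: 'text_delta'; text: string } }
  | { type: 'content_block_stop'; index: number }
  | { type: 'message_stop' };

function accumulateText(events: StreamEvent[]): string {
  let text = '';
  for (const event of events) {
    if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
      text += event.delta.text;
    }
  }
  return text;
}
```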
Comparison: Messages vs Chat Completions
| Feature | /v1/messages (Anthropic) | /v1/chat/completions (OpenAI) |
|---|---|---|
| SDK | Anthropic SDK | OpenAI SDK |
| Format | Anthropic Messages API | OpenAI Chat Completions |
| max_tokens | Required | Optional (defaults to model max) |
| Streaming | SSE events | SSE events |
| Tools | `tools` with `input_schema` | `tools` with function schema |
| Auth Header | `x-api-key` | `Authorization: Bearer` |
When to use Messages:
You’re already using Anthropic SDK in your codebase
You prefer Anthropic’s API conventions
You need Anthropic-specific features
When to use Chat Completions:
You’re using OpenAI SDK
You want the simpler, more common format
You’re familiar with OpenAI’s API
Error Handling
Errors follow Anthropic’s error format:
```json
{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "max_tokens is required"
  }
}
```
Common error types:
- `invalid_request_error`: Malformed request
- `authentication_error`: Invalid API key
- `permission_error`: Insufficient permissions
- `not_found_error`: Model not found
- `rate_limit_error`: Rate limit exceeded
- `api_error`: Server error
See Errors for full error documentation.
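When handling errors programmatically, a common pattern is deciding whether a failure is worth retrying. An illustrative classification based on the error types above (this policy is a suggestion, not part of the API):

```typescript
// Illustrative retry policy based on the Anthropic-format error body.
// Rate limits and server errors are treated as transient here; the other
// error types indicate a problem in the request itself.

type ApiErrorBody = {
  type: 'error';
  error: { type: string; message: string };
};

function isRetryable(body: ApiErrorBody): boolean {
  return body.error.type === 'rate_limit_error' || body.error.type === 'api_error';
}
```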
Best Practices
1. Always Set max_tokens
Unlike OpenAI’s API, Anthropic format requires max_tokens:
```typescript
// ❌ Will fail
await client.messages.create({
  model: 'stratus-x1ac-small-gpt-4o',
  messages: [ ... ]
});

// ✅ Correct
await client.messages.create({
  model: 'stratus-x1ac-small-gpt-4o',
  max_tokens: 1024,
  messages: [ ... ]
});
```
2. Use System Prompt for State
Provide current state in the system parameter:
```typescript
const message = await client.messages.create({
  model: 'stratus-x1ac-small-gpt-4o',
  max_tokens: 1024,
  system: `Current state:
- Page: Checkout flow, step 2 of 3
- Cart: 2 items, $127.50 total
- User: Logged in, shipping address entered
- Next action options: proceed to payment, edit cart, apply coupon`,
  messages: [
    { role: 'user', content: 'Should I proceed to payment or review my cart?' }
  ],
});
```
3. Handle Streaming Properly
```typescript
try {
  const stream = await client.messages.create({
    model: 'stratus-x1ac-small-gpt-4o',
    max_tokens: 1024,
    stream: true,
    messages: [ ... ]
  });

  let fullText = '';
  for await (const event of stream) {
    if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
      fullText += event.delta.text;
      process.stdout.write(event.delta.text);
    }
  }

  console.log('\n\nFull response:', fullText);
} catch (error) {
  console.error('Stream error:', error);
}
```
4. Validate Tool Inputs
When using tools, validate inputs before execution:
```typescript
const toolUse = message.content.find(block => block.type === 'tool_use');

if (toolUse) {
  // Validate against the schema
  if (!toolUse.input.location) {
    throw new Error('Missing required parameter: location');
  }

  // Execute safely
  const result = await executeToolSafely(toolUse.name, toolUse.input);
}
```
Migration from Anthropic API
Switching from Anthropic’s API to Stratus is simple:
```diff
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
-  apiKey: process.env.ANTHROPIC_API_KEY,
+  baseURL: 'https://api.stratus.run',
+  apiKey: process.env.STRATUS_API_KEY,
});

const message = await client.messages.create({
-  model: 'claude-3-5-sonnet-20241022',
+  model: 'stratus-x1ac-small-gpt-4o',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
});
```
That’s it! All your existing code continues to work.
Performance

| Metric | Value |
|---|---|
| Latency | Similar to /v1/chat/completions |
| Streaming | <100ms time to first token |
| Throughput | Same limits as other endpoints |
| Format Conversion | <1ms overhead (negligible) |
Format conversion between Anthropic and OpenAI formats adds negligible overhead (<1ms).
**SDK Support**: Works seamlessly with the official Anthropic SDK for TypeScript and Python. No custom SDK needed.