Overview
The /v1/messages endpoint provides an Anthropic-compatible Messages API that seamlessly integrates with the Anthropic SDK while delivering Stratus world model predictions.
For Anthropic SDK users: Drop-in replacement - just change the base URL and API key.
Why Use This Endpoint?
- **Anthropic SDK Compatible**: Use the official Anthropic SDK with Stratus predictions
- **Format Parity**: Full compatibility with Anthropic's Messages API format
- **Streaming Support**: Server-sent events (SSE) streaming fully supported
- **Tools & Function Calling**: Complete tool use and function calling support
How It Works
Stratus internally converts between formats to deliver predictions:
Anthropic Request → OpenAI Format → Stratus World Model → OpenAI Format → Anthropic Response
This conversion is transparent - you don’t need to handle it. Just use the Anthropic SDK normally.
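For intuition, the mapping looks roughly like the sketch below. This is illustrative only — the real conversion happens server-side inside Stratus and is not exposed; the sketch just shows how Anthropic's top-level `system` parameter corresponds to an OpenAI-style system message.

```typescript
// Illustrative sketch of the kind of request mapping Stratus performs
// internally (not the actual implementation).

type AnthropicRequest = {
  model: string;
  max_tokens: number;
  system?: string;
  messages: { role: 'user' | 'assistant'; content: string }[];
};

type OpenAIChatRequest = {
  model: string;
  max_tokens: number;
  messages: { role: 'system' | 'user' | 'assistant'; content: string }[];
};

function toOpenAIFormat(req: AnthropicRequest): OpenAIChatRequest {
  const messages: OpenAIChatRequest['messages'] = [];
  // Anthropic's top-level `system` parameter becomes an OpenAI system message.
  if (req.system) {
    messages.push({ role: 'system', content: req.system });
  }
  messages.push(...req.messages);
  return { model: req.model, max_tokens: req.max_tokens, messages };
}
```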
Authentication
Use your Stratus API key in the x-api-key header (Anthropic SDK convention):
```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  baseURL: 'https://api.stratus.run',
  apiKey: process.env.STRATUS_API_KEY, // stratus_sk_live_*
});
```
Basic Request
```json
{
  "model": "stratus-x1ac-small-gpt-4o",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": "What will happen if I click the login button?"
    }
  ]
}
```
With System Prompt
```json
{
  "model": "stratus-x1ac-small-gpt-4o",
  "max_tokens": 1024,
  "system": "You are analyzing a web application interface.",
  "messages": [
    {
      "role": "user",
      "content": "Describe the current state"
    }
  ]
}
```
Parameters
- `model` (string, required): Stratus model to use. See Models for the full list of 2,050+ combinations. Native examples: `stratus-x1ac-small-gpt-4o`, `stratus-x1ac-base-claude-sonnet-4-20250514`. OpenRouter examples: `stratus-x1ac-base-deepseek/deepseek-r1`, `stratus-x1ac-base-meta-llama/llama-3.3-70b-instruct`.
- `messages` (array, required): Array of message objects with `role` and `content`. Roles: `user` or `assistant`.
- `max_tokens` (integer, required): Maximum tokens to generate (1-4096). Note: required in Anthropic format (unlike OpenAI).
- `system` (string, optional): System prompt (state description for Stratus world model predictions).
- `temperature` (number, optional): Sampling temperature (0.0-2.0). Higher values are more random.
- `top_p` (number, optional): Nucleus sampling (0.0-1.0). Alternative to temperature.
- `top_k` (integer, optional): Top-k sampling. Limits token selection to the top k options.
- `stream` (boolean, optional): Enable streaming responses via server-sent events (SSE).
- `stop_sequences` (array, optional): Stop generation when any listed sequence is encountered.
- `metadata` (object, optional): Optional metadata for request tracking.
- `tool_choice` (object, optional): Tool selection strategy: `auto`, `any`, or a specific tool.
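For instance, a request combining several of the optional parameters might look like this (values are illustrative):

```json
{
  "model": "stratus-x1ac-small-gpt-4o",
  "max_tokens": 512,
  "system": "You are analyzing a web application interface.",
  "messages": [
    { "role": "user", "content": "Predict the next user action" }
  ],
  "temperature": 0.7,
  "top_k": 40,
  "stop_sequences": ["END"],
  "metadata": { "user_id": "example-user-123" }
}
```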
Non-Streaming Response
```json
{
  "id": "msg_01XYZ123",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Clicking the login button will likely navigate you to the authentication page..."
    }
  ],
  "model": "stratus-x1ac-small-gpt-4o",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 24,
    "output_tokens": 156
  }
}
```
Response Fields
- `id`: Unique message identifier (format: `msg_*`)
- `type`: Always `"message"` for complete responses
- `role`: Always `"assistant"` for responses
- `content`: Array of content blocks. Each block has a `type` and its content (e.g., `text`, `tool_use`)
- `model`: The Stratus model that generated the response
- `stop_reason`: Why generation stopped: `end_turn`, `max_tokens`, `stop_sequence`, or `tool_use`
- `usage`: Token usage: `input_tokens` and `output_tokens`
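Because `content` is an array of blocks rather than a single string, pulling out the response text usually means filtering for text blocks. A small illustrative helper (not part of the SDK):

```typescript
// Illustrative helper (not part of the Anthropic SDK): concatenate the
// text from all text-type content blocks, skipping tool_use blocks.

type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; id: string; name: string; input: unknown };

function extractText(content: ContentBlock[]): string {
  return content
    .filter((block): block is Extract<ContentBlock, { type: 'text' }> => block.type === 'text')
    .map((block) => block.text)
    .join('');
}
```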
Examples
Basic Prediction
```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  baseURL: 'https://api.stratus.run',
  apiKey: process.env.STRATUS_API_KEY,
});

const message = await client.messages.create({
  model: 'stratus-x1ac-small-gpt-4o',
  max_tokens: 1024,
  system: 'Current state: User is on the homepage with login button visible',
  messages: [
    { role: 'user', content: 'What happens if I click login?' }
  ],
});

console.log(message.content[0].text);
```
Streaming Response
```typescript
const stream = await client.messages.create({
  model: 'stratus-x1ac-small-gpt-4o',
  max_tokens: 1024,
  stream: true,
  messages: [
    { role: 'user', content: 'Predict the next 3 actions' }
  ],
});

for await (const event of stream) {
  if (event.type === 'content_block_delta') {
    process.stdout.write(event.delta.text);
  }
}
```
Conversation with History
```typescript
const conversation = [
  { role: 'user', content: 'What is the current page?' },
  { role: 'assistant', content: 'You are on the product listing page.' },
  { role: 'user', content: 'If I search for "laptop", what will I see?' }
];

const message = await client.messages.create({
  model: 'stratus-x1ac-small-gpt-4o',
  max_tokens: 1024,
  messages: conversation,
});

console.log(message.content[0].text);
```
Tool Use
The Messages endpoint supports full tool use and function calling.
```typescript
const message = await client.messages.create({
  model: 'stratus-x1ac-small-gpt-4o',
  max_tokens: 1024,
  tools: [
    {
      name: 'get_weather',
      description: 'Get weather for a location',
      input_schema: {
        type: 'object',
        properties: {
          location: {
            type: 'string',
            description: 'City name'
          },
          unit: {
            type: 'string',
            enum: ['celsius', 'fahrenheit'],
            description: 'Temperature unit'
          }
        },
        required: ['location']
      }
    }
  ],
  messages: [
    { role: 'user', content: 'What is the weather in San Francisco?' }
  ],
});

// Check if a tool was used
if (message.stop_reason === 'tool_use') {
  const toolUse = message.content.find(block => block.type === 'tool_use');
  console.log('Tool:', toolUse.name);
  console.log('Input:', toolUse.input);
}
```
```typescript
// Auto - model decides
tool_choice: { type: 'auto' }

// Any - model must use a tool
tool_choice: { type: 'any' }

// Specific tool
tool_choice: { type: 'tool', name: 'get_weather' }
```
```json
{
  "id": "msg_01ABC",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "I'll check the weather for you."
    },
    {
      "type": "tool_use",
      "id": "toolu_01XYZ",
      "name": "get_weather",
      "input": {
        "location": "San Francisco",
        "unit": "celsius"
      }
    }
  ],
  "stop_reason": "tool_use"
}
```
```typescript
// 1. Get tool use from the response
const toolUse = message.content.find(block => block.type === 'tool_use');

// 2. Execute the tool
const weatherData = await getWeather(toolUse.input.location);

// 3. Send the result back
const followUp = await client.messages.create({
  model: 'stratus-x1ac-small-gpt-4o',
  max_tokens: 1024,
  tools: [ ... ], // same tools
  messages: [
    { role: 'user', content: 'What is the weather in San Francisco?' },
    { role: 'assistant', content: message.content },
    {
      role: 'user',
      content: [
        {
          type: 'tool_result',
          tool_use_id: toolUse.id,
          content: JSON.stringify(weatherData)
        }
      ]
    }
  ],
});
```
Streaming Events
When `stream: true` is set, you receive a sequence of events:

| Event Type | Description |
|---|---|
| `message_start` | Stream begins, includes initial message metadata |
| `content_block_start` | New content block starts |
| `content_block_delta` | Incremental content (text or tool input) |
| `content_block_stop` | Content block complete |
| `message_delta` | Message-level updates (e.g., usage) |
| `message_stop` | Stream complete |
Example stream:

```jsonc
// message_start
{ "type": "message_start", "message": { "id": "msg_01", "role": "assistant", ... }}

// content_block_start
{ "type": "content_block_start", "index": 0, "content_block": { "type": "text", "text": "" }}

// content_block_delta (multiple)
{ "type": "content_block_delta", "index": 0, "delta": { "type": "text_delta", "text": "The" }}
{ "type": "content_block_delta", "index": 0, "delta": { "type": "text_delta", "text": " login" }}
{ "type": "content_block_delta", "index": 0, "delta": { "type": "text_delta", "text": " button" }}

// content_block_stop
{ "type": "content_block_stop", "index": 0 }

// message_stop
{ "type": "message_stop" }
```
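To rebuild the full response text client-side, accumulate the `text_delta` payloads as they arrive. A minimal sketch (text deltas only; tool-input deltas and other event types are ignored):

```typescript
// Minimal sketch: reconstruct the response text from a sequence of
// streaming events by appending each text_delta in order.

type StreamEvent =
  | { type: 'message_start' }
  | { type: 'content_block_start'; index: number }
  | { type: 'content_block_delta'; index: number; delta: { type: 'text_delta'; text: string } }
  | { type: 'content_block_stop'; index: number }
  | { type: 'message_stop' };

function accumulateText(events: StreamEvent[]): string {
  let text = '';
  for (const event of events) {
    if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
      text += event.delta.text;
    }
  }
  return text;
}
```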
Comparison: Messages vs Chat Completions
| Feature | /v1/messages (Anthropic) | /v1/chat/completions (OpenAI) |
|---|---|---|
| SDK | Anthropic SDK | OpenAI SDK |
| Format | Anthropic Messages API | OpenAI Chat Completions |
| max_tokens | Required | Optional (defaults to model max) |
| Streaming | SSE events | SSE events |
| Tools | `tools` with `input_schema` | `tools` with function schema |
| Auth Header | `x-api-key` | `Authorization: Bearer` |
When to use Messages:
You’re already using Anthropic SDK in your codebase
You prefer Anthropic’s API conventions
You need Anthropic-specific features
When to use Chat Completions:
You’re using OpenAI SDK
You want the simpler, more common format
You’re familiar with OpenAI’s API
Error Handling
Errors follow Anthropic’s error format:
```json
{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "max_tokens is required"
  }
}
```
Common error types:
- `invalid_request_error`: Malformed request
- `authentication_error`: Invalid API key
- `permission_error`: Insufficient permissions
- `not_found_error`: Model not found
- `rate_limit_error`: Rate limit exceeded
- `api_error`: Server error
See Errors for full error documentation.
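When handling errors programmatically, a common pattern is deciding whether a failure is worth retrying. An illustrative classification based on the error types above (this policy is a suggestion, not part of the API):

```typescript
// Illustrative retry policy based on the Anthropic-format error body.
// Rate limits and server errors are treated as transient here; the other
// error types indicate a problem in the request itself.

type ApiErrorBody = {
  type: 'error';
  error: { type: string; message: string };
};

function isRetryable(body: ApiErrorBody): boolean {
  return body.error.type === 'rate_limit_error' || body.error.type === 'api_error';
}
```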
Best Practices
1. Always Set max_tokens
Unlike OpenAI’s API, Anthropic format requires max_tokens:
```typescript
// ❌ Will fail
await client.messages.create({
  model: 'stratus-x1ac-small-gpt-4o',
  messages: [ ... ]
});

// ✅ Correct
await client.messages.create({
  model: 'stratus-x1ac-small-gpt-4o',
  max_tokens: 1024,
  messages: [ ... ]
});
```
2. Use System Prompt for State
Provide current state in the system parameter:
```typescript
const message = await client.messages.create({
  model: 'stratus-x1ac-small-gpt-4o',
  max_tokens: 1024,
  system: `Current state:
- Page: Checkout flow, step 2 of 3
- Cart: 2 items, $127.50 total
- User: Logged in, shipping address entered
- Next action options: proceed to payment, edit cart, apply coupon`,
  messages: [
    { role: 'user', content: 'Should I proceed to payment or review my cart?' }
  ],
});
```
3. Handle Streaming Properly
```typescript
try {
  const stream = await client.messages.create({
    model: 'stratus-x1ac-small-gpt-4o',
    max_tokens: 1024,
    stream: true,
    messages: [ ... ]
  });

  let fullText = '';
  for await (const event of stream) {
    if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
      fullText += event.delta.text;
      process.stdout.write(event.delta.text);
    }
  }

  console.log('\n\nFull response:', fullText);
} catch (error) {
  console.error('Stream error:', error);
}
```
4. Validate Tool Inputs
When using tools, validate inputs before execution:
```typescript
const toolUse = message.content.find(block => block.type === 'tool_use');

if (toolUse) {
  // Validate against the schema
  if (!toolUse.input.location) {
    throw new Error('Missing required parameter: location');
  }

  // Execute safely
  const result = await executeToolSafely(toolUse.name, toolUse.input);
}
```
Migration from Anthropic API
Switching from Anthropic’s API to Stratus is simple:
```diff
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
-  apiKey: process.env.ANTHROPIC_API_KEY,
+  baseURL: 'https://api.stratus.run',
+  apiKey: process.env.STRATUS_API_KEY,
});

const message = await client.messages.create({
-  model: 'claude-3-5-sonnet-20241022',
+  model: 'stratus-x1ac-small-gpt-4o',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
});
```
That’s it! All your existing code continues to work.
Performance

| Metric | Value |
|---|---|
| Latency | Similar to /v1/chat/completions |
| Streaming | <100ms time to first token |
| Throughput | Same limits as other endpoints |
| Format Conversion | <1ms overhead (negligible) |
Format conversion between Anthropic and OpenAI formats adds negligible overhead (<1ms).
**SDK Support**: Works seamlessly with the official Anthropic SDK for TypeScript and Python. No custom SDK needed.