Web navigation is where Stratus delivers its clearest wins. A standard LLM agent guesses each step from context. Stratus encodes the current page state into a 768-dimensional embedding, predicts the state that will follow each action, and then hands a focused execution prompt to your LLM, using 68% fewer tokens with 2–3x higher task success.

10/10 Levels

Stratus-powered agents completed every level in our benchmark. Baseline: 4/10.

2.3× Score

8717 vs 3750 total points. The gap widens on tasks with cascading state transitions.

68% Fewer Tokens

World model compression keeps context tight — your LLM sees a focused plan, not raw page noise.

What We’re Building

A hotel booking agent that navigates the full flow:
  1. Search for “NYC hotels December 15–18”
  2. Filter by rating and price
  3. Select a hotel and check availability
  4. Fill in guest details and proceed to checkout
Each step transitions the page state. Stratus predicts those transitions upfront and embeds them into the execution prompt — so your LLM doesn’t get lost between steps.

Setup

npm install openai dotenv

# .env
STRATUS_API_KEY=stratus_sk_live_your-key-here

The Agent

import 'dotenv/config'; // loads STRATUS_API_KEY from .env into process.env
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.stratus.run/v1',
  apiKey: process.env.STRATUS_API_KEY
});

interface NavigationStep {
  state: string;
  goal: string;
}

async function navigate(step: NavigationStep) {
  const response = await client.chat.completions.create({
    model: 'stratus-x1ac-base-gpt-4o',
    messages: [
      { role: 'system', content: `Current state: ${step.state}` },
      { role: 'user', content: step.goal }
    ]
  });

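  // `stratus` is a Stratus extension that the OpenAI SDK types don't declare,
  // so cast to read it. Field types are inferred from how they are used below.
  const { action_sequence, confidence, planning_time_ms } = (response as unknown as {
    stratus: { action_sequence: string[]; confidence: number; planning_time_ms: number };
  }).stratus;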

  return {
    action: response.choices[0].message.content,
    plan: action_sequence,
    confidence
  };
}

async function bookHotel() {
  const flow: NavigationStep[] = [
    {
      state: 'Google homepage. Search box visible and active.',
      goal: 'Search for "NYC hotels December 15-18"'
    },
    {
      state: 'Google search results. Hotel listings visible. Filter panel on left: Price, Rating, Amenities.',
      goal: 'Filter by 4+ star rating and sort by price low to high'
    },
    {
      state: 'Filtered results. Top result: "The Manhattan Hotel" $189/night, 4.7 stars, "Check Availability" button visible.',
      goal: 'Click Check Availability for The Manhattan Hotel'
    },
    {
      state: 'Hotel availability page. Date picker shows Dec 15-18 pre-filled. Room types: Standard ($189), Deluxe ($249), Suite ($399). "Book Now" buttons next to each.',
      goal: 'Select the Standard room and click Book Now'
    },
    {
      state: 'Booking form. Fields: First Name, Last Name, Email, Phone. "Continue to Payment" button at bottom.',
      goal: 'Fill in guest details: John Smith, john@example.com, +1-555-0100'
    }
  ];

  for (const [i, step] of flow.entries()) {
    const result = await navigate(step);
    console.log(`Step ${i + 1}: ${result.action}`);
    console.log(`  Plan: ${result.plan.join(' → ')}`);
    console.log(`  Confidence: ${result.confidence}\n`);
  }
}

// Surface API or network errors instead of leaving an unhandled rejection.
bookHotel().catch(console.error);

State Description Quality

The quality of your system message is the single biggest lever for agent performance. A vague description such as the following leaves Stratus with little to work with:
{ "role": "system", "content": "hotel website" }
Confidence: ~0.58. Stratus can’t predict the next state without knowing what’s visible.
Include: what UI elements are visible, current values of interactive fields, and what is NOT yet visible. Negative state is just as important as positive.
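One way to keep state descriptions consistent is to build them from a small structured snapshot. The sketch below is illustrative only: `PageSnapshot` and `describeState` are hypothetical names, not part of the Stratus API, and the sentence layout is an assumption about what reads well in a system message.

```typescript
// Hypothetical helper, not part of the Stratus API.
interface PageSnapshot {
  page: string;                         // short name of the current page
  visible: string[];                    // UI elements currently on screen
  fieldValues: Record<string, string>;  // current values of interactive fields
  notYetVisible: string[];              // elements expected later but absent now
}

// Turns a snapshot into a single state string, skipping empty sections.
function describeState(s: PageSnapshot): string {
  const parts: string[] = [`${s.page}.`];
  if (s.visible.length > 0) {
    parts.push(`Visible: ${s.visible.join(', ')}.`);
  }
  const fields = Object.entries(s.fieldValues).map(
    ([name, value]) => `${name} = "${value}"`,
  );
  if (fields.length > 0) {
    parts.push(`Field values: ${fields.join(', ')}.`);
  }
  if (s.notYetVisible.length > 0) {
    parts.push(`Not yet visible: ${s.notYetVisible.join(', ')}.`);
  }
  return parts.join(' ');
}

const state = describeState({
  page: 'Hotel availability page',
  visible: ['date picker', 'room type list', '"Book Now" buttons'],
  fieldValues: { 'Check-in': 'Dec 15', 'Check-out': 'Dec 18' },
  notYetVisible: ['guest details form', 'payment form'],
});
console.log(state);
```

The resulting string can be passed as the `state` of a navigation step. Listing the negative state explicitly makes it harder to forget.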

Handling Low Confidence

When confidence drops below 0.75, the agent has hit a state it can’t predict well. Don’t retry blindly: inspect the state and add more detail.
// Inside the for loop in bookHotel(), where `i` and `step` are in scope.
const result = await navigate(step);

// navigate() returns the score as `confidence`.
if (result.confidence < 0.75) {
  console.warn(`Low confidence at step ${i + 1}: ${result.confidence}`);
  console.warn('Add more state detail:', step.state);
  // Optionally: request a screenshot, capture DOM state, or ask user
}
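The "add detail, then try again" advice can be sketched as a small wrapper. Everything here is an assumption for illustration: `navigateWithDetail` and `captureDetail` are hypothetical, the result is assumed to expose a numeric `confidence`, and the cap of two enrichment rounds is arbitrary. The navigation function is passed in so the wrapper does not depend on any particular client.

```typescript
// Hypothetical shapes that mirror the step and result used in this guide.
interface Step { state: string; goal: string; }
interface Result { action: string; confidence: number; }

type Navigate = (step: Step) => Promise<Result>;
// Returns extra state detail, e.g. a DOM summary or an answer from the user.
type CaptureDetail = (step: Step) => Promise<string>;

const CONFIDENCE_THRESHOLD = 0.75;

// Re-plans only after the state description has been enriched,
// so the same vague state is never resubmitted unchanged.
async function navigateWithDetail(
  step: Step,
  navigate: Navigate,
  captureDetail: CaptureDetail,
  maxEnrichments = 2,
): Promise<Result> {
  let current = step;
  let result = await navigate(current);
  for (
    let n = 0;
    n < maxEnrichments && result.confidence < CONFIDENCE_THRESHOLD;
    n++
  ) {
    const extra = await captureDetail(current);
    current = { ...current, state: `${current.state} ${extra}` };
    result = await navigate(current);
  }
  return result; // may still be below the threshold; the caller decides what to do
}
```

If the returned confidence is still low after the enrichment rounds, handing control to a person is probably safer than acting on the plan.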

Choosing a Model

| Model | Best For | Latency |
| --- | --- | --- |
| stratus-x1ac-small-gpt-4o-mini | Prototyping, simple linear flows | Fastest |
| stratus-x1ac-small-gpt-4o | Most navigation tasks | Fast |
| stratus-x1ac-base-gpt-4o | Complex multi-step, forms with dependencies | Moderate |
| stratus-x1ac-base-claude-sonnet-4-5 | Long-context flows, detailed reasoning | Moderate |
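If the model is chosen in code, the guidance in the model list above could be encoded roughly as follows. The model IDs are taken from that list; the `FlowProfile` fields and the order of the checks are assumptions made for this sketch, not documented Stratus behavior.

```typescript
// Hypothetical description of a navigation flow.
interface FlowProfile {
  prototyping: boolean;        // quick experiments, simple linear flows
  hasDependentForms: boolean;  // complex multi-step flows, forms with dependencies
  longContext: boolean;        // long flows that need detailed reasoning
}

// Picks a model ID; the most demanding requirement wins.
function pickModel(p: FlowProfile): string {
  if (p.longContext) return 'stratus-x1ac-base-claude-sonnet-4-5';
  if (p.hasDependentForms) return 'stratus-x1ac-base-gpt-4o';
  if (p.prototyping) return 'stratus-x1ac-small-gpt-4o-mini';
  return 'stratus-x1ac-small-gpt-4o'; // default for most navigation tasks
}

const model = pickModel({
  prototyping: false,
  hasDependentForms: true,
  longContext: false,
});
console.log(model);
```

The returned string can be passed as the `model` field of the chat completions request.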

Next Steps

Cascade Prediction

When actions trigger downstream effects — handle chains before they fire.

Temporal Sequencing

Order-sensitive workflows with concurrency constraints.

API Reference

Full chat completions docs and all parameters.