Web Navigation Agent

Web navigation is where Stratus delivers its clearest wins. A standard LLM agent guesses each step from context. Stratus encodes the current page state into a 768-dim embedding, predicts what the next state will look like after each action, then hands a focused execution prompt to your LLM — using 68% fewer tokens with 2–3x higher task success.

10/10 Levels

Stratus-powered agents completed every level in our benchmark. Baseline: 4/10.

2.3× Score

8717 vs 3750 total points. The gap widens on tasks with cascading state transitions.

68% Fewer Tokens

World model compression keeps context tight — your LLM sees a focused plan, not raw page noise.

What We’re Building

A hotel booking agent that navigates the full flow:

Search for “NYC hotels December 15–18”
Filter by rating and price
Select a hotel and check availability
Fill in guest details and proceed to checkout

Each step transitions the page state. Stratus predicts those transitions upfront and embeds them into the execution prompt — so your LLM doesn’t get lost between steps.

Setup

npm install openai dotenv

# .env
STRATUS_API_KEY=stratus_sk_live_your-key-here

The Agent

import 'dotenv/config'; // loads STRATUS_API_KEY from .env into process.env
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.stratus.run/v1',
  apiKey: process.env.STRATUS_API_KEY
});

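interface NavigationStep {
  state: string; // what is currently visible on the page
  goal: string;  // what this step should accomplish
}

// Stratus metadata attached to the response. Field names come from this guide;
// the types are inferred from how the fields are used below.
interface StratusMeta {
  action_sequence: string[];
  confidence: number;
  planning_time_ms: number;
}

async function navigate(step: NavigationStep) {
  const response = await client.chat.completions.create({
    model: 'stratus-x1ac-base-gpt-4o',
    messages: [
      { role: 'system', content: `Current state: ${step.state}` },
      { role: 'user', content: step.goal }
    ]
  });

  // The OpenAI SDK types do not declare the `stratus` field, so cast to read it.
  const { action_sequence, confidence } =
    (response as unknown as { stratus: StratusMeta }).stratus;

  return {
    action: response.choices[0].message.content,
    plan: action_sequence,
    confidence
  };
}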

async function bookHotel() {
  const flow: NavigationStep[] = [
    {
      state: 'Google homepage. Search box visible and active.',
      goal: 'Search for "NYC hotels December 15-18"'
    },
    {
      state: 'Google search results. Hotel listings visible. Filter panel on left: Price, Rating, Amenities.',
      goal: 'Filter by 4+ star rating and sort by price low to high'
    },
    {
      state: 'Filtered results. Top result: "The Manhattan Hotel" $189/night, 4.7 stars, "Check Availability" button visible.',
      goal: 'Click Check Availability for The Manhattan Hotel'
    },
    {
      state: 'Hotel availability page. Date picker shows Dec 15-18 pre-filled. Room types: Standard ($189), Deluxe ($249), Suite ($399). "Book Now" buttons next to each.',
      goal: 'Select the Standard room and click Book Now'
    },
    {
      state: 'Booking form. Fields: First Name, Last Name, Email, Phone. "Continue to Payment" button at bottom.',
      goal: 'Fill in guest details: John Smith, john@example.com, +1-555-0100'
    }
  ];

  for (const [i, step] of flow.entries()) {
    const result = await navigate(step);
    console.log(`Step ${i + 1}: ${result.action}`);
    console.log(`  Plan: ${result.plan.join(' → ')}`);
    console.log(`  Confidence: ${result.confidence}\n`);
  }
}

bookHotel().catch(console.error);

State Description Quality

The quality of your system message is the single biggest lever for agent performance.

Low Quality

{ "role": "system", "content": "hotel website" }

Confidence: ~0.58. Stratus can’t predict the next state without knowing what’s visible.

High Quality

{
  "role": "system",
  "content": "Hotel availability page. Dec 15-18 pre-filled. Standard room $189/night visible. 'Book Now' button below each room type. Stripe payment form not yet visible."
}

Confidence: ~0.94. Stratus knows exactly what transition to predict and which actions to plan.

Include: what UI elements are visible, current values of interactive fields, and what is NOT yet visible. Negative state is just as important as positive.
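One way to keep state descriptions consistently detailed is to build them from structured data rather than writing them by hand. The sketch below is only an illustration: the `PageState` shape and the `describeState` helper are hypothetical names introduced here, not part of the Stratus API.

```typescript
// Hypothetical shape: these field names are illustrative, not part of the Stratus API.
interface PageState {
  page: string;                               // short name of the current page
  visible: string[];                          // UI elements currently on screen
  fields?: { name: string; value: string }[]; // current values of interactive fields
  notYetVisible?: string[];                   // elements that have not appeared yet
}

// Turn structured page data into a single system-message string.
function describeState(s: PageState): string {
  const parts: string[] = [`${s.page}.`];
  for (const el of s.visible) parts.push(`${el} visible.`);
  for (const f of s.fields ?? []) parts.push(`${f.name}: ${f.value}.`);
  // Negative state is stated explicitly, as the guide recommends.
  for (const el of s.notYetVisible ?? []) parts.push(`${el} not yet visible.`);
  return parts.join(' ');
}

const state = describeState({
  page: 'Hotel availability page',
  visible: ['Standard room $189/night', "'Book Now' button below each room type"],
  fields: [{ name: 'Date picker', value: 'Dec 15-18' }],
  notYetVisible: ['Stripe payment form'],
});
console.log(state);
```

The resulting string can be passed as `step.state` to `navigate()`. Whether this exact phrasing yields the confidence levels quoted above is not something this sketch can promise.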

Handling Low Confidence

When confidence drops below 0.75, the agent has hit a state it can’t predict well. Don’t retry blindly; inspect the state and add more detail.

// Inside the for loop in bookHotel(), where i and step are in scope
const result = await navigate(step);

if (result.confidence < 0.75) {
  console.warn(`Low confidence at step ${i + 1}: ${result.confidence}`);
  console.warn('Add more state detail:', step.state);
  // Optionally: request a screenshot, capture DOM state, or ask user
}
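The "add more detail, then try again" pattern can be wrapped in a small helper. This is a minimal sketch under stated assumptions: `navigateWithDetail`, `Navigate`, and `EnrichState` are hypothetical names, the result type mirrors what `navigate()` returns in this guide, and how you obtain a richer state (DOM snapshot, screenshot caption, user input) is left to the `enrichState` callback.

```typescript
// Hypothetical types mirroring the guide's navigate() input and output.
interface NavigationStep { state: string; goal: string; }
interface NavigationResult { action: string | null; plan: string[]; confidence: number; }

type Navigate = (step: NavigationStep) => Promise<NavigationResult>;
// Returns a more detailed state description, e.g. from a DOM snapshot or the user.
type EnrichState = (step: NavigationStep) => Promise<string>;

async function navigateWithDetail(
  step: NavigationStep,
  navigate: Navigate,
  enrichState: EnrichState,
  threshold = 0.75, // the cutoff used in this guide
  maxAttempts = 2   // assumption: one enriched retry is enough for most steps
): Promise<NavigationResult> {
  let current = step;
  let result = await navigate(current);
  for (let attempt = 1; attempt < maxAttempts && result.confidence < threshold; attempt++) {
    // Never resend the same prompt: replace the state with a richer description first.
    current = { ...current, state: await enrichState(current) };
    result = await navigate(current);
  }
  // May still be below the threshold; the caller decides whether to stop or ask the user.
  return result;
}
```

Passing `navigate` and `enrichState` as arguments keeps the helper testable with stubs and avoids hard-coding how extra state is captured.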

Choosing a Model

Model	Best For	Latency
`stratus-x1ac-small-gpt-4o-mini`	Prototyping, simple linear flows	Fastest
`stratus-x1ac-small-gpt-4o`	Most navigation tasks	Fast
`stratus-x1ac-base-gpt-4o`	Complex multi-step, forms with dependencies	Moderate
`stratus-x1ac-base-claude-sonnet-4-5`	Long-context flows, detailed reasoning	Moderate
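If several agents share one codebase, the table above can be captured as a typed lookup so the model choice lives in one place. The profile names below are hypothetical labels invented for this sketch; only the model identifiers come from the table.

```typescript
// Hypothetical profile labels; the model IDs are the ones listed in the table above.
type FlowProfile = 'prototype' | 'general' | 'complex' | 'long-context';

const MODEL_FOR_PROFILE: Record<FlowProfile, string> = {
  prototype: 'stratus-x1ac-small-gpt-4o-mini',            // prototyping, simple linear flows
  general: 'stratus-x1ac-small-gpt-4o',                   // most navigation tasks
  complex: 'stratus-x1ac-base-gpt-4o',                    // multi-step flows, dependent forms
  'long-context': 'stratus-x1ac-base-claude-sonnet-4-5',  // long-context flows, detailed reasoning
};

function modelFor(profile: FlowProfile): string {
  return MODEL_FOR_PROFILE[profile];
}

console.log(modelFor('complex'));
```

The returned string can be used as the `model` argument in `client.chat.completions.create`.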

Next Steps

Cascade Prediction

When actions trigger downstream effects — handle chains before they fire.

Temporal Sequencing

Order-sensitive workflows with concurrency constraints.

API Reference

Full chat completions docs and all parameters.
