HUD Documentation — Evaluations and RL Environments.

Build MCP environments that wrap any software for agent interaction. Think of it in three phases:

Phase 1: Environment - Wrap software in MCP tools
Phase 2: Tasks - Define evaluation scenarios
Phase 3: Agents - Run evaluations and training

Phase 1 · Create a project (2 min)

# Pick a template: blank, deep-research, browser
hud init my-env
cd my-env

Start development servers:

# Terminal 1 - Environment backend
cd environment && uv run uvicorn server:app --reload

# Terminal 2 - MCP server  
cd server && uv run hud dev

Edit-save-test flow

Open server/tools.py and add or tweak a tool.
Save the file. The MCP server restarts right away.
Visit http://localhost:8765/docs to test your tools.
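
After a save it can help to confirm that the docs page is answering again before testing a tool by hand. The following is a minimal sketch using only the Python standard library; the URL comes from the step above, while the function name `docs_page_is_up` and the two-second timeout are illustrative assumptions, not part of the HUD tooling.

```python
import http.client
import urllib.error
import urllib.request

DOCS_URL = "http://localhost:8765/docs"  # URL taken from the step above


def docs_page_is_up(url: str = DOCS_URL, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx status, False otherwise.

    Hypothetical helper for illustration; not provided by the hud CLI.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, HTTP error status, malformed reply.
        return False


if __name__ == "__main__":
    print("docs page reachable:", docs_page_is_up())
```

Run it from a third terminal while the two development servers are running; it only reads the page and does not call any tool.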

Phase 2 · Write Tasks (2 min)

Build your environment image first (run this from the top-level project folder):

hud build

Create tasks.json with an mcp_config that starts the image through docker run:

{
  "prompt": "Complete task",
  "mcp_config": {
    "local": {
      "command": "docker",
      "args": ["run", "--rm", "-i", "my-env:0.1.0"]
    }
  },
  ...your setup and evaluation tools
}

See Task System or the hud init README for details.
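
If you prefer to generate the file rather than write it by hand, the structure shown above can be produced with a few lines of Python. This is a sketch under stated assumptions: it emits only the two fields visible in the example (`prompt` and `mcp_config`), the helper names `build_task` and `write_task` are hypothetical, and the setup and evaluation tool entries are left out because their shape is described in Task System rather than here.

```python
import json
from pathlib import Path


def build_task(prompt: str, image: str) -> dict:
    """Build one task dict with the two fields shown in the example above."""
    return {
        "prompt": prompt,
        "mcp_config": {
            "local": {
                "command": "docker",
                "args": ["run", "--rm", "-i", image],
            }
        },
        # Setup and evaluation tool entries are intentionally omitted;
        # add them as described in Task System or the hud init README.
    }


def write_task(path: Path, task: dict) -> None:
    """Serialize the task as pretty-printed JSON."""
    path.write_text(json.dumps(task, indent=2) + "\n", encoding="utf-8")


if __name__ == "__main__":
    write_task(Path("tasks.json"), build_task("Complete task", "my-env:0.1.0"))
```

Building the dict in code keeps the image tag in one place, which is convenient when the tag changes after a rebuild.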

Phase 3 · Run Agents

# Test with agents
hud eval tasks.json

# Deploy to registry
hud push
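
Before running an evaluation, a quick local check of tasks.json can catch malformed JSON or a missing field early. The sketch below is not part of the hud CLI. It assumes the file holds either a single task object, as in the Phase 2 example, or a list of such objects, and it checks only for the two keys that example shows. The name `find_problems` is hypothetical.

```python
import json
from pathlib import Path

REQUIRED_KEYS = ("prompt", "mcp_config")  # the two keys visible in the example


def find_problems(path: Path) -> list[str]:
    """Return human-readable problems; an empty list means nothing was found."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:
        # Missing file, unreadable file, or invalid JSON.
        return [f"cannot read {path}: {exc}"]

    # Assumption: accept a single task object or a list of task objects.
    tasks = data if isinstance(data, list) else [data]
    problems = []
    for index, task in enumerate(tasks):
        if not isinstance(task, dict):
            problems.append(f"task {index}: expected an object")
            continue
        for key in REQUIRED_KEYS:
            if key not in task:
                problems.append(f"task {index}: missing '{key}'")
    return problems


if __name__ == "__main__":
    for problem in find_problems(Path("tasks.json")):
        print(problem)
```

An empty output means the basic shape is in place; it says nothing about whether the setup and evaluation tools themselves are correct.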

Cheatsheet

Action	Command
Create env	`hud init my-env -p blank`
Hot-reload dev	`hud dev --build`
Interactive test	`hud dev --interactive`
Troubleshoot	`hud debug my-env:dev`
Build image	`hud build`
Push to registry	`hud push`

Learn more →

Technical spec: Environment Spec
CLI reference: CLI Overview

Have fun – and remember: stderr for logs, stdout for MCP!
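
The stderr rule matters because an MCP server started over stdio (as with the docker run -i configuration in Phase 2) uses stdout for protocol messages, so a stray print there can corrupt the stream. One way to follow the rule in Python is to route all logging to stderr, sketched here with the standard logging module. The logger name `my_env` and the function name `configure_logging` are illustrative assumptions.

```python
import logging
import sys


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    """Return a logger that writes only to stderr, leaving stdout untouched."""
    logger = logging.getLogger("my_env")  # hypothetical logger name
    logger.setLevel(level)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    logger.propagate = False  # keep records away from root handlers
    return logger


if __name__ == "__main__":
    log = configure_logging()
    log.info("tool server starting")  # written to stderr, not stdout
```

Inside tool code, call `log.info(...)` or `log.debug(...)` instead of `print(...)`. If a print is unavoidable, `print(..., file=sys.stderr)` keeps it off stdout.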

Available Environments

Browse ready-to-use environments and templates at hud.ai/environments.

Environment	Description
`hud-blank`	Minimal starter template
`hud-browser`	Browser automation with Playwright
`hud-remote-browser`	Cloud browser providers (Steel, Anchor, etc.)
`hud-deepresearch`	Deep research with web search
`hud-rubrics`	LLM-as-judge evaluations
`coding-template`	Full coding env with VNC, Postgres, Redis

Each environment is available as a GitHub template you can fork and customize.
