Build MCP environments that wrap any software for agent interaction. Think of it in three phases:
  • Phase 1: Environment - Wrap software in MCP tools
  • Phase 2: Tasks - Define evaluation scenarios
  • Phase 3: Agents - Run evaluations and training

Phase 1 · Create a project (2 min)

# Pick a template: blank, deep-research, browser
hud init my-env
cd my-env
Start development servers:
# Terminal 1 - Environment backend
cd environment && uv run uvicorn server:app --reload

# Terminal 2 - MCP server  
cd server && uv run hud dev

Edit-save-test flow

  1. Open server/tools.py, add or tweak a tool.
  2. Save; the MCP server restarts instantly.
  3. Visit http://localhost:8765/docs to test your tools.
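As a rough illustration of step 1, a tool can start life as a plain, typed Python function with a docstring and a JSON-serializable return value. The name `word_count` is hypothetical, and how the function gets registered as an MCP tool depends on the template you picked, so registration is deliberately left out of this sketch.

```python
# Hypothetical tool logic for server/tools.py.
# Registration with the MCP server is template-specific and not shown here.

def word_count(text: str) -> dict:
    """Count the words and characters in a piece of text."""
    words = text.split()  # split on any run of whitespace
    return {"words": len(words), "characters": len(text)}


# Quick local check before trying the tool through the dev server.
print(word_count("hello agent world"))
```

Keeping the logic in a plain function like this makes it easy to test on its own before exercising it through the /docs page.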

Phase 2 · Write Tasks (2 min)

Build your environment image first (from the top-level project folder):
hud build
Create tasks.json with an mcp_config that launches the image via docker run:
{
  "prompt": "Complete task",
  "mcp_config": {
    "local": {
      "command": "docker",
      "args": ["run", "--rm", "-i", "my-env:0.1.0"]
    }
  },
  ...your setup and evaluation tools
}
See Task System or the hud init README for details.
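If you would rather generate the task definition than hand-edit it, a minimal sketch along these lines might help. It reproduces only the fields shown above (prompt and mcp_config); the setup and evaluation entries are left out because they are specific to your environment. The helper names `make_task` and `write_task_file` are hypothetical, not part of the hud CLI or SDK.

```python
import json
from pathlib import Path


def make_task(prompt: str, image: str) -> dict:
    """Build a task dict that launches the environment image via docker run."""
    return {
        "prompt": prompt,
        "mcp_config": {
            "local": {
                "command": "docker",
                # -i keeps stdin open for the container, --rm cleans it up on exit
                "args": ["run", "--rm", "-i", image],
            }
        },
        # Setup and evaluation tool entries would be added here.
    }


def write_task_file(path: str, task: dict) -> None:
    """Write the task as pretty-printed JSON (hypothetical helper)."""
    Path(path).write_text(json.dumps(task, indent=2) + "\n")


task = make_task("Complete task", "my-env:0.1.0")
print(json.dumps(task, indent=2))
```

Calling `write_task_file("tasks.json", task)` would then save the result next to your environment.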

Phase 3 · Run Agents

# Test with agents
hud eval tasks.json

# Deploy to registry
hud push

Cheatsheet

Action            Command
Create env        hud init my-env -p blank
Hot-reload dev    hud dev --build
Interactive test  hud dev --interactive
Troubleshoot      hud debug my-env:dev
Build image       hud build
Push to registry  hud push

Have fun – and remember: stderr for logs, stdout for MCP!
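The reason for that rule: when the server is launched with docker run -i, MCP messages typically travel over the process's stdin and stdout, so a stray print to stdout can corrupt the protocol stream. A minimal sketch of stderr-only logging with Python's standard logging module follows; the logger name "my-env" and the helper `make_stderr_logger` are illustrative assumptions, not part of any template.

```python
import logging
import sys


def make_stderr_logger(name: str = "my-env") -> logging.Logger:
    """Return a logger whose only handler writes to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid stacking duplicate handlers on repeat calls
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    # Do not pass records to the root logger, which might be configured elsewhere.
    logger.propagate = False
    return logger


logger = make_stderr_logger()
logger.info("tool server starting")  # goes to stderr, stdout stays clean
```

Inside tools, prefer `logger.info(...)` over `print(...)` so that stdout carries nothing but MCP traffic.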

Available Environments

Browse ready-to-use environments and templates at hud.ai/environments.
Environment         Description
hud-blank           Minimal starter template
hud-browser         Browser automation with Playwright
hud-remote-browser  Cloud browser providers (Steel, Anchor, etc.)
hud-deepresearch    Deep research with web search
hud-rubrics         LLM-as-judge evaluations
coding-template     Full coding env with VNC, Postgres, Redis
Each environment is available as a GitHub template you can fork and customize.