HUD Documentation — Evaluations and RL Environments.

Tasks format

HUD tasksets can be provided in either of two formats:

A single JSON file containing a list of task objects (recommended)

[
  {
    "id": "browser_2048_128",
    "prompt": "Reach 128 in 2048.",
    "mcp_config": {
      "hud": {
        "url": "https://mcp.hud.ai/v3/mcp",
        "headers": {
          "Authorization": "Bearer ${HUD_API_KEY}",
          "Mcp-Image": "hudevals/hud-browser:0.1.3"
        }
      }
    },
    "setup_tool": {"name": "launch_app", "arguments": {"app_name": "2048"}},
    "evaluate_tool": {"name": "evaluate", "arguments": {"name": "game_2048_max_number", "arguments": {"target": 128}}}
  }
]

Save as 2048-basic.json and run:

hud eval 2048-basic.json
hud rl 2048-basic.json
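When a taskset contains several variants of the same task, it can be easier to generate the JSON file than to write it by hand. The following is a minimal sketch, not part of the HUD SDK: the helper names make_2048_task, build_taskset, and write_taskset are hypothetical, and the task structure simply mirrors the example above with a different target tile.

```python
import json


def make_2048_task(target: int) -> dict:
    """Build one task object shaped like the example above (hypothetical helper)."""
    return {
        "id": f"browser_2048_{target}",
        "prompt": f"Reach {target} in 2048.",
        "mcp_config": {
            "hud": {
                "url": "https://mcp.hud.ai/v3/mcp",
                "headers": {
                    # The placeholder is written out literally, as in the example.
                    "Authorization": "Bearer ${HUD_API_KEY}",
                    "Mcp-Image": "hudevals/hud-browser:0.1.3",
                },
            }
        },
        "setup_tool": {"name": "launch_app", "arguments": {"app_name": "2048"}},
        "evaluate_tool": {
            "name": "evaluate",
            "arguments": {
                "name": "game_2048_max_number",
                "arguments": {"target": target},
            },
        },
    }


def build_taskset(targets) -> list:
    """Return a list of task objects, one per target tile."""
    return [make_2048_task(t) for t in targets]


def write_taskset(tasks: list, path: str) -> None:
    """Write the tasks as a single JSON file containing a list."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(tasks, f, indent=2)
```

Calling write_taskset(build_taskset([64, 128, 256]), "2048-basic.json") would produce a file in the single-JSON format shown above, which can then be passed to hud eval or hud rl.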

A JSONL file with one task object per line

In both formats, each task object uses the following fields:

prompt: instruction for the agent
mcp_config: where to run the environment (local docker or remote MCP)
setup_tool (optional): a tool call to prepare the environment
evaluate_tool: a tool call to compute reward
system_prompt (optional): extra guidance for the agent
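The two formats carry the same task objects, so converting between them is mechanical. The sketch below shows one way to do it with the standard library, along with a basic field check. It assumes that every field not marked optional in the list above (prompt, mcp_config, evaluate_tool) is required; the function names are hypothetical and not part of the HUD CLI or SDK.

```python
import json

# Assumption: fields not marked "(optional)" in the list above are required.
REQUIRED_FIELDS = ("prompt", "mcp_config", "evaluate_tool")


def missing_fields(task: dict) -> list:
    """Return the names of required fields that are absent from a task."""
    return [name for name in REQUIRED_FIELDS if name not in task]


def to_jsonl(tasks: list) -> str:
    """Serialize a list of task objects as JSONL: one compact JSON object per line."""
    return "\n".join(json.dumps(task) for task in tasks) + "\n"


def from_jsonl(text: str) -> list:
    """Parse JSONL text back into a list of task objects, skipping blank lines."""
    return [json.loads(line) for line in text.split("\n") if line.strip()]
```

Running missing_fields over every task before launching an evaluation is a cheap way to catch a malformed entry early.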

Hosting on HuggingFace

You can host tasksets on the Hugging Face Hub and fetch them with:

hud get hud-evals/2048-basic

The command downloads the JSONL task file into your project directory. You can then evaluate the full dataset or train on it with:

hud eval hud-evals/2048-basic
hud rl hud-evals/2048-basic

Tips

Keep tasks self-contained; use setup_tool to open apps or load data
Ensure evaluate_tool returns a numeric reward per episode
Use small task counts to iterate quickly; scale up once stable
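To iterate on a small task count, one option is to draw a fixed-seed subset of a larger taskset and write it to its own file. This is a minimal sketch under that assumption; sample_tasks is a hypothetical helper and not a HUD feature.

```python
import random


def sample_tasks(tasks: list, k: int, seed: int = 0) -> list:
    """Return k tasks chosen reproducibly; return all tasks if k covers the set."""
    if k >= len(tasks):
        return list(tasks)
    rng = random.Random(seed)  # a fixed seed keeps the subset stable across runs
    return rng.sample(tasks, k)
```

Using the same seed on every run keeps the subset stable, so changes in reward reflect changes to the agent or environment rather than a different draw of tasks.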

Agent Evals

Learn how to run benchmarks

Environment Spec

Deep-dive into MCP configs and tools
