Skip to main content
Full interactive API docs with request/response schemas are available at api.hud.so/docs.
The HUD platform exposes a REST API for programmatically triggering scenario runs and retrieving trace results. All endpoints require an API key via HTTP Bearer authentication:
Authorization: Bearer sk-hud-your-api-key
Get your API key at hud.ai/settings/api-keys.

Run a Scenario

POST /agent/run triggers a scenario and optionally waits for the result.
curl -X POST https://api.hud.ai/agent/run \
  -H "Authorization: Bearer $HUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "env_name": "healthcare-admin-ui",
    "scenario_name": "healthcare-admin-ui:add-patient-condition",
    "scenario_args": {"patient_id": "P-007", "condition": "I10"},
    "model": "claude-sonnet-4-20250514",
    "max_steps": 50,
    "sync": true,
    "timeout": 300
  }'

Request Fields

FieldTypeDefaultDescription
env_namestringrequiredEnvironment name
scenario_namestringrequiredScenario to run (as defined in the environment)
scenario_argsobject{}Scenario argument values
modelstringrequiredModel name (e.g. gpt-4o, claude-sonnet-4-20250514)
max_stepsinteger100Maximum agent steps (1-500)
syncbooleantrueIf true, polls for completion and returns result. If false, returns immediately with trace_id
timeoutinteger300Timeout in seconds for sync mode (30-600s). Ignored if sync=false
evalset_idstringnullOptional evalset ID to create a task in
callbackobjectnullWebhook callback config (see Callbacks)

Response

{
  "trace_id": "030083f2-e046-4b20-a184-cfcf16cd8830",
  "trace_url": "https://hud.ai/trace/030083f2-e046-4b20-a184-cfcf16cd8830",
  "status": "completed",
  "reward": 0.6,
  "final_response": "I've added Essential hypertension (I10) to patient P-007.",
  "steps_taken": 4,
  "duration_seconds": 112.3,
  "error": null
}
When sync=false, only trace_id, trace_url, and status ("running") are returned. Use the trace read-back endpoints below to poll for the result.

Read Back Trace Results

After a run completes (or while polling with sync=false), use these endpoints to retrieve trace data.

Get Trace Details

GET /telemetry/traces/{trace_id} returns reward, status, scenario config, and optionally the full trajectory and logs.
curl -H "Authorization: Bearer $HUD_API_KEY" \
  "https://api.hud.ai/telemetry/traces/{trace_id}?include_trajectory=true"
Query Parameters:
ParameterTypeDefaultDescription
include_trajectorybooleanfalseInclude full trajectory (all agent actions and observations)
include_logsbooleanfalseInclude environment logs (container stdout/stderr)
include_rollout_logsbooleanfalseInclude orchestrator/worker logs
Response:
{
  "trace_id": "030083f2-e046-4b20-a184-cfcf16cd8830",
  "job_id": null,
  "status": "completed",
  "reward": 0.6,
  "error": null,
  "external_id": "0001",
  "scenario": "add-patient-condition",
  "scenario_args": [{"name": "patient_id", "value": "P-007"}],
  "prompt": "Add a new diagnosis of Essential hypertension...",
  "metadata": {},
  "trajectory": [...],
  "trajectory_length": 10
}

Get Full Trace Telemetry

GET /telemetry/trace/{trace_id} returns the full trace with signed screenshot URLs and complete trajectory steps.
curl -H "Authorization: Bearer $HUD_API_KEY" \
  "https://api.hud.ai/telemetry/trace/{trace_id}"
This is the heavier endpoint — it fetches all telemetry spans from storage, signs screenshot URLs, and returns the complete trajectory. Use GET /telemetry/traces/{trace_id} for a lighter read.

Batch Status Check

POST /telemetry/traces/status checks status for multiple traces in a single request. Useful for lightweight polling.
curl -X POST https://api.hud.ai/telemetry/traces/status \
  -H "Authorization: Bearer $HUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"trace_ids": ["trace-1", "trace-2", "trace-3"]}'
Response:
{
  "traces": {
    "trace-1": {"status": "completed", "full_trajectory": true},
    "trace-2": {"status": "running", "full_trajectory": false},
    "trace-3": {"status": "error", "full_trajectory": true}
  }
}

Task Management

Upload Tasks to a Taskset

POST /tasks/upload creates or updates tasks in a taskset (evalset). Tasks are resolved against their environment’s scenarios. If the taskset doesn’t exist, it’s created automatically.
curl -X POST https://api.hud.ai/tasks/upload \
  -H "Authorization: Bearer $HUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-checkout-taskset",
    "tasks": [
      {
        "slug": "checkout-laptop",
        "scenario": "checkout",
        "args": {"product_name": "Laptop"},
        "env": {"name": "my-store-env"}
      },
      {
        "slug": "checkout-phone",
        "scenario": "checkout",
        "args": {"product_name": "Phone"},
        "env": {"name": "my-store-env"}
      }
    ]
  }'

Request Fields

FieldTypeDescription
namestringTaskset name. Created if it doesn’t exist
tasksarrayList of tasks to upload
tasks[].slugstring | nullStable task identifier (stored as external_id). Used for upsert — if a task with this slug exists, it’s updated. Must not be a 4-digit number (reserved for auto-assigned IDs)
tasks[].scenariostringScenario name to run
tasks[].argsobjectScenario argument values
tasks[].envobjectEnvironment config (required)
tasks[].env.namestringEnvironment name (required for scenario resolution)
tasks[].env.includestring[]Tool whitelist (optional)
tasks[].env.excludestring[]Tool blacklist (optional)
tasks[].agent_configobject | nullAgent behavior overrides
tasks[].agent_config.system_promptstring | nullCustom system prompt
tasks[].validationarray | nullTool calls representing successful completion (stored in metadata)

Response

{
  "evalset_id": "a1b2c3d4-...",
  "evalset_name": "my-checkout-taskset",
  "tasks_created": 2,
  "tasks_updated": 0
}
Tasks with a matching slug in the same taskset are updated instead of duplicated.

Add Tasks by Evalset ID

POST /tasks/evalsets/{evalset_id}/tasks adds tasks to an existing taskset by its UUID. This endpoint uses the internal task format with explicit scenario_id references.
curl -X POST https://api.hud.ai/tasks/evalsets/{evalset_id}/tasks \
  -H "Authorization: Bearer $HUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "tasks": [
      {
        "prompt": "Add item to cart and complete checkout",
        "tier": 1,
        "scenario_id": "scenario-uuid-here",
        "scenario_args": [
          {"name": "product_name", "value": "Laptop"}
        ]
      }
    ]
  }'

Request Fields

FieldTypeDescription
tasksarrayList of tasks (min 1)
tasks[].promptstring | nullTask prompt
tasks[].tierintegerTask difficulty tier
tasks[].scenario_idstring | nullScenario UUID to run
tasks[].scenario_argsarray | null[{"name": "...", "value": "..."}] argument values
tasks[].external_idstringTask identifier (auto-assigned if omitted)
tasks[].tagsstring[]Tags for organization
tasks[].env_configobject | nullTool include/exclude lists
tasks[].agent_configobject | nullAgent behavior overrides
tasks[].metadataobject | nullArbitrary metadata

Response

{
  "tasks_added": 1
}

Job Endpoints

If you create runs with skip_job=false, you can query job-level data.

Get Job Telemetry

GET /telemetry/job/{job_id} returns job metrics, all task runs with rewards, timing, and metadata.
curl -H "Authorization: Bearer $HUD_API_KEY" \
  "https://api.hud.ai/telemetry/job/{job_id}"

List Trace IDs for a Job

GET /telemetry/job/{job_id}/trace-ids returns all trace IDs associated with a job.
curl -H "Authorization: Bearer $HUD_API_KEY" \
  "https://api.hud.ai/telemetry/job/{job_id}/trace-ids"

Callbacks

Instead of polling, you can configure a webhook callback to receive results when a trace completes.
{
  "env_name": "healthcare-admin-ui",
  "scenario_name": "healthcare-admin-ui:add-patient-condition",
  "scenario_args": {},
  "model": "claude-sonnet-4-20250514",
  "sync": false,
  "callback": {
    "url": "https://your-server.com/webhook",
    "include_reward": true,
    "include_response": true,
    "include_trajectory": false,
    "retry_count": 3,
    "metadata": {"my_custom_id": "abc-123"}
  }
}
When the trace completes, HUD sends a POST to your URL:
{
  "trace_id": "030083f2-...",
  "status": "completed",
  "completed_at": "2026-02-25T06:00:27Z",
  "reward": 0.6,
  "response": "I've added Essential hypertension...",
  "metadata": {"my_custom_id": "abc-123"}
}

Callback Fields

FieldTypeDefaultDescription
urlstringrequiredURL to POST results to
include_rewardbooleantrueInclude reward in payload
include_responsebooleantrueInclude agent’s final response
include_trajectorybooleanfalseInclude full trajectory (can be large)
retry_countinteger3Number of retries on failure (0-10)
headersobjectnullCustom headers to include
metadataobjectnullArbitrary metadata echoed back in the callback

Async Polling Pattern

A common pattern for programmatic use: launch with sync=false, then poll the lightweight status endpoint until completion, then fetch the full result.
import httpx
import asyncio

API_KEY = "sk-hud-..."
BASE = "https://api.hud.ai"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

async def run_and_poll():
    async with httpx.AsyncClient() as client:
        # 1. Launch the run
        resp = await client.post(f"{BASE}/agent/run", headers=HEADERS, json={
            "env_name": "healthcare-admin-ui",
            "scenario_name": "healthcare-admin-ui:add-patient-condition",
            "scenario_args": {"patient_id": "P-007"},
            "model": "claude-sonnet-4-20250514",
            "sync": False,
        })
        trace_id = resp.json()["trace_id"]
        print(f"Launched: {resp.json()['trace_url']}")

        # 2. Poll for completion
        while True:
            status_resp = await client.post(
                f"{BASE}/telemetry/traces/status",
                headers=HEADERS,
                json={"trace_ids": [trace_id]},
            )
            trace_status = status_resp.json()["traces"].get(trace_id, {})
            if trace_status.get("status") in ("completed", "error"):
                break
            await asyncio.sleep(5)

        # 3. Fetch the result
        result = await client.get(
            f"{BASE}/telemetry/traces/{trace_id}",
            headers=HEADERS,
            params={"include_trajectory": "true"},
        )
        data = result.json()
        print(f"Reward: {data['reward']}")
        print(f"Steps: {data.get('trajectory_length')}")