1. Install
2. Set your API key
Get a key from hud.ai/project/api-keys - one key both routes models through the HUD gateway and traces every rollout.
3. Write a task
Scaffold a complete, runnable example to start from:
tasks.py:
4. Run it
hud eval spawns the environment locally, runs the claude agent, and grades the rollout. Every rollout generates a replayable trace on hud.ai.
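Before running, it can help to confirm the two prerequisites from the earlier steps. The following is a minimal sketch, not part of HUD itself: it assumes the API key is exported in an environment variable named HUD_API_KEY (the variable name is an assumption, it is not stated on this page) and that the CLI executable is called hud, as in hud eval above. The preflight helper is hypothetical.

```python
import os
import shutil

# Assumption: the API key is read from an environment variable with this name.
API_KEY_VAR = "HUD_API_KEY"


def preflight(env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    # A blank or whitespace-only value is treated the same as a missing key.
    if not env.get(API_KEY_VAR, "").strip():
        problems.append(f"{API_KEY_VAR} is not set")
    # shutil.which returns None when the executable is not on PATH.
    if which("hud") is None:
        problems.append("`hud` executable not found on PATH")
    return problems
```

Calling preflight() with no arguments checks the current process environment. An empty list means both prerequisites look satisfied, and any entries name what is missing.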
Next
Package & deploy
Build a portable image and run it anywhere.
Add capabilities
Give the agent a shell, browser, GUI, or robot to act on.
Design tasks for signal
Make tasks that actually train, not just test.
Run on any model
Claude, OpenAI, Gemini, or your own endpoint.