Why Environments, Not API Servers?
Your production API is a single live instance with shared state, so you can't run 500 tests against it in parallel without causing chaos. Environments spin up fresh for every evaluation: isolated, deterministic, reproducible. Run thousands in parallel, each starting from the exact state you define, each generating training data. An API server is a live system you observe. An environment is a sandbox you control.
Tools
Start with hud init to scaffold an environment. It works with existing codebases or from scratch.
Decorate a function with @env.tool() and agents can call it.
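To make the decorator idea concrete, here is a minimal sketch of how a tool registry of this shape could work. It does not use the hud SDK: ToyEnvironment, list_tools, call_tool, and the increment tool are all hypothetical names invented for illustration, and the real library may behave differently.

```python
import inspect
from typing import Any, Callable, Dict


class ToyEnvironment:
    """Hypothetical stand-in for an environment; NOT the hud SDK class."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._tools: Dict[str, Callable[..., Any]] = {}

    def tool(self) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        """Return a decorator that registers a function under its own name."""
        def register(fn: Callable[..., Any]) -> Callable[..., Any]:
            self._tools[fn.__name__] = fn
            return fn  # the function stays usable as ordinary Python
        return register

    def list_tools(self) -> Dict[str, str]:
        """Map each tool name to its signature, roughly what an agent would see."""
        return {name: str(inspect.signature(fn)) for name, fn in self._tools.items()}

    def call_tool(self, name: str, **arguments: Any) -> Any:
        """Dispatch a call by name, the way an agent's tool call would arrive."""
        return self._tools[name](**arguments)


env = ToyEnvironment("counter")
_count = {"value": 0}  # state owned by this one environment instance


@env.tool()
def increment(by: int = 1) -> int:
    """Add `by` to the counter and return the new value."""
    _count["value"] += by
    return _count["value"]
```

An agent-side call would then look like env.call_tool("increment", by=2). Because the state lives with the environment rather than in a shared server, each fresh instance starts from zero.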
Scripts
To evaluate an agent, you need two things: what to tell it, and how to score what it did. Scripts capture both with two yield statements, one for the prompt and one for the score.
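The two-yield pattern can be sketched with a plain Python generator. This is an illustration of the control flow only, not the hud API: count_letters and run are hypothetical names, and the sketch assumes the agent's answer is sent back into the generator between the two yields.

```python
from typing import Callable, Generator, Union


def count_letters(word: str, letter: str) -> Generator[Union[str, float], str, None]:
    """Hypothetical script: ask a question, then grade the reply."""
    # First yield: the prompt handed to the agent. The agent's answer comes
    # back into the generator as the value of this yield expression.
    answer = yield f"How many times does '{letter}' appear in '{word}'?"
    # Second yield: the score for that answer.
    expected = str(word.count(letter))
    yield 1.0 if answer.strip() == expected else 0.0


def run(script: Generator[Union[str, float], str, None],
        agent: Callable[[str], str]) -> float:
    """Drive the generator: fetch the prompt, send the answer, collect the score."""
    prompt = next(script)                 # runs up to the first yield
    reward = script.send(agent(prompt))   # resumes and runs up to the second yield
    script.close()
    return float(reward)
```

Keeping the prompt and the scoring logic in one function means the grader can reuse whatever setup the prompt depended on, here the word and letter arguments.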
Evals
Call the environment with a scenario name and arguments to create a task.
Mock Mode
Testing your agent loop without hitting real services? Mock mode returns fake responses based on tool schemas. Use env.mock() for local testing.
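How env.mock() builds its responses is not shown here. As a rough sketch of the general idea, assuming tools describe their outputs with JSON-Schema-style dictionaries, a placeholder response could be derived like this (fake_from_schema and the placeholder values are hypothetical, not part of hud):

```python
from typing import Any, Dict

# Placeholder value for each JSON Schema primitive type (illustrative choices).
_PLACEHOLDERS: Dict[str, Any] = {
    "string": "mock",
    "integer": 0,
    "number": 0.0,
    "boolean": False,
    "null": None,
}


def fake_from_schema(schema: Dict[str, Any]) -> Any:
    """Build a value whose shape matches the schema, without calling any service."""
    kind = schema.get("type")
    if kind == "object":
        # Recurse into each declared property.
        properties = schema.get("properties", {})
        return {key: fake_from_schema(sub) for key, sub in properties.items()}
    if kind == "array":
        # A single-element list is enough to exercise iteration in the agent loop.
        return [fake_from_schema(schema.get("items", {}))]
    # Primitive or unknown type: fall back to a fixed placeholder (None if unknown).
    return _PLACEHOLDERS.get(kind)
```

Responses like these carry no real data, so they are suited to checking that the agent loop parses and routes tool results correctly, not to measuring agent quality.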