task.run /
taskset.run at execution time, and the same task and the same env.py run anywhere - only the
runtime changes.
Built-in runtimes
| Runtime | Where the env runs | When to reach for it |
|---|---|---|
| LocalRuntime("env.py") | A child process from your source | Fastest iteration; local development |
| DockerRuntime("my-env") | A fresh local container per rollout | Reproducibility and parity with production |
| ModalRuntime("my-env") | A fresh Modal sandbox per rollout | Cloud scale, no infra to manage |
| DaytonaRuntime("my-env") | A fresh Daytona sandbox per rollout | Cloud scale on Daytona |
| Runtime("tcp://host:8765") | A substrate you already started | Attaching to a long-lived container or sandbox you own |
| HUDRuntime() | A HUD-hosted env, leased by name and tunneled | Local agent loop against a deployed env |
| HostedRuntime() | The whole rollout on a HUD-leased box | Agent and env run together off your machine |
Most runtimes import from hud (from hud import LocalRuntime, DockerRuntime, HUDRuntime, HostedRuntime, Runtime); ModalRuntime and DaytonaRuntime import from hud.eval.
Omit runtime= and it’s inferred from each task’s _source, the file its template was defined
in. When every task shares one _source, that source is served locally as LocalRuntime(source);
otherwise (mixed sources, or rows loaded from a file or the platform with no source) it falls back to
HUDRuntime(). Pass a runtime explicitly the moment you want something else.
RuntimeConfig
RuntimeConfig carries the construction hints a container-based runtime needs: which image, how much
hardware, and what timeouts. Set it on the runtime (runtime_config=) or per row on
Task.runtime_config; the runtime merges the two and applies what it
supports.
| Field | Description |
|---|---|
| image | Image to run. |
| resources | RuntimeResources(cpu, memory_mb, gpu=RuntimeGPU(type, count)). |
| limits | RuntimeLimits(startup_timeout_s, run_timeout_s). |
DockerRuntime, ModalRuntime, and DaytonaRuntime accept it (Docker
ignores limits; Daytona ignores run_timeout_s and resource overrides when booting from a snapshot).
LocalRuntime and HUDRuntime reject a per-task runtime_config.
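The support rules above can be written out as a small filter. This is only a sketch of the documented behavior, not the SDK's code: the dataclasses are local stand-ins that mirror the field names in the table, `applied_config` and its `from_snapshot` flag are hypothetical, and it assumes one reading of the Daytona caveat (run_timeout_s is always ignored, resource overrides only when booting from a snapshot).

```python
from dataclasses import dataclass, replace
from typing import Optional


# Local stand-ins mirroring the documented field names; NOT the SDK's classes.
@dataclass(frozen=True)
class RuntimeLimits:
    startup_timeout_s: Optional[float] = None
    run_timeout_s: Optional[float] = None


@dataclass(frozen=True)
class RuntimeConfig:
    image: Optional[str] = None
    resources: Optional[dict] = None  # stand-in for RuntimeResources
    limits: Optional[RuntimeLimits] = None


def applied_config(runtime: str, config: RuntimeConfig,
                   from_snapshot: bool = True) -> RuntimeConfig:
    """Return the part of a per-task config the named runtime would honor."""
    if runtime in ("LocalRuntime", "HUDRuntime"):
        # These two reject a per-task runtime_config outright.
        raise ValueError(f"{runtime} rejects a per-task runtime_config")
    if runtime == "DockerRuntime":
        # Docker ignores limits.
        return replace(config, limits=None)
    if runtime == "DaytonaRuntime":
        # Assumed reading: run_timeout_s is always dropped; resources are
        # dropped only when the sandbox boots from a snapshot.
        limits = config.limits and replace(config.limits, run_timeout_s=None)
        resources = None if from_snapshot else config.resources
        return replace(config, limits=limits, resources=resources)
    if runtime == "ModalRuntime":
        # No exceptions are listed for Modal, so the config passes through.
        return config
    raise ValueError(f"unknown runtime: {runtime}")


if __name__ == "__main__":
    cfg = RuntimeConfig(image="my-env", resources={"cpu": 4},
                        limits=RuntimeLimits(startup_timeout_s=60, run_timeout_s=600))
    print(applied_config("DockerRuntime", cfg))
    print(applied_config("DaytonaRuntime", cfg))
```

Keeping the per-runtime exceptions in one place like this makes it easier to see why a field set on a task row may have no effect on a given substrate.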
Runtime directory
The constructor for each built-in runtime:
LocalRuntime
- path - .py file (or directory) that declares the env. The child’s working directory is the source’s directory, so sibling imports and relative data paths resolve.
- env - pin a specific env name when the source declares more than one. Defaults to the placed task’s env.
- ready_timeout - seconds to wait for the child to start serving.
DockerRuntime
- image - image name to run; shorthand for runtime_config.image.
- port - port the image’s CMD serves inside the container (the scaffolded Dockerfile.hud serves 8765).
- run_args - extra docker run flags, e.g. ["--gpus", "all"] or ["-e", "KEY=VAL"].
- runtime_config - a RuntimeConfig (image, resources) for finer control.
ModalRuntime
- image_name - published Modal image name (the preferred durable handle), e.g. ModalRuntime("hud-libero-env").
- image - an Image to build lazily on first use, as an escape hatch.
- command - override the serving command (defaults to the scaffolded hud serve entrypoint).
- workdir - working directory inside the sandbox. Left unset, Modal keeps the image’s WORKDIR.
- app_name / port / env_vars - Modal app name, in-sandbox serving port, and extra environment variables.
Requires the modal extra and a configured token.
DaytonaRuntime
- snapshot_name - Daytona snapshot to boot from (the durable handle).
- image - Dockerfile/registry ref to build the snapshot once if it’s missing. Resources (cpu/memory/gpu) live on the snapshot.
- workdir / port - guest working directory and in-sandbox serving port.
- ssh_host / ssh_expires_minutes - SSH tunnel settings (Daytona exposes services over an SSH local-forward).
HUDRuntime
- run_timeout - bound on one rollout end to end, including instance startup.
- runtime_url - override the runtime endpoint the tunnel connects to.
HostedRuntime
- poll_interval - seconds between trace-status polls while the rollout runs remotely.
- run_timeout - bound on one rollout end to end, including instance provisioning and queueing.
Where HUDRuntime runs the agent loop locally against a tunneled env, HostedRuntime runs the
whole rollout off-box: the platform leases an instance, brings the env’s container up on it, and
runs the agent right next to it. The local process only submits the rollout and polls its trace to
completion. HostedRuntime requires a gateway agent that can serialize its identity (Claude/OpenAI/Gemini).
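The submit-then-poll pattern behind poll_interval and run_timeout can be sketched generically. This is not the SDK's implementation: `poll_until_done`, the injected `fetch_status` callable, and the terminal status names are all hypothetical, chosen only to show how the two settings interact.

```python
import time
from typing import Callable

# Hypothetical terminal status names; the platform's real values may differ.
TERMINAL_STATUSES = {"completed", "error"}


def poll_until_done(fetch_status: Callable[[], str],
                    poll_interval: float = 2.0,
                    run_timeout: float = 600.0,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll a remote rollout's status until it is terminal or the deadline passes."""
    deadline = time.monotonic() + run_timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        # Give up if the next poll would land after the end-to-end deadline.
        if time.monotonic() + poll_interval > deadline:
            raise TimeoutError(f"rollout still '{status}' after {run_timeout}s")
        sleep(poll_interval)


if __name__ == "__main__":
    # Simulated trace: queued, then running, then done.
    statuses = iter(["queued", "running", "completed"])
    print(poll_until_done(lambda: next(statuses), poll_interval=0.01, run_timeout=5))
```

Because run_timeout bounds the rollout end to end, time spent queued or provisioning counts against it just like time spent running.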
Runtime
- url - control-channel address of an already-running substrate (e.g. tcp://host:8765).
- params - connection-time data a transport may need (auth token, sandbox id).
Run on your own infra
A runtime is just a function: given a task, start a container somewhere and yield its control-channel URL. That one function is the whole integration surface for any provider - Modal, E2B, Runloop, your own Kubernetes.
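As an illustration of that shape, the sketch below writes a runtime as an async context manager that starts a sandbox, yields its URL, and tears it down afterwards. Everything in it is a stand-in: `Task`, `FakeProvider`, `my_runtime`, and the URL are hypothetical, and it assumes the function may be an async context manager. The exact signature the SDK expects, and whether it wants a bare URL or a Runtime(url) object, should be checked against the SDK itself.

```python
import asyncio
import itertools
from contextlib import asynccontextmanager
from dataclasses import dataclass


@dataclass
class Task:
    """Stand-in for the task row being placed (not the SDK's class)."""
    image: str


class FakeProvider:
    """Hypothetical provider client; a real one would call Modal, E2B, Kubernetes, etc."""

    def __init__(self) -> None:
        self.running = set()
        self._ids = itertools.count(1)

    async def start(self, image: str) -> tuple:
        sandbox_id = f"sbx-{next(self._ids)}"
        self.running.add(sandbox_id)
        # Made-up address; a real provider returns wherever the env is serving.
        return sandbox_id, f"tcp://{sandbox_id}.example.internal:8765"

    async def stop(self, sandbox_id: str) -> None:
        self.running.discard(sandbox_id)


provider = FakeProvider()


@asynccontextmanager
async def my_runtime(task: Task):
    """Start a sandbox for one rollout, yield its control-channel URL, then clean up."""
    sandbox_id, url = await provider.start(image=task.image)
    try:
        yield url
    finally:
        # Teardown runs even if the rollout raises.
        await provider.stop(sandbox_id)


async def main() -> None:
    async with my_runtime(Task(image="my-env")) as url:
        print("env reachable at", url)


if __name__ == "__main__":
    asyncio.run(main())
```

The try/finally is the part worth copying: the function that provisions the substrate is also the one that owns its teardown.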
DockerRuntime and the rest are just built-in versions of this. Anything that starts your image and
hands back a URL plugs in with no change to the environment or the task - that’s what “run anywhere”
means concretely. Constructed directly, Runtime(url) yields itself with a no-op lifecycle, since
whoever provisioned the substrate owns teardown.
Placement can also vary per task: a runtime is called once per rollout with the task row being placed,
so one callable can route heavier rows to heavier substrates.
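Per-task routing can be sketched as a plain decision function over the row being placed. The `Task` and `Resources` classes, the thresholds, and the returned substrate names are all hypothetical; in real code the callable would hand off to the chosen runtime instead of returning a label.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resources:
    """Stand-in for a row's resource request (not the SDK's RuntimeResources)."""
    cpu: int = 1
    gpu_count: int = 0


@dataclass
class Task:
    """Stand-in for the task row passed to the runtime once per rollout."""
    id: str
    resources: Optional[Resources] = None


def pick_substrate(task: Task) -> str:
    """Route heavier rows to heavier substrates; thresholds are illustrative."""
    res = task.resources or Resources()
    if res.gpu_count > 0:
        return "modal"    # GPU rows go to a cloud sandbox
    if res.cpu > 4:
        return "daytona"  # CPU-heavy rows also leave the local machine
    return "docker"       # everything else stays in a local container


if __name__ == "__main__":
    rows = [
        Task("light"),
        Task("cpu-heavy", Resources(cpu=8)),
        Task("gpu", Resources(cpu=2, gpu_count=1)),
    ]
    for row in rows:
        print(row.id, "->", pick_substrate(row))
```

Since the decision is made per rollout, a single taskset can mix light and heavy rows without being split by substrate first.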