env.py, platform rows, Harbor
task dirs — is a frontend that loads into the same primitives (Environment,
Task, Taskset). Integrations are loaders, not converters: no codegen
roundtrip to run foreign tasks. The Harbor integration lives in the SDK repo at
integrations/harbor.py. It is a recipe built only on the public SDK surface;
copy it into your project or run it from a checkout.
Prerequisites
- A Harbor task directory: each task has task.toml + instruction.md, and
  usually an environment/ (with a Dockerfile) and tests/.
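Before loading, it can help to confirm that a directory has the layout described above. The following is a minimal sketch using only the Python standard library; `check_harbor_task_dir` is a hypothetical helper written for this page, not part of the SDK or of Harbor, and it only checks that the files exist, not that their contents are valid.

```python
from pathlib import Path

# Files every Harbor task directory is expected to contain (per the list above).
REQUIRED = ("task.toml", "instruction.md")
# Entries that are usual but not mandatory.
OPTIONAL = ("environment/Dockerfile", "tests")


def check_harbor_task_dir(task_dir):
    """Return (missing_required, missing_optional) for one task directory.

    Hypothetical helper: it inspects the filesystem only and does not
    parse task.toml or build anything.
    """
    root = Path(task_dir)
    missing_required = [name for name in REQUIRED if not (root / name).is_file()]
    missing_optional = [name for name in OPTIONAL if not (root / name).exists()]
    return missing_required, missing_optional
```

An empty first list means the directory has the two required files; entries in the second list are informational only.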
Load Harbor tasks
load(path) parses a Harbor task dir (or a dataset of them) into a Taskset
directly — one row per task dir (id = the dir name), sharing one declarative
Environment per distinct environment/ build context:
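To illustrate what "one Environment per distinct build context" could mean in practice, here is a standalone sketch that groups task directories by a hash of their environment/ contents. This is not the code in integrations/harbor.py: the function names are hypothetical, and the choice to compare build contexts by content hash is an assumption made for illustration. The real loader may decide distinctness differently.

```python
import hashlib
from pathlib import Path


def build_context_digest(env_dir):
    """Hash relative file paths and file contents under env_dir, in sorted order.

    Assumption for this sketch: two build contexts are "the same" when
    they contain the same files with the same bytes.
    """
    digest = hashlib.sha256()
    root = Path(env_dir)
    if root.is_dir():
        for path in sorted(p for p in root.rglob("*") if p.is_file()):
            digest.update(path.relative_to(root).as_posix().encode())
            digest.update(b"\0")
            digest.update(path.read_bytes())
            digest.update(b"\0")
    return digest.hexdigest()


def group_tasks_by_environment(dataset_dir):
    """Map build-context digest -> list of task ids (the directory names)."""
    groups = {}
    for task_dir in sorted(Path(dataset_dir).iterdir()):
        # Only directories that contain a task.toml count as tasks.
        if not (task_dir / "task.toml").is_file():
            continue
        key = build_context_digest(task_dir / "environment")
        groups.setdefault(key, []).append(task_dir.name)
    return groups
```

Each key in the returned mapping corresponds to one shared environment, and each value lists the task ids (directory names) that would share it.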
runtime=Runtime(url)); a docker provider that builds and runs each task’s
environment/ image is the planned follow-up:
Export HUD tasks to Harbor
export(source, out_dir) goes the other way: it turns a HUD task source (a
.py file/dir exposing Tasks, or a .json/.jsonl taskset next to its
env.py) into self-contained Harbor task folders:
| HUD | Harbor |
|---|---|
| serving (python -m hud.environment.server) + task start | the baked image ENTRYPOINT serves the control channel and parks the run |
| the agent works, writes answer.txt | the agent works in the container |
| task evaluate (grade) | tests/test.sh grades the parked run, writes reward.txt |
ssh/mcp are exportable (Harbor is
shell-centric; rfb/cdp don’t map). The exported task grades over the HUD
control channel, so it needs Harbor’s default same-container verifier — don’t
set [verifier.environment] in task.toml.