A capability is a connection the environment exposes; a harness attaches its own tools to it. The same environment serves a one-shot Q&A or a full computer-use rollout, depending on which capabilities a harness opens.
| Protocol | Wire id | What it exposes | Spun up with |
| --- | --- | --- | --- |
| ssh | ssh/2 | Shell + files (bash, SFTP) in a sandboxed workspace | Workspace (built in) |
| mcp | mcp/2025-11-25 | Your own tools over the Model Context Protocol | fastmcp |
| cdp | cdp/1.3 | Browser control over the Chrome DevTools Protocol | Chromium (playwright) |
| rfb | rfb/3.8 | Full computer-use over VNC: screen + keyboard/mouse | Xvfb + x11vnc |
| robot | openpi/0 | Schema-driven robot observation/action loop over WebSocket (beta) | robot bridge |
from hud.capabilities import Capability

The Capability dataclass

A capability is (name, protocol, url, params) — concrete wire data carrying the real address of something serving the protocol.
| Field | Type | Description |
| --- | --- | --- |
| name | str | Capability name (e.g. "shell", "browser"). |
| protocol | str | Wire protocol id (e.g. "ssh/2"). |
| url | str | Connection URL. |
| params | dict | Protocol-specific connection params. |
Each protocol has a factory (Capability.ssh, .mcp, .cdp, .rfb, .robot) that normalizes the URL and fills defaults; cap.to_manifest() / Capability.from_manifest(data) round-trip it.
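To make the round-trip concrete, here is a minimal stand-in, assuming the manifest form is a plain dict of the four fields. CapabilitySketch is a hypothetical class written for illustration, not the hud.capabilities.Capability implementation. The ssh URL is also an assumed value, and the real manifest layout may differ.

```python
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass(frozen=True)
class CapabilitySketch:
    """Hypothetical stand-in for the (name, protocol, url, params) record."""

    name: str
    protocol: str
    url: str
    params: dict[str, Any] = field(default_factory=dict)

    def to_manifest(self) -> dict[str, Any]:
        # Assumed manifest shape: a plain dict holding the four fields.
        return asdict(self)

    @classmethod
    def from_manifest(cls, data: dict[str, Any]) -> "CapabilitySketch":
        # Rebuild the record from its dict form; params defaults to empty.
        return cls(
            name=data["name"],
            protocol=data["protocol"],
            url=data["url"],
            params=dict(data.get("params", {})),
        )


cap = CapabilitySketch(name="shell", protocol="ssh/2", url="ssh://127.0.0.1:2222")
restored = CapabilitySketch.from_manifest(cap.to_manifest())
assert restored == cap  # the round-trip preserves every field
```

The point of the sketch is only that a capability is plain data: anything that can carry a dict can carry a capability to the harness.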

Spinning up a capability

Every capability points at a daemon. For one that already exists, pass the factory to the constructor. For a daemon the environment runs itself, the pattern is always the same: start it in @env.initialize, block until it’s listening, publish its address with env.add_capability(...), and tear it down in @env.shutdown. The env doesn’t accept a client connection until every initialize hook returns, so waiting for the port closes the startup race. A small readiness helper the snippets below reuse:
import asyncio
import socket

async def _listening(host: str, port: int, timeout: float = 15.0) -> None:
    """Block until host:port accepts a connection — call before publishing."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while loop.time() < deadline:
        try:
            socket.create_connection((host, port), timeout=0.5).close()
            return
        except OSError:
            await asyncio.sleep(0.1)
    raise RuntimeError(f"nothing listening on {host}:{port}")
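The helper above probes with a blocking socket.create_connection call, which briefly stalls the event loop on each attempt. If a daemon runs as a task on the same loop, one possible alternative is a probe built on asyncio.open_connection. The sketch below assumes nothing about hud; wait_for_port and _demo are hypothetical names, and the throwaway TCP server only stands in for a real daemon.

```python
import asyncio


async def wait_for_port(host: str, port: int, timeout: float = 15.0) -> None:
    """Return once host:port accepts a TCP connection, else raise RuntimeError."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while loop.time() < deadline:
        try:
            # open_connection awaits the connect, so other tasks keep running.
            _, writer = await asyncio.wait_for(
                asyncio.open_connection(host, port), timeout=0.5
            )
        except (OSError, asyncio.TimeoutError):
            await asyncio.sleep(0.1)
            continue
        writer.close()
        await writer.wait_closed()
        return
    raise RuntimeError(f"nothing listening on {host}:{port}")


async def _demo() -> bool:
    # Stand-in daemon: a throwaway TCP server on an ephemeral loopback port.
    server = await asyncio.start_server(lambda r, w: w.close(), "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    try:
        await wait_for_port("127.0.0.1", port, timeout=2.0)
        return True
    finally:
        server.close()
        await server.wait_closed()


ready = asyncio.run(_demo())
```

Either variant serves the same purpose: the initialize hook does not return until the daemon is reachable.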
Bind every daemon to 127.0.0.1: a loopback capability is forwarded through the env’s one control port (see Bindings are always reachable), so nothing else needs publishing.

ssh — a sandboxed shell

The shell case is built in. A Workspace is a sandboxed directory the agent gets over ssh; env.workspace(root) starts it, publishes its ssh capability, and stops it with the env — one line, no hook:
env.py
from hud.environment import Environment

env = Environment(name="coder")
env.workspace("workspace")   # publishes "shell" (ssh/2) when the env serves
Use a relative path ("workspace", created next to env.py). Sandbox isolation (bwrap) is Linux-only: on other platforms the workspace runs unisolated, while inside a built image it is isolated.
To run a workspace yourself, drive its lifecycle and publish ws.capability() by hand:
env.py
from hud.environment import Environment, Workspace

env = Environment(name="coder")
ws = Workspace("workspace", host="127.0.0.1", port=0)   # port 0 → ephemeral

@env.initialize
async def _up():
    await ws.start()                          # binds, generates keys; idempotent
    env.add_capability(ws.capability("shell"))

@env.shutdown
async def _down():
    await ws.stop()

mcp — your own tools

Serve bespoke tools on a FastMCP server. The streamable-HTTP transport serves under /mcp, so that path is part of the published URL:
env.py
import asyncio

from fastmcp import FastMCP

from hud.capabilities import Capability
from hud.environment import Environment

server = FastMCP(name="tools")

@server.tool
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b

env = Environment(name="calc")
_task: asyncio.Task | None = None

@env.initialize
async def _up():
    global _task
    if _task is None:                          # idempotent
        _task = asyncio.create_task(
            server.run_async(transport="http", host="127.0.0.1", port=8040)
        )
        await _listening("127.0.0.1", 8040)
    env.add_capability(Capability.mcp(name="tools", url="http://127.0.0.1:8040/mcp"))

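@env.shutdown
async def _down():
    global _task
    if _task is not None:
        _task.cancel()
        # Await the cancelled task so the server releases its port before returning.
        await asyncio.gather(_task, return_exceptions=True)
        _task = None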
Capability.mcp accepts ws/wss/http/https URLs (no stdio) and an optional auth_token=.
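The scheme rule can be checked before publishing. The sketch below is a hypothetical pre-flight helper (check_mcp_url is not part of hud, which presumably performs its own validation); it only encodes the four schemes listed above.

```python
from urllib.parse import urlsplit

# Schemes the text lists as accepted by Capability.mcp.
MCP_SCHEMES = {"ws", "wss", "http", "https"}


def check_mcp_url(url: str) -> str:
    """Hypothetical check: reject anything that is not a network MCP URL."""
    parts = urlsplit(url)
    if parts.scheme not in MCP_SCHEMES:
        raise ValueError(f"unsupported MCP transport scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError(f"MCP URL has no host: {url!r}")
    return url


ok = check_mcp_url("http://127.0.0.1:8040/mcp")
```

A stdio-style launch command such as "python server.py" has no URL scheme, so a check like this rejects it with ValueError.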

cdp — a browser

Launch Chromium with a DevTools port. Playwright ships the binary (playwright install chromium); run it as a subprocess so the CDP endpoint is reachable at http://127.0.0.1:9222:
env.py
import asyncio
import tempfile

from playwright.async_api import async_playwright

from hud.capabilities import Capability
from hud.environment import Environment

env = Environment(name="browser")
_proc: asyncio.subprocess.Process | None = None

@env.initialize
async def _up():
    global _proc
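    if _proc is None:
        # Playwright is used only to locate its bundled Chromium binary;
        # the context manager stops the Playwright driver once the path is read.
        async with async_playwright() as pw:
            chromium_path = pw.chromium.executable_path
        _proc = await asyncio.create_subprocess_exec(
            chromium_path,
            "--headless=new",
            "--remote-debugging-port=9222",
            "--remote-debugging-address=127.0.0.1",
            "--no-first-run",
            "--user-data-dir=" + tempfile.mkdtemp(prefix="cdp_"),
        )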
        await _listening("127.0.0.1", 9222)
    env.add_capability(Capability.cdp(name="browser", url="http://127.0.0.1:9222"))

@env.shutdown
async def _down():
    global _proc
    if _proc is not None:
        _proc.terminate()
        await _proc.wait()
        _proc = None
Capability.cdp defaults to port 9222 and takes an optional target_id=. (Add --no-sandbox only when running as root in a container.)
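A quick way to confirm that the DevTools endpoint is actually serving is to query Chrome's /json/version HTTP endpoint, which reports the browser-level WebSocket debugger URL. The sketch below uses only the standard library; browser_ws_url is a hypothetical helper name, and it assumes Chromium is already running on the address published above.

```python
import json
from urllib.request import urlopen


def browser_ws_url(base_url: str = "http://127.0.0.1:9222", timeout: float = 5.0) -> str:
    """Return the browser-level WebSocket debugger URL from /json/version."""
    with urlopen(f"{base_url.rstrip('/')}/json/version", timeout=timeout) as resp:
        info = json.load(resp)
    # Chrome reports the WebSocket endpoint under this key.
    return info["webSocketDebuggerUrl"]
```

Calling browser_ws_url() after the initialize hook has run should return a ws:// URL on the same port; a connection error means the browser never came up.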

rfb — a virtual screen

Full computer-use is a VNC server over a virtual display. On Linux, Xvfb paints the framebuffer and x11vnc serves it (apt install xvfb x11vnc):
env.py
import asyncio

from hud.capabilities import Capability
from hud.environment import Environment

env = Environment(name="desktop")
_procs: tuple | None = None

@env.initialize
async def _up():
    global _procs
    if _procs is None:
        xvfb = await asyncio.create_subprocess_exec(
            "Xvfb", ":0", "-screen", "0", "1280x1024x24",
        )
        await asyncio.sleep(0.5)               # let the X server come up first
        vnc = await asyncio.create_subprocess_exec(
            "x11vnc", "-display", ":0", "-rfbport", "5900",
            "-localhost", "-forever", "-nopw",
        )
        await _listening("127.0.0.1", 5900)
        _procs = (xvfb, vnc)
    env.add_capability(Capability.rfb(name="screen", url="rfb://127.0.0.1", display=0))

@env.shutdown
async def _down():
    global _procs
    if _procs:
        for p in reversed(_procs):
            p.terminate()
            await p.wait()
        _procs = None
Capability.rfb resolves the port as 5900 + display and takes an optional password=. Host multiple screens by publishing one rfb capability per display.
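The port rule and a basic liveness check can be sketched with the standard library. An RFB server opens every connection by sending a 12-byte ProtocolVersion message (for rfb/3.8, "RFB 003.008" plus a newline), so reading it confirms that a VNC server, and not some other daemon, owns the port. rfb_port and read_rfb_banner are hypothetical helper names, not part of hud.

```python
import socket


def rfb_port(display: int) -> int:
    """VNC convention: display N is served on TCP port 5900 + N."""
    return 5900 + display


def read_rfb_banner(host: str, port: int, timeout: float = 2.0) -> str:
    """Read the 12-byte ProtocolVersion message the server sends first."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        data = b""
        while len(data) < 12:
            chunk = sock.recv(12 - len(data))
            if not chunk:
                raise ConnectionError("server closed before sending its version")
            data += chunk
    return data.decode("ascii").strip()
```

With the setup above, read_rfb_banner("127.0.0.1", rfb_port(0)) would be the call to try once x11vnc is up.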

Capability.robot

Capability.robot(*, name="robot", url, contract)
The openpi/0 control loop (beta). This is an openpi-like protocol: it reuses openpi’s wire format (msgpack with transparent, recursive numpy serialization) and its flat observation/action naming schema (observation/... keys, actions), so an openpi policy server and a HUD env speak the same bytes.
It differs fundamentally in role assignment. In openpi, a policy server answers inference requests; here the environment is the server (it owns the world and pushes observations) and the agent is the client (it acts in the world, replying with actions).
contract is the environment’s full self-describing schema (robot_type, control_rate, and every observation/action feature), carried in the manifest params so the agent wires itself with no shared config. The serving bridge binds an ephemeral loopback port, so publish this from an @env.initialize hook after await bridge.start():
@env.initialize
async def _up():
    await bridge.start()
    env.add_capability(Capability.robot(name="robot", url=bridge.url, contract=CONTRACT))
See Robots for the bridge, the harness, and the contract spec.

Workspace

Workspace is the standard shell daemon: a directory plus a bwrap-isolated SSH server (bash + chroot’d SFTP). Attach one with env.workspace(root, ...) and the environment brings it up (keys, socket, accept loop) when it serves, tearing it down on env.stop(). Extra kwargs configure the workspace — mounts, network, env vars, guest path, fixed ports, your own keys:
from hud.environment import Environment, Mount

env = Environment(name="coder")
env.workspace(
    "/workspace",
    network=True,
    mounts=[Mount("ro", src="/data", dst="/data")],
)
To run one yourself (outside an env), drive the lifecycle directly and publish ws.capability() as a concrete ssh capability:
| Member | Description |
| --- | --- |
| Workspace(root, *, host="127.0.0.1", port=0, mounts=(), network=False, env=None, user="agent", ...) | Construct. port=0 binds an ephemeral port. |
| await ws.start() | Start the SSH accept loop (idempotent). |
| ws.capability(name="shell") | The resolved ssh Capability (materializes keys, binds the socket). |
| await ws.stop() | Stop accepting sessions and release the socket. |
| ws.ssh_url / ws.ssh_host_pubkey | Connection address and host key. |
| ws.bwrap_available | Whether bwrap isolation is active. |
Pass mounts=[Mount("ro", src=..., dst=...)] (Mount is imported from hud.environment) and network=True to configure the sandbox.

Bindings are always reachable

Every address in the manifest is dialable from where the client runs. A loopback daemon (a workspace, a browser in the same container) is transparently forwarded through the env’s control port, so a container only ever publishes one port — bind your daemons to 127.0.0.1 and don’t worry about the rest.
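To see which published URLs count as loopback under this rule, a small standard-library check is enough. is_loopback_url is a hypothetical helper for illustration; how the env itself classifies addresses is not shown in this page.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_url(url: str) -> bool:
    """True when the URL's host is localhost or a loopback IP address."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # a DNS name other than localhost


checks = {
    url: is_loopback_url(url)
    for url in ("http://127.0.0.1:9222", "http://[::1]:9222", "http://10.0.0.5:9222")
}
```

Under this sketch the two loopback addresses are the ones a client would reach through the forwarded control port, while 10.0.0.5 would have to be dialable directly.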

Harness clients

A harness opens a capability to get a live client. The capability clients live in hud.capabilities:
| Client | Protocol |
| --- | --- |
| SSHClient | ssh/2 (raw asyncssh connection via .conn) |
| MCPClient | mcp/2025-11-25 |
| CDPClient | cdp/1.3 |
| RFBClient | rfb/3.8 |
| RobotClient | openpi/0; joins the registry on first open (requires the robot extra: numpy/openpi-client) |
The bundled provider agents open these automatically based on which capabilities the manifest advertises (see Agents). To write your own harness, attach to the capability you need and define your tool spec.
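The selection step a harness performs can be pictured as a lookup from wire id to client. The sketch below copies the wire ids and client names from the tables in this page, but the manifest entry shape (a list of dicts with the Capability fields) and the plan_clients helper are assumptions made for illustration. It returns client names as strings and does not import hud.

```python
# Wire ids and client names as listed in this page's tables.
CLIENT_FOR_PROTOCOL = {
    "ssh/2": "SSHClient",
    "mcp/2025-11-25": "MCPClient",
    "cdp/1.3": "CDPClient",
    "rfb/3.8": "RFBClient",
    "openpi/0": "RobotClient",
}


def plan_clients(capabilities: list[dict]) -> dict[str, str]:
    """Map each advertised capability name to the client that would open it."""
    plan = {}
    for cap in capabilities:
        client = CLIENT_FOR_PROTOCOL.get(cap["protocol"])
        if client is not None:  # skip protocols this harness cannot open
            plan[cap["name"]] = client
    return plan


# Hypothetical manifest entries, shaped like the Capability fields.
advertised = [
    {"name": "shell", "protocol": "ssh/2", "url": "ssh://127.0.0.1:2222", "params": {}},
    {"name": "browser", "protocol": "cdp/1.3", "url": "http://127.0.0.1:9222", "params": {}},
]
plan = plan_clients(advertised)
```

A custom harness would follow the same outline, opening only the capabilities it has tools for and ignoring the rest.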

See also

Environments

Environment reference

Agents

Tasks & Tasksets