HUD provides native support for OpenAI’s coding tools (shell and apply_patch), enabling you to build powerful coding agents that can create, modify, and execute code.

Example Code

Follow along with the full working example on GitHub.

Overview

OpenAI’s Responses API includes specialized tools for coding tasks:
| Tool | Purpose | HUD Implementation |
| --- | --- | --- |
| shell | Execute shell commands in a persistent bash session | hud.tools.shell.ShellTool |
| apply_patch | Create, update, and delete files using the V4A diff format | hud.tools.apply_patch.ApplyPatchTool |
When you register tools named shell or apply_patch in your environment, the OpenAIAgent automatically converts them to OpenAI’s native tool types for optimal performance.

Two Modes

HUD supports two execution modes for coding agents:
| Mode | Tools Run On | Inference Via | API Keys Required |
| --- | --- | --- | --- |
| Local (--local) | Your machine | OpenAI directly | OPENAI_API_KEY |
| Hub (default) | HUD Cloud | HUD Gateway | HUD_API_KEY |
Both modes support traces on hud.ai if HUD_API_KEY is set.

Quick Start

Local Mode (No Docker)

Run coding agents directly on your machine without any infrastructure:
import hud
from hud.agents.openai import OpenAIAgent
from hud.tools.shell import ShellTool
from hud.tools.apply_patch import ApplyPatchTool

# Create environment with coding tools
env = hud.Environment("coding")
shell_tool = ShellTool()
apply_patch_tool = ApplyPatchTool(base_path="/path/to/workspace")

@env.tool()
async def shell(commands: list[str], timeout_ms: int | None = None):
    """Execute shell commands."""
    result = await shell_tool(commands=commands, timeout_ms=timeout_ms)
    return result.to_dict()

@env.tool()
async def apply_patch(type: str, path: str, diff: str | None = None):
    """Apply file patches."""
    result = await apply_patch_tool(type=type, path=path, diff=diff)
    return result.to_dict()

# Run with OpenAI agent (calls OpenAI directly)
agent = OpenAIAgent.create(model="gpt-5.1")

async with hud.eval(env(), name="coding-task") as ctx:
    result = await agent.run(ctx, max_steps=20)

Hub Mode (Cloud Execution)

Prerequisites: Before using hub mode, you must create the codex_environment_sandbox environment in hud.ai. Go to hud.ai → New → Environment → Import from hud-evals/codex_environment_sandbox. Once deployed, your environment will be accessible via connect_hub().
Connect to HUD Hub for full cloud execution and telemetry:
import hud
from hud.agents.openai import OpenAIAgent
from hud.settings import settings
from openai import AsyncOpenAI

# Connect to HUD Hub environment
env = hud.Environment()
env.connect_hub("codex_environment_sandbox")

# Use HUD Gateway for inference (full telemetry)
model_client = AsyncOpenAI(
    base_url=settings.hud_gateway_url,
    api_key=settings.api_key,
)
agent = OpenAIAgent.create(
    model="gpt-5.1",
    model_client=model_client,
    validate_api_key=False,
)

async with hud.eval(env(), name="coding-task") as ctx:
    result = await agent.run(ctx, max_steps=20)
The first request may take a few seconds while the environment spins up in the cloud. Subsequent requests will be faster.

Tool Specifications

Shell Tool

The ShellTool provides a persistent bash session for executing commands. Features:
  • Auto-restart on error (session automatically restarts if needed)
  • Dynamic timeout via timeout_ms parameter
  • Persistent environment (exported variables, working directory)
  • Concurrent command execution support
Input Schema:
{
    "commands": ["ls -la", "cat file.py"],  # List of commands
    "timeout_ms": 30000,                     # Optional timeout per command
    "max_output_length": 10000               # Optional output limit
}
Output Format:
{
    "output": [
        {
            "stdout": "file1.py\nfile2.py",
            "stderr": "",
            "outcome": {"type": "exit", "exit_code": 0}
        }
    ]
}
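Since each entry carries its own outcome, a caller can check exit codes per command. A minimal sketch of that pattern, assuming only the output format shown above (the helper itself is not part of the HUD SDK):

```python
def check_shell_result(result: dict) -> list[str]:
    """Return each command's stdout, raising if any command exited non-zero."""
    stdouts = []
    for entry in result["output"]:
        outcome = entry["outcome"]
        if outcome["type"] == "exit" and outcome["exit_code"] != 0:
            raise RuntimeError(f"command failed: {entry['stderr']}")
        stdouts.append(entry["stdout"])
    return stdouts

# Example using the output shape documented above
result = {
    "output": [
        {
            "stdout": "file1.py\nfile2.py",
            "stderr": "",
            "outcome": {"type": "exit", "exit_code": 0},
        }
    ]
}
print(check_shell_result(result))  # ['file1.py\nfile2.py']
```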

Apply Patch Tool

The ApplyPatchTool creates, updates, and deletes files using OpenAI’s V4A diff format. Operations:
| Operation | Description | Diff Required |
| --- | --- | --- |
| create_file | Create a new file | Yes |
| update_file | Modify an existing file | Yes |
| delete_file | Remove a file | No |
Input Schema:
{
    "type": "update_file",
    "path": "src/main.py",
    "diff": "..."  # V4A diff content
}
V4A Diff Format Example:
@@ def hello():
-    print("Hello")
+    print("Hello, World!")
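Read literally, this hunk says: at the context line `def hello():`, replace the `print("Hello")` line with `print("Hello, World!")`. Applied by hand for illustration (the real parsing lives in hud.tools.apply_patch):

```python
# Apply the single-hunk V4A example above manually (illustrative only)
source = 'def hello():\n    print("Hello")\n'
patched = source.replace('    print("Hello")\n', '    print("Hello, World!")\n')
print(patched)
```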
Output Format:
{
    "status": "completed",  # or "failed"
    "output": "Updated src/main.py"
}

Agent Integration

The OpenAIAgent automatically detects shell and apply_patch tools and converts them to OpenAI’s native types:
# In hud/agents/openai.py
def _to_openai_tool(self, tool):
    if tool.name == "shell":
        return FunctionShellToolParam(type="shell")
    if tool.name == "apply_patch":
        return ApplyPatchToolParam(type="apply_patch")
    # ... regular function tools
This means:
  1. The model sees native shell and apply_patch tools
  2. Responses include shell_call and apply_patch_call output types
  3. The agent routes these back to your registered tools

Complete Example

Here’s the full local mode example with a working directory:
import asyncio
import os
import tempfile

from dotenv import load_dotenv
from openai import AsyncOpenAI

load_dotenv()  # Load .env file

import hud
from hud.agents.openai import OpenAIAgent
from hud.tools.shell import ShellTool
from hud.tools.apply_patch import ApplyPatchTool


async def main():
    # Set up working directory
    work_dir = "./codex_output"
    os.makedirs(work_dir, exist_ok=True)
    base_path = os.path.abspath(work_dir)

    # Initialize tools
    shell_tool = ShellTool()
    apply_patch_tool = ApplyPatchTool(base_path=base_path)

    # Create environment with local tools
    env = hud.Environment("local-codex")

    @env.tool()
    async def shell(
        commands: list[str],
        timeout_ms: int | None = None,
        max_output_length: int | None = None,
    ) -> dict:
        """Execute shell commands in a bash session."""
        import shlex
        # Change to working directory before executing
        safe_path = shlex.quote(base_path)
        prefixed_commands = [f"cd {safe_path} && {cmd}" for cmd in commands]
        result = await shell_tool(
            commands=prefixed_commands,
            timeout_ms=timeout_ms,
            max_output_length=max_output_length,
        )
        return result.to_dict()

    @env.tool()
    async def apply_patch(
        type: str,
        path: str,
        diff: str | None = None,
    ) -> dict:
        """Apply file operations using V4A diff format."""
        result = await apply_patch_tool(type=type, path=path, diff=diff)
        return result.to_dict()

    # Define scenario
    @env.scenario("coding_task")
    async def coding_task_scenario(task_description: str):
        yield f"""You are a skilled software developer. Complete the following task:

{task_description}

Use the available tools:
- `shell` to run commands (ls, cat, python, etc.)
- `apply_patch` to create or modify files

Work in the current directory. When done, verify your work runs correctly."""

        yield 1.0

    # Create agent
    agent = OpenAIAgent.create(model="gpt-5.1", verbose=True)

    # Run the task
    task = "Create a Python script called main.py that prints Hello World"
    eval_task = env("coding_task", task_description=task)

    async with hud.eval(eval_task, name="codex-coding-local") as ctx:
        await agent.run(ctx, max_steps=20)

    print(f"Reward: {ctx.reward}")
    print(f"Files created in: {base_path}")

    # Show created files
    for f in os.listdir(base_path):
        print(f"  - {f}")


asyncio.run(main())

CLI Usage

Setting Up API Keys

Create a .env file in your project root:
# For local mode (calls OpenAI directly)
OPENAI_API_KEY=sk-...

# For hub mode OR traces (recommended)
HUD_API_KEY=sk-hud-...
Get your keys: OPENAI_API_KEY from the OpenAI platform dashboard, and HUD_API_KEY from hud.ai.
If you have both keys set, you get local execution with cloud traces: the best of both worlds.

Running the Example

# Local mode - tools run on your machine
uv run python examples/06_codex_coding_agent.py --local

# Local mode with persistent output directory
uv run python examples/06_codex_coding_agent.py --local --work-dir ./codex_output

# Hub mode - full cloud execution (default)
uv run python examples/06_codex_coding_agent.py

# Custom task
uv run python examples/06_codex_coding_agent.py --local \
  --task "Create a Python script that prints the Fibonacci sequence up to 10 numbers"

# Verbose output
uv run python examples/06_codex_coding_agent.py --local --verbose

CLI Options

| Flag | Default | Description |
| --- | --- | --- |
| --local | Off | Run locally (tools on your machine, OpenAI direct) |
| --task | Hello World script | The coding task to complete |
| --model | gpt-5.1 | Codex-capable model (gpt-5.1, gpt-5.1-codex) |
| --work-dir | Temp directory | Working directory (local mode only) |
| --max-steps | 20 | Maximum agent steps |
| --verbose | Off | Enable verbose output |

Security Considerations

The shell and apply_patch tools can execute arbitrary commands and modify files. Use them in sandboxed environments for untrusted tasks.
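For untrusted tasks outside a sandbox, one layer of defense is to gate commands before they reach the shell tool. A hypothetical, deliberately coarse allowlist guard (not part of the HUD SDK, and no substitute for real sandboxing):

```python
import shlex

ALLOWED = {"ls", "cat", "echo", "python", "pytest"}
UNSAFE_CHARS = set(";|&$`<>(){}")  # reject shell operators and substitution outright

def filter_commands(commands: list[str]) -> list[str]:
    """Reject commands containing shell metacharacters or a disallowed program."""
    for cmd in commands:
        if any(ch in UNSAFE_CHARS for ch in cmd):
            raise PermissionError(f"shell metacharacters rejected: {cmd!r}")
        tokens = shlex.split(cmd)
        if not tokens or tokens[0] not in ALLOWED:
            raise PermissionError(f"command not allowed: {cmd!r}")
    return commands

print(filter_commands(["ls -la", "cat file.py"]))  # passes unchanged
```

A wrapper like this would sit inside the @env.tool() shell function, before the call to ShellTool.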
