Looking for specific tool implementations? This reference covers the tool system architecture and how to build custom tools. For documentation on built-in tools, see Scaffolding.

How Tools Work

HUD tools are async functions that:
  1. Receive structured input from agents (via MCP or native APIs)
  2. Execute actions against an environment, filesystem, or service
  3. Return ContentBlock lists — standardized MCP output (text, images, etc.)
Agent → Tool Call → BaseTool.__call__() → list[ContentBlock] → Agent
Tools integrate with providers through native specs — when Claude calls bash, it uses Anthropic’s native bash_20250124 API. When OpenAI calls shell, it uses their native format. HUD translates automatically.

BaseTool

All tools inherit from BaseTool. Implement __call__ to define behavior.
from hud.tools import BaseTool
from mcp.types import ContentBlock, TextContent

class MyTool(BaseTool):
    def __init__(self, config: str = "default"):
        super().__init__(
            name="my_tool",
            title="My Tool",
            description="Does something useful",
        )
        self.config = config
    
    async def __call__(self, query: str) -> list[ContentBlock]:
        result = await self._do_work(query)
        return [TextContent(text=result, type="text")]
    
    async def _do_work(self, query: str) -> str:
        return f"Processed: {query} with {self.config}"
Constructor Parameters:
| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| env | Any | Stateful context (executor, browser, etc.) | None |
| name | str | Tool name for MCP registration | Auto from class |
| title | str | Human-readable display name | Auto from class |
| description | str | Tool description for agents | Auto from docstring |
Properties:
  • mcp — FastMCP FunctionTool wrapper for server registration
  • native_specs — Dict mapping AgentType to NativeToolSpec
Registration:
from hud.server import MCPServer

mcp = MCPServer(name="my-env")
mcp.add_tool(MyTool())  # Automatically wraps with .mcp

Native Tool Specs

Tools can declare native API mappings for specific providers. This enables zero-translation tool calls for supported agents.
from hud.tools import BaseTool
from hud.tools.native_types import NativeToolSpec
from hud.types import AgentType

class BashTool(BaseTool):
    native_specs = {
        AgentType.CLAUDE: NativeToolSpec(
            api_type="bash_20250124",
            api_name="bash",
            beta="computer-use-2025-01-24",
            role="shell",
        ),
    }
NativeToolSpec Fields:
| Field | Type | Description |
| --- | --- | --- |
| api_type | str | Provider's tool type identifier |
| api_name | str | Provider's tool name |
| beta | str \| None | Required beta header (Anthropic) |
| role | str \| None | Logical role for exclusion ("shell", "editor", "memory") |
| supported_models | list[str] \| None | Glob patterns for compatible models |
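The supported_models patterns are presumably standard shell-style globs; Python's fnmatch illustrates the idea (assumption: HUD applies equivalent matching semantics — this is not HUD code):

```python
from fnmatch import fnmatch

# Shell-style glob matching as a model for supported_models patterns
# (assumption: HUD uses equivalent fnmatch-style semantics).
patterns = ["claude-3-5-*", "claude-3-7-*"]

def model_supported(model: str) -> bool:
    return any(fnmatch(model, p) for p in patterns)

print(model_supported("claude-3-5-sonnet-20241022"))  # True
print(model_supported("gpt-4o"))                      # False
```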
Role Exclusion: Tools with the same role are mutually exclusive — you can’t have both BashTool (Claude) and ShellTool (OpenAI) active. When an agent accepts one natively, others with the same role are excluded.
# Both have role="shell" — only one registers natively
env.add_tool(BashTool())    # Claude gets this natively
env.add_tool(ShellTool())   # OpenAI gets this natively
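The exclusion rule can be modeled in plain Python (an illustrative sketch, not HUD's implementation): for a given agent type, only the first tool claiming each role registers natively.

```python
# Illustrative sketch of role-based exclusion (not HUD's internals).
# Each tool maps agent types to native specs; for a given agent,
# at most one tool per role is kept.

def select_native_tools(tools, agent_type):
    """Return names of tools that register natively for agent_type,
    keeping only the first tool seen for each role."""
    chosen, taken_roles = [], set()
    for tool in tools:
        spec = tool["native_specs"].get(agent_type)
        if spec is None:
            continue  # no native mapping for this agent
        role = spec.get("role")
        if role is not None and role in taken_roles:
            continue  # another tool already owns this role
        if role is not None:
            taken_roles.add(role)
        chosen.append(tool["name"])
    return chosen

bash = {"name": "bash", "native_specs": {"claude": {"role": "shell"}}}
shell = {"name": "shell", "native_specs": {"openai": {"role": "shell"}}}

print(select_native_tools([bash, shell], "claude"))  # ['bash']
print(select_native_tools([bash, shell], "openai"))  # ['shell']
```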

Tool Hooks

Modify tool behavior without subclassing using @tool.before and @tool.after:
from hud.tools import BashTool
from hud.tools.types import ToolError

bash = BashTool()

@bash.before
async def validate(command: str | None = None, **kwargs):
    """Runs before execution. Raise to block, return dict to modify args."""
    if command and "rm -rf /" in command:
        raise ToolError("Blocked dangerous command")
    # Return modified kwargs, or None to proceed unchanged
    return {"command": command.strip()} if command else None

@bash.after
async def audit(command: str | None = None, result=None, **kwargs):
    """Runs after execution. Return to modify result, None to keep original."""
    print(f"Executed: {command}")
    return None  # Keep original result
@tool.before:
  • Receives all tool arguments as kwargs
  • Return dict to modify arguments before execution
  • Return None to proceed unchanged
  • Raise exception to block execution
@tool.after:
  • Receives tool arguments plus result= (the return value)
  • Return modified result to change output
  • Return None to keep original result
Hooks stack in registration order:
@bash.before
async def first_validation(**kwargs): ...

@bash.before  
async def second_validation(**kwargs): ...  # Runs after first
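How stacked before-hooks thread arguments through can be sketched in a simplified model (illustrative only, not HUD's dispatch code): each hook receives the current kwargs, a returned dict updates them, None leaves them unchanged, and an exception aborts the call.

```python
import asyncio

# Simplified model of stacked before-hooks (illustrative only).
async def run_before_hooks(hooks, **kwargs):
    for hook in hooks:
        updates = await hook(**kwargs)
        if updates is not None:
            kwargs.update(updates)  # hook modified the arguments
    return kwargs

async def strip_command(command=None, **kwargs):
    return {"command": command.strip()} if command else None

async def block_dangerous(command=None, **kwargs):
    if command and "rm -rf /" in command:
        raise RuntimeError("Blocked dangerous command")
    return None  # proceed unchanged

result = asyncio.run(
    run_before_hooks([strip_command, block_dangerous], command="  ls  ")
)
print(result)  # {'command': 'ls'}
```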

Common Types

ContentBlock

MCP standard output format. Tools return list[ContentBlock].
from mcp.types import TextContent, ImageContent

# Text output
TextContent(text="Operation complete", type="text")

# Image output  
ImageContent(data="base64_data", mimeType="image/png", type="image")

ContentResult

Helper for building tool outputs with multiple content types:
from hud.tools.types import ContentResult

result = ContentResult(
    output="Success message",
    error="Error details if any",
    base64_image="screenshot_data",
)

# Convert to list[ContentBlock]
blocks = result.to_content_blocks()

# For text-only output (returns list[TextContent])
text_blocks = result.to_text_blocks()

ToolError

Raise to return an error to the agent:
from hud.tools.types import ToolError

async def __call__(self, path: str) -> list[ContentBlock]:
    if not path:
        raise ToolError("path is required")
    # ...
ToolError messages are returned to the agent as text content, not raised as exceptions.
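That conversion can be pictured with a small sketch (illustrative only, not HUD's dispatch code): the framework catches ToolError and hands the agent a text block instead of propagating the exception.

```python
import asyncio

# Illustrative sketch: ToolError becomes text content for the agent
# instead of an exception (not HUD's actual dispatch code).

class ToolError(Exception):
    pass

async def safe_call(tool, **kwargs):
    try:
        return await tool(**kwargs)
    except ToolError as e:
        # The agent sees the message as ordinary text output.
        return [{"type": "text", "text": str(e)}]

async def read_file(path: str = ""):
    if not path:
        raise ToolError("path is required")
    return [{"type": "text", "text": f"contents of {path}"}]

blocks = asyncio.run(safe_call(read_file, path=""))
print(blocks)  # [{'type': 'text', 'text': 'path is required'}]
```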

EvaluationResult

For evaluation/scoring tools:
from hud.tools.types import EvaluationResult

result = EvaluationResult(
    reward=0.8,        # Score 0-1
    done=True,         # Task complete?
    content="Details", # Optional explanation
    info={"score": 80} # Metadata
)

BaseHub

Organize related tools into namespaced groups with a dispatcher pattern:
from hud.tools import BaseHub
from hud.tools.types import EvaluationResult

evaluators = BaseHub("evaluate")

@evaluators.tool("text_contains")
async def check_text(text: str, target: str) -> EvaluationResult:
    return EvaluationResult(
        reward=1.0 if target in text else 0.0,
        done=True,
    )

@evaluators.tool("url_matches")
async def check_url(url: str, expected: str) -> EvaluationResult:
    return EvaluationResult(
        reward=1.0 if url == expected else 0.0,
        done=True,
    )

# Mount on server — agents call: evaluate(name="text_contains", ...)
mcp.mount(evaluators)
Key Features:
  • Internal tools are hidden from MCP clients
  • Single dispatcher endpoint for all hub tools
  • Automatic resource catalog generation
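The dispatcher pattern behind a hub can be modeled in a few lines (an illustrative sketch, not BaseHub's implementation): registered tools live in a private map, and one endpoint routes calls by name.

```python
import asyncio

# Minimal model of a hub dispatcher (illustrative, not BaseHub itself):
# individual tools sit in a private registry hidden from clients, and
# a single entry point dispatches on the `name` argument.

class Hub:
    def __init__(self, endpoint):
        self.endpoint = endpoint
        self._tools = {}  # internal; only dispatch is exposed

    def tool(self, name):
        def register(fn):
            self._tools[name] = fn
            return fn
        return register

    async def dispatch(self, name, **kwargs):
        return await self._tools[name](**kwargs)

evaluators = Hub("evaluate")

@evaluators.tool("text_contains")
async def check_text(text, target):
    return 1.0 if target in text else 0.0

reward = asyncio.run(
    evaluators.dispatch("text_contains", text="hello world", target="world")
)
print(reward)  # 1.0
```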

AgentTool

Wrap a scenario as a callable tool for hierarchical agent systems:
from hud import Environment
from hud.tools import AgentTool

# Subagent environment
researcher = Environment(name="researcher")

@researcher.scenario("search")
async def search_web(query: str):
    yield f"Search for: {query}"
    # ... agent interaction ...

# Create orchestrator and add subagent as tool
orchestrator = Environment(name="orchestrator")

tool = AgentTool(
    researcher("search"),  # Task template
    model="gpt-4o-mini",
    name="web_search",
    description="Search the web for information",
)
orchestrator.add_tool(tool)
Constructor Parameters:
| Parameter | Type | Description |
| --- | --- | --- |
| task | Task | Task template from env("scenario") |
| model | str | Model for subagent (via gateway) |
| agent | type[MCPAgent] | Custom agent class (alternative to model) |
| name | str | Tool name for orchestrator |
| description | str | Tool description |
Eval-Only Parameters: Parameters typed with | None and defaulting to None are hidden from the orchestrator but remain available for evaluation scoring:
@env.scenario("investigate")
async def investigate(
    query: str,                          # Visible to orchestrator
    expected_finding: str | None = None, # Hidden — only for eval
):
    response = yield f"Investigate: {query}"
    if expected_finding:
        yield 1.0 if expected_finding in response else 0.0

Executors

Executors provide platform-specific implementations for computer control tools.

BaseExecutor

Abstract base for all executors:
from hud.tools.executors import BaseExecutor

class MyExecutor(BaseExecutor):
    async def click(self, x: int, y: int, **kwargs) -> None: ...
    async def write(self, text: str, **kwargs) -> None: ...
    async def press(self, keys: list[str]) -> None: ...
    async def screenshot(self) -> bytes: ...
    async def get_screen_size(self) -> tuple[int, int]: ...
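For tests, a stub executor that records actions instead of performing them is often handy. The sketch below is shown standalone so it runs anywhere; in real code it would subclass hud.tools.executors.BaseExecutor and implement the same methods:

```python
import asyncio

# A stub executor that records actions instead of performing them.
# Shown standalone for illustration; in real code this would subclass
# hud.tools.executors.BaseExecutor.

class RecordingExecutor:
    def __init__(self):
        self.actions = []

    async def click(self, x: int, y: int, **kwargs) -> None:
        self.actions.append(("click", x, y))

    async def write(self, text: str, **kwargs) -> None:
        self.actions.append(("write", text))

    async def press(self, keys: list[str]) -> None:
        self.actions.append(("press", tuple(keys)))

    async def screenshot(self) -> bytes:
        return b""  # no real display; return an empty payload

    async def get_screen_size(self) -> tuple[int, int]:
        return (1920, 1080)

async def demo():
    ex = RecordingExecutor()
    await ex.click(10, 20)
    await ex.write("hello")
    return ex.actions

print(asyncio.run(demo()))  # [('click', 10, 20), ('write', 'hello')]
```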

Built-in Executors

| Executor | Platform | Features |
| --- | --- | --- |
| PyAutoGUIExecutor | Cross-platform | Real mouse/keyboard, screenshots |
| XDOExecutor | Linux/X11 | Native X11, faster on Linux |
from hud.tools.executors import PyAutoGUIExecutor, XDOExecutor
from hud.tools import HudComputerTool

# Cross-platform
computer = HudComputerTool(executor=PyAutoGUIExecutor())

# Linux with specific display
computer = HudComputerTool(executor=XDOExecutor(display_num=1))

Callback Functions

Monitor and hook into tool actions:
class MyTool(BaseTool):
    def __init__(self):
        super().__init__(name="my_tool")
        self.add_callback("action_complete", self._on_complete)
    
    async def _on_complete(self, **kwargs) -> None:
        print(f"Action completed: {kwargs}")
    
    async def __call__(self, action: str) -> list[ContentBlock]:
        result = await self._do_action(action)
        await self._trigger_callbacks("action_complete", action=action)
        return result
Callback Methods:
add_callback(event_type: str, callback: Callable)
remove_callback(event_type: str, callback: Callable)
_trigger_callbacks(event_type: str, **kwargs)  # Call from tool methods
Callbacks must be async def.
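The registry these methods imply can be sketched as follows (an illustrative model, not HUD's internals): callbacks are stored per event type and awaited in registration order.

```python
import asyncio

# Illustrative model of the callback registry (not HUD's internals):
# callbacks are stored per event type and awaited in registration order.

class CallbackMixin:
    def __init__(self):
        self._callbacks = {}

    def add_callback(self, event_type, callback):
        self._callbacks.setdefault(event_type, []).append(callback)

    def remove_callback(self, event_type, callback):
        self._callbacks.get(event_type, []).remove(callback)

    async def _trigger_callbacks(self, event_type, **kwargs):
        for cb in self._callbacks.get(event_type, []):
            await cb(**kwargs)  # callbacks must be async def

events = []

async def on_complete(**kwargs):
    events.append(kwargs)

async def demo():
    obj = CallbackMixin()
    obj.add_callback("action_complete", on_complete)
    await obj._trigger_callbacks("action_complete", action="ls")

asyncio.run(demo())
print(events)  # [{'action': 'ls'}]
```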

Complete Example

Custom database tool with validation and logging:
from hud.tools import BaseTool
from hud.tools.types import ContentResult, ToolError
from mcp.types import ContentBlock

class DatabaseTool(BaseTool):
    def __init__(self, connection_string: str):
        super().__init__(
            name="database",
            title="Database Query",
            description="Execute read-only SQL queries",
        )
        self.conn_string = connection_string
        self._conn = None
    
    async def __call__(
        self,
        query: str,
        limit: int = 100,
    ) -> list[ContentBlock]:
        if not query.strip().upper().startswith("SELECT"):
            raise ToolError("Only SELECT queries are allowed")
        
        try:
            conn = await self._get_connection()
            results = await conn.fetch(f"{query} LIMIT {limit}")
            
            return ContentResult(
                output=self._format_results(results)
            ).to_content_blocks()
            
        except Exception as e:
            raise ToolError(f"Query failed: {e}")
    
    async def _get_connection(self):
        if not self._conn:
            import asyncpg
            self._conn = await asyncpg.connect(self.conn_string)
        return self._conn
    
    def _format_results(self, rows: list) -> str:
        if not rows:
            return "No results"
        return "\n".join(str(dict(row)) for row in rows)

# Usage with hooks
db = DatabaseTool("postgresql://...")

@db.before
async def audit_query(query: str = "", **kwargs):
    print(f"Executing: {query[:50]}...")

@db.after  
async def log_result(result=None, **kwargs):
    print(f"Returned {len(result)} blocks")

See Also