Looking for specific tool implementations? This reference covers the tool system architecture and how to build custom tools. For documentation on built-in tools, see the Tools Guide.

How Tools Work

HUD tools are async functions that:
  1. Receive structured input from agents (via MCP or native APIs)
  2. Execute actions against an environment, filesystem, or service
  3. Return ContentBlock lists — standardized MCP output (text, images, etc.)
Agent → Tool Call → BaseTool.__call__() → list[ContentBlock] → Agent
Tools integrate with providers through native specs — when Claude calls bash, it uses Anthropic’s native bash_20250124 API. When OpenAI calls shell, it uses their native format. HUD translates automatically.

BaseTool

All tools inherit from BaseTool. Implement __call__ to define behavior.
from hud.tools import BaseTool
from mcp.types import ContentBlock, TextContent

class MyTool(BaseTool):
    def __init__(self, config: str = "default"):
        super().__init__(
            name="my_tool",
            title="My Tool",
            description="Does something useful",
        )
        self.config = config
    
    async def __call__(self, query: str) -> list[ContentBlock]:
        result = await self._do_work(query)
        return [TextContent(text=result, type="text")]
    
    async def _do_work(self, query: str) -> str:
        return f"Processed: {query} with {self.config}"
Constructor Parameters:
| Parameter | Type | Description | Default |
|---|---|---|---|
| env | Any | Stateful context (executor, browser, etc.) | None |
| name | str | Tool name for MCP registration | Auto from class |
| title | str | Human-readable display name | Auto from class |
| description | str | Tool description for agents | Auto from docstring |
Properties:
  • mcp — FastMCP FunctionTool wrapper for server registration
  • native_specs — Dict mapping AgentType to NativeToolSpec
Registration:
from hud.server import MCPServer

mcp = MCPServer(name="my-env")
mcp.add_tool(MyTool())  # Automatically wraps with .mcp
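Because behavior lives in __call__, a tool instance is itself an async callable, which makes quick local testing easy. A minimal sketch using the MyTool class from above:
import asyncio

async def main():
    tool = MyTool(config="demo")
    blocks = await tool(query="hello")
    print(blocks[0].text)  # "Processed: hello with demo"

asyncio.run(main())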

Native Tool Specs

Tools can declare native API mappings for specific providers. This enables zero-translation tool calls for supported agents.
from hud.tools import BaseTool
from hud.tools.native_types import NativeToolSpec
from hud.types import AgentType

class BashTool(BaseTool):
    native_specs = {
        AgentType.CLAUDE: NativeToolSpec(
            api_type="bash_20250124",
            api_name="bash",
            beta="computer-use-2025-01-24",
            role="shell",
        ),
    }
NativeToolSpec Fields:
| Field | Type | Description |
|---|---|---|
| api_type | str | Provider's tool type identifier |
| api_name | str | Provider's tool name |
| beta | str \| None | Required beta header (Anthropic) |
| role | str \| None | Logical role for exclusion ("shell", "editor", "memory") |
| supported_models | list[str] \| None | Glob patterns for compatible models |
Role Exclusion: Tools that share a role are mutually exclusive per agent. When an agent accepts one of them natively, other tools with the same role are excluded for that agent, so you can register both BashTool (Claude) and ShellTool (OpenAI) and each agent sees only the one it supports natively.
# Both have role="shell" — only one registers natively
env.add_tool(BashTool())    # Claude gets this natively
env.add_tool(ShellTool())   # OpenAI gets this natively

Tool Hooks

Modify tool behavior without subclassing using @tool.before and @tool.after:
from hud.tools import BashTool
from hud.tools.types import ToolError

bash = BashTool()

@bash.before
async def validate(command: str | None = None, **kwargs):
    """Runs before execution. Raise to block, return dict to modify args."""
    if command and "rm -rf /" in command:
        raise ToolError("Blocked dangerous command")
    # Return modified kwargs, or None to proceed unchanged
    return {"command": command.strip()} if command else None

@bash.after
async def audit(command: str | None = None, result=None, **kwargs):
    """Runs after execution. Return to modify result, None to keep original."""
    print(f"Executed: {command}")
    return None  # Keep original result
@tool.before:
  • Receives all tool arguments as kwargs
  • Return dict to modify arguments before execution
  • Return None to proceed unchanged
  • Raise exception to block execution
@tool.after:
  • Receives tool arguments plus result= (the return value)
  • Return modified result to change output
  • Return None to keep original result
Hooks stack in registration order:
@bash.before
async def first_validation(**kwargs): ...

@bash.before  
async def second_validation(**kwargs): ...  # Runs after first
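When multiple before-hooks modify arguments, each returned dict naturally feeds the next hook in line. A hedged sketch of that pattern (whether the framework merges or replaces kwargs between hooks is an assumption here):
@bash.before
async def normalize(command: str | None = None, **kwargs):
    # Runs first: trim whitespace
    return {"command": command.strip()} if command else None

@bash.before
async def add_timeout(command: str | None = None, **kwargs):
    # Runs second: sees the trimmed command if hooks chain
    if command and not command.startswith("timeout"):
        return {"command": f"timeout 30 {command}"}
    return None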

Common Types

ContentBlock

MCP standard output format. Tools return list[ContentBlock].
from mcp.types import TextContent, ImageContent

# Text output
TextContent(text="Operation complete", type="text")

# Image output  
ImageContent(data="base64_data", mimeType="image/png", type="image")
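A single response can mix block types, for example a status line plus a screenshot. A short sketch (the _capture helper is hypothetical and assumed to return base64-encoded PNG data):
from hud.tools import BaseTool
from mcp.types import ContentBlock, ImageContent, TextContent

class ScreenshotTool(BaseTool):
    async def __call__(self, action: str) -> list[ContentBlock]:
        png_b64 = await self._capture()  # hypothetical: base64 PNG string
        return [
            TextContent(text=f"Performed: {action}", type="text"),
            ImageContent(data=png_b64, mimeType="image/png", type="image"),
        ]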

ContentResult

Helper for building tool outputs with multiple content types:
from hud.tools.types import ContentResult

result = ContentResult(
    output="Success message",
    error="Error details if any",
    base64_image="screenshot_data",
)

# Convert to list[ContentBlock]
blocks = result.to_content_blocks()

# For text-only output (returns list[TextContent])
text_blocks = result.to_text_blocks()

ToolError

Raise to return an error to the agent:
from hud.tools.types import ToolError

async def __call__(self, path: str) -> list[ContentBlock]:
    if not path:
        raise ToolError("path is required")
    # ...
When a tool raises ToolError, the framework catches it and returns the message to the agent as text content; it does not propagate as an exception.

EvaluationResult

For evaluation/scoring tools:
from hud.tools.types import EvaluationResult

result = EvaluationResult(
    reward=0.8,        # Score 0-1
    done=True,         # Task complete?
    content="Details", # Optional explanation
    info={"score": 80} # Metadata
)

BaseHub

Organize related tools into namespaced groups with a dispatcher pattern:
from hud.tools import BaseHub
from hud.tools.types import EvaluationResult

evaluators = BaseHub("evaluate")

@evaluators.tool("text_contains")
async def check_text(text: str, target: str) -> EvaluationResult:
    return EvaluationResult(
        reward=1.0 if target in text else 0.0,
        done=True,
    )

@evaluators.tool("url_matches")
async def check_url(url: str, expected: str) -> EvaluationResult:
    return EvaluationResult(
        reward=1.0 if url == expected else 0.0,
        done=True,
    )

# Mount on server — agents call: evaluate(name="text_contains", ...)
mcp.mount(evaluators)
Key Features:
  • Internal tools are hidden from MCP clients
  • Single dispatcher endpoint for all hub tools
  • Automatic resource catalog generation
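From the client side, every hub tool goes through the single evaluate dispatcher. A sketch using the MCP Python client (the flattened argument shape follows the mount comment above and is an assumption):
from mcp import ClientSession

async def run_eval(session: ClientSession):
    # Dispatch to the "text_contains" evaluator through the hub endpoint
    result = await session.call_tool(
        "evaluate",
        {"name": "text_contains", "text": "hello world", "target": "hello"},
    )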

AgentTool

Wrap a scenario as a callable tool for hierarchical agent systems:
from hud import Environment
from hud.tools import AgentTool

# Subagent environment
researcher = Environment(name="researcher")

@researcher.scenario("search")
async def search_web(query: str):
    yield f"Search for: {query}"
    # ... agent interaction ...

# Create orchestrator and add subagent as tool
orchestrator = Environment(name="orchestrator")

tool = AgentTool(
    researcher("search"),  # Task template
    model="gpt-4o-mini",
    name="web_search",
    description="Search the web for information",
)
orchestrator.add_tool(tool)
Constructor Parameters:
| Parameter | Type | Description |
|---|---|---|
| task | Task | Task template from env("scenario") |
| model | str | Model for subagent (via gateway) |
| agent | type[MCPAgent] | Custom agent class (alternative to model) |
| name | str | Tool name for orchestrator |
| description | str | Tool description |
Eval-Only Parameters: Scenario parameters annotated | None = None are hidden from the orchestrator but remain available for evaluation scoring:
@env.scenario("investigate")
async def investigate(
    query: str,                          # Visible to orchestrator
    expected_finding: str | None = None, # Hidden — only for eval
):
    response = yield f"Investigate: {query}"
    if expected_finding:
        yield 1.0 if expected_finding in response else 0.0

Executors

Executors provide platform-specific implementations for computer control tools.

BaseExecutor

Abstract base for all executors:
from hud.tools.executors import BaseExecutor

class MyExecutor(BaseExecutor):
    async def click(self, x: int, y: int, **kwargs) -> None: ...
    async def write(self, text: str, **kwargs) -> None: ...
    async def press(self, keys: list[str]) -> None: ...
    async def screenshot(self) -> bytes: ...
    async def get_screen_size(self) -> tuple[int, int]: ...
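A concrete executor fills in those methods for its platform. As a hedged sketch (assuming BaseExecutor requires only the methods shown above), a recording executor that logs actions instead of performing them can be handy in tests:
class RecordingExecutor(BaseExecutor):
    """Records actions instead of performing them (useful in tests)."""

    def __init__(self):
        self.actions: list[str] = []

    async def click(self, x: int, y: int, **kwargs) -> None:
        self.actions.append(f"click({x}, {y})")

    async def write(self, text: str, **kwargs) -> None:
        self.actions.append(f"write({text!r})")

    async def press(self, keys: list[str]) -> None:
        self.actions.append(f"press({keys})")

    async def screenshot(self) -> bytes:
        return b""  # nothing to capture

    async def get_screen_size(self) -> tuple[int, int]:
        return (1920, 1080)  # fixed size for tests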

Built-in Executors

| Executor | Platform | Features |
|---|---|---|
| PyAutoGUIExecutor | Cross-platform | Real mouse/keyboard, screenshots |
| XDOExecutor | Linux/X11 | Native X11, faster on Linux |
from hud.tools.executors import PyAutoGUIExecutor, XDOExecutor
from hud.tools import HudComputerTool

# Cross-platform
computer = HudComputerTool(executor=PyAutoGUIExecutor())

# Linux with specific display
computer = HudComputerTool(executor=XDOExecutor(display_num=1))

Callback Functions

Monitor and hook into tool actions:
from hud.tools import BaseTool
from mcp.types import ContentBlock, TextContent

class MyTool(BaseTool):
    def __init__(self):
        super().__init__(name="my_tool")
        self.add_callback("action_complete", self._on_complete)
    
    async def _on_complete(self, **kwargs) -> None:
        print(f"Action completed: {kwargs}")
    
    async def __call__(self, action: str) -> list[ContentBlock]:
        result = await self._do_action(action)
        await self._trigger_callbacks("action_complete", action=action)
        return result
    
    async def _do_action(self, action: str) -> list[ContentBlock]:
        return [TextContent(text=f"Did: {action}", type="text")]
Callback Methods:
  • add_callback(event_type: str, callback: Callable)
  • remove_callback(event_type: str, callback: Callable)
  • _trigger_callbacks(event_type: str, **kwargs)  # Call from tool methods
Callbacks must be async def.
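Callbacks can also be attached and detached from outside the tool, for example for temporary monitoring. A small sketch using the MyTool class above:
tool = MyTool()

async def monitor(**kwargs) -> None:
    print(f"Observed: {kwargs}")

tool.add_callback("action_complete", monitor)
# ... run actions while monitoring ...
tool.remove_callback("action_complete", monitor)  # detach when done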

Complete Example

Custom database tool with validation and logging:
from hud.tools import BaseTool
from hud.tools.types import ContentResult, ToolError
from mcp.types import ContentBlock

class DatabaseTool(BaseTool):
    def __init__(self, connection_string: str):
        super().__init__(
            name="database",
            title="Database Query",
            description="Execute read-only SQL queries",
        )
        self.conn_string = connection_string
        self._conn = None
    
    async def __call__(
        self,
        query: str,
        limit: int = 100,
    ) -> list[ContentBlock]:
        if not query.strip().upper().startswith("SELECT"):
            raise ToolError("Only SELECT queries are allowed")
        
        try:
            conn = await self._get_connection()
            results = await conn.fetch(f"{query} LIMIT {limit}")
            
            return ContentResult(
                output=self._format_results(results)
            ).to_content_blocks()
            
        except Exception as e:
            raise ToolError(f"Query failed: {e}")
    
    async def _get_connection(self):
        if not self._conn:
            import asyncpg
            self._conn = await asyncpg.connect(self.conn_string)
        return self._conn
    
    def _format_results(self, rows: list) -> str:
        if not rows:
            return "No results"
        return "\n".join(str(dict(row)) for row in rows)

# Usage with hooks
db = DatabaseTool("postgresql://...")

@db.before
async def audit_query(query: str = "", **kwargs):
    print(f"Executing: {query[:50]}...")

@db.after  
async def log_result(result=None, **kwargs):
    print(f"Returned {len(result or [])} blocks")
    return None  # Keep original result
