Documentation Index
Fetch the complete documentation index at: https://docs.hud.ai/llms.txt
Use this file to discover all available pages before exploring further.
Looking for specific tool implementations? This reference covers the tool system architecture and how to build custom tools. For documentation on built-in tools, see Scaffolding.
HUD tools are async functions that:
- Receive structured input from agents (via MCP or native APIs)
- Execute actions against an environment, filesystem, or service
- Return `ContentBlock` lists — standardized MCP output (text, images, etc.)
```
Agent → Tool Call → BaseTool.__call__() → list[ContentBlock] → Agent
```
Tools integrate with providers through native specs — when Claude calls `bash`, it uses Anthropic's native `bash_20250124` API. When OpenAI calls `shell`, it uses their native format. HUD translates automatically.
All tools inherit from `BaseTool`. Implement `__call__` to define behavior.
```python
from hud.tools import BaseTool
from mcp.types import ContentBlock, TextContent

class MyTool(BaseTool):
    def __init__(self, config: str = "default"):
        super().__init__(
            name="my_tool",
            title="My Tool",
            description="Does something useful",
        )
        self.config = config

    async def __call__(self, query: str) -> list[ContentBlock]:
        result = await self._do_work(query)
        return [TextContent(text=result, type="text")]

    async def _do_work(self, query: str) -> str:
        return f"Processed: {query} with {self.config}"
```
Constructor Parameters:

| Parameter | Type | Description | Default |
|---|---|---|---|
| `env` | `Any` | Stateful context (executor, browser, etc.) | `None` |
| `name` | `str` | Tool name for MCP registration | Auto from class |
| `title` | `str` | Human-readable display name | Auto from class |
| `description` | `str` | Tool description for agents | Auto from docstring |
Properties:
- `mcp` — FastMCP `FunctionTool` wrapper for server registration
- `native_specs` — Dict mapping `AgentType` to `NativeToolSpec`
Registration:

```python
from hud.server import MCPServer

mcp = MCPServer(name="my-env")
mcp.add_tool(MyTool())  # Automatically wraps with .mcp
```
Tools can declare native API mappings for specific providers. This enables zero-translation tool calls for supported agents.
```python
from hud.tools import BaseTool
from hud.tools.native_types import NativeToolSpec
from hud.types import AgentType

class BashTool(BaseTool):
    native_specs = {
        AgentType.CLAUDE: NativeToolSpec(
            api_type="bash_20250124",
            api_name="bash",
            beta="computer-use-2025-01-24",
            role="shell",
        ),
    }
```
NativeToolSpec Fields:

| Field | Type | Description |
|---|---|---|
| `api_type` | `str` | Provider's tool type identifier |
| `api_name` | `str` | Provider's tool name |
| `beta` | `str \| None` | Required beta header (Anthropic) |
| `role` | `str \| None` | Logical role for exclusion (`"shell"`, `"editor"`, `"memory"`) |
| `supported_models` | `list[str] \| None` | Glob patterns for compatible models |
Role Exclusion:
Tools with the same `role` are mutually exclusive for a given agent — `BashTool` (Claude) and `ShellTool` (OpenAI) can both be registered, but when an agent accepts one natively, others with the same role are excluded for that agent.
```python
# Both have role="shell" — only one registers natively per agent
env.add_tool(BashTool())   # Claude gets this natively
env.add_tool(ShellTool())  # OpenAI gets this natively
```
Modify tool behavior without subclassing using `@tool.before` and `@tool.after`:
```python
from hud.tools import BashTool
from hud.tools.types import ToolError

bash = BashTool()

@bash.before
async def validate(command: str | None = None, **kwargs):
    """Runs before execution. Raise to block, return dict to modify args."""
    if command and "rm -rf /" in command:
        raise ToolError("Blocked dangerous command")
    # Return modified kwargs, or None to proceed unchanged
    return {"command": command.strip()} if command else None

@bash.after
async def audit(command: str | None = None, result=None, **kwargs):
    """Runs after execution. Return to modify result, None to keep original."""
    print(f"Executed: {command}")
    return None  # Keep original result
```
`@tool.before`:
- Receives all tool arguments as kwargs
- Return a `dict` to modify arguments before execution
- Return `None` to proceed unchanged
- Raise an exception to block execution

`@tool.after`:
- Receives tool arguments plus `result=` (the return value)
- Return a modified result to change the output
- Return `None` to keep the original result
Hooks stack in registration order:

```python
@bash.before
async def first_validation(**kwargs): ...

@bash.before
async def second_validation(**kwargs): ...  # Runs after first
```
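The before/after hook mechanics described above can be sketched without HUD. The class and function names below (`HookedTool`, `echo`, `strip_ws`) are illustrative, not part of the HUD API — this is a simplified model of the dispatch order, not HUD's implementation:

```python
import asyncio

class HookedTool:
    """Minimal sketch of before/after hook dispatch (not the HUD implementation)."""

    def __init__(self, fn):
        self._fn = fn
        self._before = []
        self._after = []

    def before(self, hook):
        self._before.append(hook)
        return hook

    def after(self, hook):
        self._after.append(hook)
        return hook

    async def __call__(self, **kwargs):
        # Before hooks run in registration order; a dict return replaces
        # the kwargs, None leaves them unchanged, an exception blocks the call.
        for hook in self._before:
            modified = await hook(**kwargs)
            if modified is not None:
                kwargs = modified
        result = await self._fn(**kwargs)
        # After hooks: a non-None return replaces the result.
        for hook in self._after:
            replaced = await hook(result=result, **kwargs)
            if replaced is not None:
                result = replaced
        return result

async def echo(command: str):
    return f"ran: {command}"

tool = HookedTool(echo)

@tool.before
async def strip_ws(command: str, **kwargs):
    return {"command": command.strip()}

print(asyncio.run(tool(command="  ls  ")))  # → ran: ls
```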
## Common Types

### ContentBlock

MCP standard output format. Tools return `list[ContentBlock]`.
```python
from mcp.types import TextContent, ImageContent

# Text output
TextContent(text="Operation complete", type="text")

# Image output
ImageContent(data="base64_data", mimeType="image/png", type="image")
```
### ContentResult

Helper for building tool outputs with multiple content types:

```python
from hud.tools.types import ContentResult

result = ContentResult(
    output="Success message",
    error="Error details if any",
    base64_image="screenshot_data",
)

# Convert to list[ContentBlock]
blocks = result.to_content_blocks()

# For text-only output (returns list[TextContent])
text_blocks = result.to_text_blocks()
```
### ToolError

Raise to return an error to the agent:

```python
from hud.tools.types import ToolError

async def __call__(self, path: str) -> list[ContentBlock]:
    if not path:
        raise ToolError("path is required")
    # ...
```
`ToolError` messages are delivered to the agent as text content; the exception does not propagate to the caller.
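The conversion from exception to agent-visible text can be pictured with this sketch. The `call_tool` and `read_file` names are illustrative stand-ins, and the local `TextContent`/`ToolError` classes here only mimic the real types — this is not HUD's actual dispatch code:

```python
import asyncio
from dataclasses import dataclass

@dataclass
class TextContent:
    """Stand-in for mcp.types.TextContent."""
    text: str
    type: str = "text"

class ToolError(Exception):
    """Stand-in for hud.tools.types.ToolError."""

async def call_tool(tool, **kwargs):
    # The framework catches ToolError and hands the agent a text block
    # instead of letting the exception propagate.
    try:
        return await tool(**kwargs)
    except ToolError as e:
        return [TextContent(text=f"Error: {e}")]

async def read_file(path: str):
    if not path:
        raise ToolError("path is required")
    return [TextContent(text=f"contents of {path}")]

blocks = asyncio.run(call_tool(read_file, path=""))
print(blocks[0].text)  # → Error: path is required
```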
### EvaluationResult

For evaluation/scoring tools:

```python
from hud.tools.types import EvaluationResult

result = EvaluationResult(
    reward=0.8,         # Score 0-1
    done=True,          # Task complete?
    content="Details",  # Optional explanation
    info={"score": 80}, # Metadata
)
```
## BaseHub

Organize related tools into namespaced groups with a dispatcher pattern:

```python
from hud.tools import BaseHub
from hud.tools.types import EvaluationResult

evaluators = BaseHub("evaluate")

@evaluators.tool("text_contains")
async def check_text(text: str, target: str) -> EvaluationResult:
    return EvaluationResult(
        reward=1.0 if target in text else 0.0,
        done=True,
    )

@evaluators.tool("url_matches")
async def check_url(url: str, expected: str) -> EvaluationResult:
    return EvaluationResult(
        reward=1.0 if url == expected else 0.0,
        done=True,
    )

# Mount on server — agents call: evaluate(name="text_contains", ...)
mcp.mount(evaluators)
```
Key Features:
- Internal tools are hidden from MCP clients
- Single dispatcher endpoint for all hub tools
- Automatic resource catalog generation
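The dispatcher pattern itself can be sketched in plain Python: many internal handlers, one public endpoint routing on `name`. The `Hub` class and its `dispatch` method below are illustrative, not the `BaseHub` implementation:

```python
import asyncio

class Hub:
    """Minimal dispatcher sketch: many internal tools, one public endpoint."""

    def __init__(self, endpoint: str):
        self.endpoint = endpoint
        self._tools = {}  # internal tools, hidden from clients

    def tool(self, name):
        def register(fn):
            self._tools[name] = fn
            return fn
        return register

    async def dispatch(self, name: str, **kwargs):
        # The single public endpoint routes to the named internal tool.
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return await self._tools[name](**kwargs)

evaluators = Hub("evaluate")

@evaluators.tool("text_contains")
async def check_text(text: str, target: str) -> float:
    return 1.0 if target in text else 0.0

# Clients see only the "evaluate" endpoint, not the individual tools.
print(asyncio.run(evaluators.dispatch("text_contains", text="hello world", target="world")))  # → 1.0
```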
## AgentTool

Wrap a scenario as a callable tool for hierarchical agent systems:

```python
from hud import Environment
from hud.tools import AgentTool

# Subagent environment
researcher = Environment(name="researcher")

@researcher.scenario("search")
async def search_web(query: str):
    yield f"Search for: {query}"
    # ... agent interaction ...

# Create orchestrator and add subagent as tool
orchestrator = Environment(name="orchestrator")
tool = AgentTool(
    researcher("search"),  # Task template
    model="gpt-4o-mini",
    name="web_search",
    description="Search the web for information",
)
orchestrator.add_tool(tool)
```
Constructor Parameters:

| Parameter | Type | Description |
|---|---|---|
| `task` | `Task` | Task template from `env("scenario")` |
| `model` | `str` | Model for subagent (via gateway) |
| `agent` | `type[MCPAgent]` | Custom agent class (alternative to `model`) |
| `name` | `str` | Tool name for orchestrator |
| `description` | `str` | Tool description |
Eval-Only Parameters:

Parameters typed with `| None = None` are hidden from the orchestrator but available for evaluation scoring:
```python
@env.scenario("investigate")
async def investigate(
    query: str,                           # Visible to orchestrator
    expected_finding: str | None = None,  # Hidden — only for eval
):
    response = yield f"Investigate: {query}"
    if expected_finding:
        yield 1.0 if expected_finding in response else 0.0
```
## Executors

Executors provide platform-specific implementations for computer control tools.

### BaseExecutor

Abstract base for all executors:

```python
from hud.tools.executors import BaseExecutor

class MyExecutor(BaseExecutor):
    async def click(self, x: int, y: int, **kwargs) -> None: ...
    async def write(self, text: str, **kwargs) -> None: ...
    async def press(self, keys: list[str]) -> None: ...
    async def screenshot(self) -> bytes: ...
    async def get_screen_size(self) -> tuple[int, int]: ...
```
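For unit tests, the interface above can be satisfied by a recording stub that never touches real input devices. `RecordingExecutor` is an illustrative pattern, not part of the HUD library, and it implements the method signatures standalone rather than subclassing `BaseExecutor`:

```python
import asyncio

class RecordingExecutor:
    """Fake executor that records actions instead of driving a real display."""

    def __init__(self, width: int = 1920, height: int = 1080):
        self.actions: list[tuple] = []
        self._size = (width, height)

    async def click(self, x: int, y: int, **kwargs) -> None:
        self.actions.append(("click", x, y))

    async def write(self, text: str, **kwargs) -> None:
        self.actions.append(("write", text))

    async def press(self, keys: list[str]) -> None:
        self.actions.append(("press", tuple(keys)))

    async def screenshot(self) -> bytes:
        return b""  # a real stub might return a fixed PNG

    async def get_screen_size(self) -> tuple[int, int]:
        return self._size

ex = RecordingExecutor()
asyncio.run(ex.click(10, 20))
print(ex.actions)  # → [('click', 10, 20)]
```

Assertions against `ex.actions` then verify that a tool issued the expected sequence of actions.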
### Built-in Executors

| Executor | Platform | Features |
|---|---|---|
| `PyAutoGUIExecutor` | Cross-platform | Real mouse/keyboard, screenshots |
| `XDOExecutor` | Linux/X11 | Native X11, faster on Linux |
```python
from hud.tools.executors import PyAutoGUIExecutor, XDOExecutor
from hud.tools import HudComputerTool

# Cross-platform
computer = HudComputerTool(executor=PyAutoGUIExecutor())

# Linux with specific display
computer = HudComputerTool(executor=XDOExecutor(display_num=1))
```
## Callback Functions

Monitor and hook into tool actions:

```python
class MyTool(BaseTool):
    def __init__(self):
        super().__init__(name="my_tool")
        self.add_callback("action_complete", self._on_complete)

    async def _on_complete(self, **kwargs) -> None:
        print(f"Action completed: {kwargs}")

    async def __call__(self, action: str) -> list[ContentBlock]:
        result = await self._do_action(action)
        await self._trigger_callbacks("action_complete", action=action)
        return result
```
Callback Methods:
- `add_callback(event_type: str, callback: Callable)`
- `remove_callback(event_type: str, callback: Callable)`
- `_trigger_callbacks(event_type: str, **kwargs)` — call from tool methods

Callbacks must be `async def`.
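The registry behind these three methods can be sketched as follows. `CallbackMixin` is an illustrative stand-in for HUD's internals, assuming only the method names listed above:

```python
import asyncio
from collections import defaultdict

class CallbackMixin:
    """Minimal async callback registry mirroring the methods listed above."""

    def __init__(self):
        self._callbacks = defaultdict(list)

    def add_callback(self, event_type, callback):
        self._callbacks[event_type].append(callback)

    def remove_callback(self, event_type, callback):
        self._callbacks[event_type].remove(callback)

    async def _trigger_callbacks(self, event_type, **kwargs):
        # Callbacks are awaited one by one, in registration order.
        for cb in self._callbacks[event_type]:
            await cb(**kwargs)

seen = []
obj = CallbackMixin()

async def on_done(**kwargs):
    seen.append(kwargs)

obj.add_callback("action_complete", on_done)
asyncio.run(obj._trigger_callbacks("action_complete", action="save"))
print(seen)  # → [{'action': 'save'}]
```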
## Complete Example

Custom database tool with validation and logging:

```python
from hud.tools import BaseTool
from hud.tools.types import ContentResult, ToolError
from mcp.types import ContentBlock

class DatabaseTool(BaseTool):
    def __init__(self, connection_string: str):
        super().__init__(
            name="database",
            title="Database Query",
            description="Execute read-only SQL queries",
        )
        self.conn_string = connection_string
        self._conn = None

    async def __call__(
        self,
        query: str,
        limit: int = 100,
    ) -> list[ContentBlock]:
        if not query.strip().upper().startswith("SELECT"):
            raise ToolError("Only SELECT queries are allowed")
        try:
            conn = await self._get_connection()
            results = await conn.fetch(f"{query} LIMIT {limit}")
            return ContentResult(
                output=self._format_results(results)
            ).to_content_blocks()
        except Exception as e:
            raise ToolError(f"Query failed: {e}")

    async def _get_connection(self):
        if not self._conn:
            import asyncpg
            self._conn = await asyncpg.connect(self.conn_string)
        return self._conn

    def _format_results(self, rows: list) -> str:
        if not rows:
            return "No results"
        return "\n".join(str(dict(row)) for row in rows)

# Usage with hooks
db = DatabaseTool("postgresql://...")

@db.before
async def audit_query(query: str = "", **kwargs):
    print(f"Executing: {query[:50]}...")

@db.after
async def log_result(result=None, **kwargs):
    print(f"Returned {len(result)} blocks")
```