HUD Documentation — Evaluations and RL Environments.

Computer tools let agents interact with GUIs—click, type, scroll, drag, screenshot. Each provider has its own computer use API. Pick the one that matches your agent.

Quick Reference

Tool	Agent	Default Resolution
`AnthropicComputerTool`	Claude	1280×720
`OpenAIComputerTool`	OpenAI / Operator	1920×1080
`GeminiComputerTool`	Gemini	1280×720
`HudComputerTool`	Any (function calling)	1280×720

AnthropicComputerTool

For Claude. Uses computer_20250124 native API.

from hud.tools import AnthropicComputerTool

computer = AnthropicComputerTool(
    display_width=1280,
    display_height=720,
    rescale_images=True,
)

Actions: screenshot, click, double_click, mouse_move, left_click_drag, type, key, scroll, wait

# Screenshot
await computer(action="screenshot")

# Click
await computer(action="click", coordinate=[640, 360])

# Type text
await computer(action="type", text="Hello, World!")

# Keyboard shortcut
await computer(action="key", text="ctrl+c")

# Scroll
await computer(action="scroll", coordinate=[640, 360], scroll_direction="down", scroll_amount=3)

OpenAIComputerTool

For OpenAI and Operator. Uses computer_use_preview native API.

from hud.tools import OpenAIComputerTool

computer = OpenAIComputerTool(
    display_width=1920,
    display_height=1080,
    rescale_images=False,
)

Actions: screenshot, click, double_click, scroll, type, wait, move, keypress, drag

# Click
await computer(type="click", x=500, y=300, button="left")

# Type
await computer(type="type", text="Hello!")

# Key press
await computer(type="keypress", keys=["ctrl", "v"])

# Scroll
await computer(type="scroll", x=500, y=300, scroll_x=0, scroll_y=-100)

# Drag
await computer(type="drag", path=[{"x": 100, "y": 100}, {"x": 300, "y": 300}])

HudComputerTool

Generic computer tool for any agent via function calling. Use when you need provider-agnostic control.

from hud.tools import HudComputerTool

computer = HudComputerTool(
    platform_type="auto",  # "auto", "xdo", or "pyautogui"
    width=1280,
    height=720,
)

Actions: screenshot, click, write, press, scroll, drag, move, wait

await computer(action="screenshot")
await computer(action="click", x=100, y=200)
await computer(action="write", text="Hello", enter_after=True)
await computer(action="press", keys=["ctrl", "c"])

Executors

Computer tools use executors for the actual system interaction:

Executor	Platform	Notes
`PyAutoGUIExecutor`	Cross-platform	Default on Windows/macOS
`XDOExecutor`	Linux/X11	Faster, uses xdotool

from hud.tools import HudComputerTool
from hud.tools.executors import XDOExecutor

executor = XDOExecutor(display_num=1)
computer = HudComputerTool(executor=executor)

Coordinate Scaling

Agent coordinates don’t have to match actual display resolution. The tool handles scaling:

# Agent works in 1280×720
computer = AnthropicComputerTool(display_width=1280, display_height=720)

# Actual display is 2560×1440
# Agent sends (640, 360) → Executor receives (1280, 720)

rescale_images=True resizes screenshots to agent dimensions. rescale_images=False keeps full resolution (coordinates still scale).

Customizing

Log all actions with hooks:

from hud.tools import AnthropicComputerTool

computer = AnthropicComputerTool()

@computer.after
async def log_action(action: str = "", result=None, **kwargs):
    print(f"Computer action: {action}")

env.add_tool(computer)

Or subclass for deeper control:

from typing import Any
from mcp.types import ContentBlock
from hud.tools import AnthropicComputerTool
from hud.tools.types import ToolError

class SafeComputerTool(AnthropicComputerTool):
    BLOCKED_ACTIONS = ["key"]  # Block keyboard shortcuts
    
    async def __call__(self, action: str, **kwargs: Any) -> list[ContentBlock]:
        if action in self.BLOCKED_ACTIONS:
            raise ToolError(f"Action '{action}' not allowed")
        return await super().__call__(action, **kwargs)

→ Grounding Tools — Resolve element descriptions to coordinates

Get Started

Essentials

Guides

Cookbooks

Advanced

Tools

SDK Reference

CLI Reference

Community

Computer Tools

Quick Reference

AnthropicComputerTool

OpenAIComputerTool

HudComputerTool

Executors

Coordinate Scaling

Customizing

Get Started

Essentials

Guides

Cookbooks

Advanced

Tools

SDK Reference

CLI Reference

Community

​Quick Reference

​AnthropicComputerTool

​OpenAIComputerTool

​HudComputerTool

​Executors

​Coordinate Scaling

​Customizing

Quick Reference

AnthropicComputerTool

OpenAIComputerTool

HudComputerTool

Executors

Coordinate Scaling

Customizing