The HUD SDK provides a base MCPAgent class and several pre-built agent implementations for interacting with MCP environments.

Creating Agents

Use the create() factory method to instantiate agents with typed parameters:
from hud.agents import ClaudeAgent

agent = ClaudeAgent.create(
    checkpoint_name="claude-sonnet-4-5",
    max_tokens=8192,
    verbose=True,
)

result = await agent.run(task, max_steps=20)  # task: a Task/LegacyTask, dict, or prompt string (see Usage Examples)
Direct constructor calls with kwargs are deprecated. Use Agent.create() instead.
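For example (a sketch; the deprecated form is shown only for contrast and its kwargs are illustrative):
# Deprecated: direct constructor call with kwargs
agent = ClaudeAgent(checkpoint_name="claude-sonnet-4-5")

# Preferred: the typed factory
agent = ClaudeAgent.create(checkpoint_name="claude-sonnet-4-5")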

Base Class

MCPAgent

from hud.agents import MCPAgent
Abstract base class for all MCP-enabled agents. Handles the agent loop, MCP client lifecycle, tool discovery/filtering, and telemetry.
Create Parameters (shared by all agents):
| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| mcp_client | AgentMCPClient | MCP client for server connections | None |
| auto_trace | bool | Enable automatic tracing spans | True |
| auto_respond | bool | Use ResponseAgent to decide when to stop/continue | False |
| verbose | bool | Verbose console logs for development | False |
Base Config (shared by all agents):
| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| allowed_tools | list[str] | Tool patterns to expose to the model | None (all) |
| disallowed_tools | list[str] | Tool patterns to hide from the model | None |
| system_prompt | str | Custom system prompt | None |
| append_setup_output | bool | Include setup output in first turn | True |
| initial_screenshot | bool | Include screenshot in initial context | True |
| response_tool_name | str | Lifecycle tool for submitting responses | None |
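These shared options can be combined with any agent's create() call; a minimal sketch (assuming create() accepts the config fields above alongside the agent-specific parameters; the tool pattern and prompt text are illustrative):
from hud.agents import ClaudeAgent

agent = ClaudeAgent.create(
    allowed_tools=["anthropic_computer"],  # illustrative pattern: expose only this tool to the model
    system_prompt="You are a careful browser-automation agent.",
    initial_screenshot=True,
    verbose=True,
)
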
Key Methods:
@classmethod
def create(**kwargs) -> MCPAgent
    """Factory method to create an agent with typed parameters."""

async def run(prompt_or_task: str | Task | dict, max_steps: int = 10) -> Trace
    """Run agent with prompt or task. Returns Trace with results."""

async def call_tools(tool_call: MCPToolCall | list[MCPToolCall]) -> list[MCPToolResult]
    """Execute tool calls through MCP client."""

def get_available_tools() -> list[types.Tool]
    """Get filtered list of available tools."""

Pre-built Agents

ClaudeAgent

from hud.agents import ClaudeAgent
Claude-specific implementation using Anthropic’s API.
Config Parameters:
| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| checkpoint_name | str | Claude model to use | "claude-sonnet-4-5" |
| model_client | AsyncAnthropic | Anthropic client | Auto-created |
| max_tokens | int | Maximum response tokens | 16384 |
| use_computer_beta | bool | Enable computer-use beta features | True |
| validate_api_key | bool | Validate key on init | True |
Example:
from hud.agents import ClaudeAgent
from hud.datasets import LegacyTask

agent = ClaudeAgent.create(
    checkpoint_name="claude-sonnet-4-5",
    max_tokens=8192,
)

result = await agent.run(
    LegacyTask(
        prompt="Navigate to example.com",
        mcp_config={
            "hud": {
                "url": "https://mcp.hud.ai/v3/mcp",
                "headers": {
                    "Authorization": "Bearer ${HUD_API_KEY}",
                    "Mcp-Image": "hudpython/hud-remote-browser:latest"
                }
            }
        },
    )
)

OpenAIAgent

from hud.agents import OpenAIAgent
OpenAI agent using the Responses API for function calling.
Config Parameters:
| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| checkpoint_name | str | Model to use | "gpt-5.1" |
| model_client | AsyncOpenAI | OpenAI client | Auto-created |
| max_output_tokens | int | Maximum response tokens | None |
| temperature | float | Sampling temperature | None |
| reasoning | Reasoning | Reasoning configuration | None |
| tool_choice | ToolChoice | Tool selection strategy | None |
| parallel_tool_calls | bool | Enable parallel tool execution | None |
| validate_api_key | bool | Validate key on init | True |
Example:
agent = OpenAIAgent.create(
    checkpoint_name="gpt-4o",
    max_output_tokens=2048,
    temperature=0.7,
)

OperatorAgent

from hud.agents import OperatorAgent
OpenAI Operator-style agent with computer-use capabilities. Extends OpenAIAgent.
Config Parameters:
| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| checkpoint_name | str | Model to use | "computer-use-preview" |
| environment | Literal["windows", "mac", "linux", "browser"] | Computer environment | "linux" |
Inherits all OpenAIAgent parameters.
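Example (a minimal sketch; the values shown are illustrative):
from hud.agents import OperatorAgent

agent = OperatorAgent.create(
    checkpoint_name="computer-use-preview",
    environment="browser",   # one of "windows", "mac", "linux", "browser"
    max_output_tokens=2048,  # inherited from OpenAIAgent
)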

GeminiAgent

from hud.agents import GeminiAgent
Google Gemini agent with native computer-use capabilities.
Config Parameters:
| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| checkpoint_name | str | Gemini model to use | "gemini-2.5-computer-use-preview-10-2025" |
| model_client | genai.Client | Gemini client | Auto-created |
| temperature | float | Sampling temperature | 1.0 |
| top_p | float | Top-p sampling | 0.95 |
| top_k | int | Top-k sampling | 40 |
| max_output_tokens | int | Maximum response tokens | 8192 |
| excluded_predefined_functions | list[str] | Predefined functions to exclude | [] |
| validate_api_key | bool | Validate key on init | True |
Example:
agent = GeminiAgent.create(
    checkpoint_name="gemini-2.5-computer-use-preview-10-2025",
    temperature=0.7,
    max_output_tokens=4096,
)

OpenAIChatAgent

from hud.agents import OpenAIChatAgent
OpenAI-compatible chat.completions agent. Works with any endpoint implementing the OpenAI schema (vLLM, Ollama, Together, etc.).
Config Parameters:
| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| checkpoint_name | str | Model name | "gpt-5-mini" |
| openai_client | AsyncOpenAI | OpenAI-compatible client | None |
| api_key | str | API key (if not using client) | None |
| base_url | str | Base URL (if not using client) | None |
| completion_kwargs | dict | Extra args for completions | {} |
Example:
from hud.agents import OpenAIChatAgent

# Using base_url and api_key
agent = OpenAIChatAgent.create(
    base_url="http://localhost:11434/v1",  # Ollama
    api_key="not-needed",
    checkpoint_name="llama3.1",
    completion_kwargs={"temperature": 0.2},
)

# Or with a custom client
from openai import AsyncOpenAI

agent = OpenAIChatAgent.create(
    openai_client=AsyncOpenAI(base_url="http://localhost:8000/v1"),
    checkpoint_name="served-model",
)

Usage Examples

Basic Task Execution

from hud.agents import ClaudeAgent
from hud.datasets import LegacyTask

agent = ClaudeAgent.create()

result = await agent.run(
    LegacyTask(
        prompt="Click the submit button",
        mcp_config={
            "hud": {
                "url": "https://mcp.hud.ai/v3/mcp",
                "headers": {
                    "Authorization": "Bearer ${HUD_API_KEY}",
                    "Mcp-Image": "hudpython/hud-remote-browser:latest"
                }
            }
        }
    ),
    max_steps=5,
)
print(f"Reward: {result.reward}, Done: {result.done}")

With Setup and Evaluation

task = LegacyTask(
    prompt="Find the price of the product",
    mcp_config={
        "hud": {
            "url": "https://mcp.hud.ai/v3/mcp",
            "headers": {
                "Authorization": "Bearer ${HUD_API_KEY}",
                "Mcp-Image": "hudpython/hud-remote-browser:latest"
            }
        }
    },
    setup_tool={
        "name": "setup",
        "arguments": {"url": "https://example.com"}
    },
    evaluate_tool={
        "name": "evaluate", 
        "arguments": {"check": "price_found"}
    }
)

agent = OperatorAgent.create()
result = await agent.run(task, max_steps=20)

Auto-Respond Mode

When auto_respond=True, the agent uses a ResponseAgent to decide whether to continue or stop after each model response:
agent = ClaudeAgent.create(
    auto_respond=True,  # Uses HUD inference gateway
    verbose=True,
)
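
Running the agent itself is unchanged; the ResponseAgent only steps in after each model response to decide whether another turn is needed:
result = await agent.run(task, max_steps=20)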

See Also