Skip to main content
Version 0.4.73 - Latest stable release

What is HUD?

HUD connects AI agents to software environments using the Model Context Protocol (MCP). Whether you’re evaluating existing agents or building new environments, HUD provides the infrastructure.

Why HUD?

  • 🔌 MCP-native: Any agent can connect to any environment
  • 📡 Live telemetry: Debug every tool call at hud.ai
  • ⚡ HUD Gateway: Unified inference API for all LLMs
  • 🚀 Production-ready: From local Docker to cloud scale
  • 🎯 Built-in benchmarks: OSWorld-Verified, SheetBench-50, and more
  • 🔧 CLI tools: Create, develop, and run with hud init, hud dev, hud run, hud eval

Quick Example

import asyncio, os, hud
from hud.datasets import Task
from hud.agents import ClaudeAgent

async def main():
    # Define evaluation task with remote MCP
    task = Task(
        prompt="Win a game of 2048 by reaching the 128 tile",
        mcp_config={
            "hud": {
                "url": "https://mcp.hud.ai/v3/mcp",
                "headers": {
                    "Authorization": f"Bearer {os.getenv('HUD_API_KEY')}",
                    "Mcp-Image": "hudevals/hud-text-2048:0.1.3"
                }
            }
        },
        setup_tool={"name": "setup", "arguments": {"name": "board", "arguments": { "board_size": 4}}},
        evaluate_tool={"name": "evaluate", "arguments": {"name": "max_number", "arguments": {"target": 64}}}
    )
    
    # Run agent (auto-creates MCP client)
    agent = ClaudeAgent.create()
    result = await agent.run(task)
    print(f"Score: {result.reward}")

asyncio.run(main())

Community

Are you an enterprise building agents?

📅 Hop on a call or 📧 [email protected]