HUD Gateway is an OpenAI-compatible inference service that provides a unified endpoint for accessing various LLM providers (Anthropic, OpenAI, Gemini, xAI, and more). It handles authentication, rate limiting, and credit management, allowing you to focus on building agents.

Quick Start

The gateway is available at https://inference.hud.ai. You can use it with any OpenAI-compatible client.

Using Python (OpenAI SDK)

import asyncio
import os

from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url="https://inference.hud.ai",
    api_key=os.environ["HUD_API_KEY"],
)

async def main() -> None:
    response = await client.chat.completions.create(
        model="claude-sonnet-4-5",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())

Using curl

curl -X POST https://inference.hud.ai/chat/completions \
  -H "Authorization: Bearer <HUD_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Supported Models

HUD Gateway supports models from major providers. For an up-to-date list, visit hud.ai/models.
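
Because the gateway is OpenAI-compatible, you can also try listing models programmatically. The sketch below assumes the gateway exposes the standard OpenAI GET /models endpoint under the same base URL, which this page does not confirm; hud.ai/models remains the authoritative list.

import asyncio
import os

from openai import AsyncOpenAI

async def list_models() -> None:
    client = AsyncOpenAI(
        base_url="https://inference.hud.ai",
        api_key=os.environ["HUD_API_KEY"],
    )
    # Assumption: the gateway implements the standard OpenAI /models route.
    models = await client.models.list()
    for model in models.data:
        print(model.id)

asyncio.run(list_models())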

Anthropic

Model              Routes
claude-sonnet-4-5  chat, messages
claude-haiku-4-5   chat, messages
claude-opus-4-5    chat, messages
claude-opus-4-1    chat, messages

OpenAI

Model        Routes
gpt-5.1      chat, responses
gpt-5-mini   chat, responses
gpt-4o       chat, responses
gpt-4o-mini  chat, responses
operator     responses

Google Gemini

Model                            Routes
gemini-3-pro-preview             chat
gemini-2.5-pro                   chat
gemini-2.5-computer-use-preview  gemini

xAI

Model          Routes
grok-4-1-fast  chat

Z-AI (via OpenRouter)

Model          Routes
z-ai/glm-4.5v  chat

Routes

Different models support different API routes (see the example after this list for calling a non-default route):
  • chat - OpenAI Chat Completions API (/chat/completions)
  • messages - Anthropic Messages API (/messages)
  • responses - OpenAI Responses API (/responses)
  • gemini - Google Gemini native API
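
For example, a model that exposes the messages route can, in principle, be called with the Anthropic SDK pointed at the gateway. This is a sketch under that assumption; the page above documents the /messages path but not SDK-level compatibility or which authentication headers the gateway accepts.

import asyncio
import os

from anthropic import AsyncAnthropic

# Assumption: the gateway serves the Anthropic Messages API at <base_url>/messages
# and accepts a HUD API key where the Anthropic SDK normally sends its own key.
client = AsyncAnthropic(
    base_url="https://inference.hud.ai",
    api_key=os.environ["HUD_API_KEY"],
)

async def ask() -> str:
    response = await client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello!"}],
    )
    return response.content[0].text

print(asyncio.run(ask()))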

Features

Unified Billing

When using HUD Gateway with your HUD API key, usage is automatically deducted from your HUD credits. This simplifies billing by consolidating multiple provider invoices into one.

Rate Limits

HUD Gateway automatically handles key rotation and rate limiting across our pool of enterprise keys.
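
No extra client-side configuration is required, but if you want to smooth over occasional 429s or transient errors during bursts, the OpenAI SDK's built-in retry and timeout settings are sufficient. This is a client-side sketch, not a gateway requirement:

import os

from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url="https://inference.hud.ai",
    api_key=os.environ["HUD_API_KEY"],
    max_retries=5,   # the SDK retries 429s and transient errors with exponential backoff
    timeout=60.0,    # per-request timeout in seconds
)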

Using with HUD Agents

You can use HUD Gateway with OpenAIChatAgent for any model that supports the chat route:
from hud.agents import OpenAIChatAgent
from hud.settings import settings

# Use any gateway model with OpenAIChatAgent
agent = OpenAIChatAgent.create(
    base_url=settings.hud_gateway_url,
    api_key=settings.api_key,
    checkpoint_name="grok-4-1-fast",  # or any chat-compatible model
)

result = await agent.run(task, max_steps=10)

Building Custom Agents with Tracing

For a complete example of building a custom agent that uses HUD Gateway with full tracing support, see the custom agent example. This example demonstrates:
  • Using the @instrument decorator to capture inference traces
  • Building a custom MCPAgent with HUD Gateway
  • Automatic token usage and latency tracking
View your traces on the HUD Dashboard.
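
A minimal sketch of the decorator pattern is shown below. The import path and decorator arguments are assumptions; the linked custom agent example is the authoritative reference.

import asyncio
import os

from openai import AsyncOpenAI
from hud import instrument  # assumption: the decorator is exported at the package root

client = AsyncOpenAI(
    base_url="https://inference.hud.ai",
    api_key=os.environ["HUD_API_KEY"],
)

@instrument  # records this call as a span on the active HUD trace (token usage, latency)
async def call_model(prompt: str) -> str:
    response = await client.chat.completions.create(
        model="claude-sonnet-4-5",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(asyncio.run(call_model("Hello!")))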