HUD Documentation — Evaluations and RL Environments.

Most tasks yield a single text prompt. A chat-style task yields a list of messages instead, so the agent works against a multi-turn conversation. The Chat runner drives that conversation turn by turn and keeps the history for you.

Prerequisites

An environment and a task (see Tasks).
An agent to drive the turns (see Run on any model).

A chat-style task

A task’s prompt can be plain text or a list of PromptMessages. To accept a running conversation, take a messages parameter and yield it as the prompt:

tasks.py

from hud import Environment
from mcp.types import PromptMessage

env = Environment(name="assistant")

@env.template()
async def assistant(messages: list[PromptMessage]):
    answer = yield messages          # the conversation so far is the prompt
    yield 1.0 if answer else 0.0     # grade the final turn however you like

run.prompt becomes the message list, and agents consume it as normalized turns through run.prompt_messages.

Driving it with `Chat`

Chat wraps a concrete Task plus an Agent. Each send() appends the user message, runs the agent over a fresh run with the full history, appends the reply, and returns the Trace:

chat.py

import asyncio
from hud import Chat
from hud.agents import create_agent
from tasks import assistant

async def main():
    chat = Chat(assistant(messages=[]), create_agent("claude-sonnet-4-5"))
    r1 = await chat.send("Book me a flight")
    r2 = await chat.send("SFO to JFK")
    print(r2.content)            # the assistant's latest reply

asyncio.run(main())

Chat is imported from hud.eval (also re-exported as hud.Chat). The task’s messages argument is replaced with the running conversation on every send; pass runtime= to place each turn’s rollout (with no runtime it serves the task’s source locally when minted in-process, else uses the HUD runtime tunnel by the task’s env name).

Managing history

The conversation history is the public chat.messages list — persist it, restore it, or reset it directly:

Operation	Description
`await chat.send(message)`	Send a user turn; returns the reply `Trace`.
`chat.messages`	The history (`{"role", "content"}` dicts) — `json.dumps` it to persist, assign to restore, clear to reset.

Serving a chat

Chat is protocol-agnostic: any frontend — a web handler, a notebook, a wire protocol — just calls await chat.send(...). For example, behind FastAPI:

app = FastAPI()
chat = Chat(assistant(messages=[]), create_agent("claude-sonnet-4-5"))

@app.post("/api/chat")
async def chat_endpoint(message: str):
    result = await chat.send(message)
    return {"response": result.content}

For an A2A endpoint (sessions per context, agent card, citations transport), see the reference server in cookbooks/a2a-chat/server.py — copy and adapt it; the protocol adapter is deliberately not part of the SDK.

When to use chat vs. a single-turn task

Single-turn task — the default. One prompt, one graded answer. Use it for evals and training (see Tasks).
Chat task — when the interaction itself is the thing: assistants, tool-use dialogues, or anything where the agent needs prior turns. The grading model is the same — you still yield a reward.

Chat

Prerequisites

A chat-style task

Driving it with `Chat`

Managing history

Serving a chat

When to use chat vs. a single-turn task

See also

Tasks & Tasksets

Run on any model

Integrations

Types: Trace

​Prerequisites

​A chat-style task

​Driving it with Chat

​Managing history

​Serving a chat

​When to use chat vs. a single-turn task

​See also

Tasks & Tasksets

Run on any model

Integrations

Types: Trace

Prerequisites

A chat-style task

Driving it with `Chat`

Managing history

Serving a chat

When to use chat vs. a single-turn task

See also