hud_api_key) and an entity arg (trace_id for per-trace, task_id for per-task). The platform fills these at runtime — you configure the analysis prompt and attach the agent to a taskset column.
Standard QA Agents
Four pre-built QA agents are available out of the box. These appear under the Standard QA Workflows section on the Agents page and can be attached to any taskset with one click.

| Agent | What it detects | Output |
|---|---|---|
| False Negative | Agent succeeded but the grader scored it as a failure | is_false_negative, reasoning, confidence |
| False Positive | Agent got credit without genuinely solving the task | is_false_positive, reasoning, confidence |
| Failure Analysis | Root cause classification (10 categories) | failure_category, root_cause, failed_criteria |
| Reward Hacking | Agent gamed the evaluation mechanism | is_reward_hacking, hacking_strategy, severity |
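
When post-processing results, the output fields in the table can be modeled as typed records. The following is a minimal sketch, not the platform's schema: the field names come from the table, but the Python types, the 0 to 1 confidence range, and the `needs_human_review` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FalseNegativeResult:
    is_false_negative: bool
    reasoning: str
    confidence: float  # assumed to lie in [0.0, 1.0]


@dataclass
class FalsePositiveResult:
    is_false_positive: bool
    reasoning: str
    confidence: float  # assumed to lie in [0.0, 1.0]


@dataclass
class FailureAnalysisResult:
    failure_category: str  # one of the 10 categories (names not listed here)
    root_cause: str
    failed_criteria: List[str] = field(default_factory=list)  # assumed list of strings


@dataclass
class RewardHackingResult:
    is_reward_hacking: bool
    hacking_strategy: str
    severity: str  # assumed to be a free-form label


def needs_human_review(result: FalsePositiveResult, threshold: float = 0.8) -> bool:
    """Hypothetical helper: flag positive verdicts whose confidence is below threshold."""
    return result.is_false_positive and result.confidence < threshold
```

A low-confidence verdict such as `FalsePositiveResult(True, "grader matched a substring", 0.55)` would be flagged by `needs_human_review`, while the same verdict at confidence 0.9 would not.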
How to Use
From the Task Detail Panel
The primary way to work with QA agents is through the task detail panel. Click any task row in a taskset to open the slide-out panel, then navigate to the Traces tab:
- At the top of the Traces tab, a toolbar shows all attached QA agents as compact pills alongside an Add QA Agent button
- Click Add QA Agent to attach a new agent — pick a recommended agent or one you’ve created
- Each agent pill has a play button that opens a popover with two options:
  - Run for this task: analyze only the traces on the current task
  - Run for all tasks (N): analyze traces across the entire taskset
- Results appear inline below each trace, showing the agent name, verdict, and reasoning
- Agents that have been added but haven’t run yet still appear below traces with a Run button
- Analysis states (queued, analyzing) update live — no need to refresh
From the Agents Page
- Go to the Agents page
- Under Standard QA Workflows, click a recommended agent to view it
- Click Add as Column to attach it to any taskset
- Once the agent is attached, every completed trace in that taskset is analyzed automatically
- To create your own, click New Agent → QA Workflow, select a scenario, configure the analysis prompt, and choose a model. It appears under Your QA Workflows.
Building Your Own
A QA agent is just a scenario with trace_id + hud_api_key arguments. Use prepare_qa_context from trace-explorer for the common setup.
The ground_truth parameter lets you build eval datasets for the agent itself.
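
One way to use ground-truth labels is to score the agent's verdicts against them offline. The sketch below is an assumption about how such an eval could be scored, not part of the HUD SDK: `score_verdicts` and its input shape are hypothetical. It takes plain (predicted, ground_truth) boolean pairs, for example an agent's is_false_positive verdict next to a human label.

```python
from typing import Dict, Iterable, Tuple


def score_verdicts(pairs: Iterable[Tuple[bool, bool]]) -> Dict[str, float]:
    """Score boolean verdicts against labels.

    Each pair is (predicted, ground_truth). Hypothetical helper for illustration.
    """
    tp = fp = fn = tn = 0
    for predicted, truth in pairs:
        if predicted and truth:
            tp += 1  # agent flagged it, label agrees
        elif predicted and not truth:
            fp += 1  # agent flagged it, label disagrees
        elif truth:
            fn += 1  # agent missed a labeled case
        else:
            tn += 1  # both say "no issue"

    total = tp + fp + fn + tn
    return {
        # Ratios fall back to 0.0 when their denominator is zero.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "accuracy": (tp + tn) / total if total else 0.0,
    }


if __name__ == "__main__":
    labeled = [(True, True), (True, False), (False, True), (False, False)]
    print(score_verdicts(labeled))
```

Precision shows how often a flagged trace was a real problem, and recall shows how many labeled problems the agent caught. Tracking both helps when tuning the analysis prompt, since a prompt that flags everything scores high recall but low precision.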
See Also
- Automations — Run scenarios repeatably with pre-filled arguments
- Chat Agents — Multi-turn conversational agents
- Source Code — Fork and customize