NVIDIA Blackwell Leads AgentPerf, the First Agentic-AI Infra Benchmark: Trajectory-Replay Benchmarking (opens in new tab)
What: The AgentPerf benchmark from Artificial Analysis is the first test built for agentic-AI infrastructure: instead of timing one chat completion, it replays recorded multi-step agent trajectories to see how a serving system holds up under real agent load. Why: Agents don't send one prompt — they run long chains of model calls and tool executions, so a serving system's real job is sustaining many such runs at once. AgentPerf measures exactly that: concurrent agents held above a speed limit,...
Read the original article