Benchmarks in Microsoft Foundry (preview): Standardized model and agent quality checks
Introduction

Benchmarks in Microsoft Foundry (preview) make that kind of measurement a first-class part of the development workflow. You can run well-known open-source benchmarks against any model deployment or agent in your project, compare runs side by side in the evaluation group view, and drive the whole flow from the portal or the REST API.

Figure 1. Benchmarks appear in the Microsoft Foundry Evaluations list alongside your evaluations.

How is this different from the model leaderboard?

M...