Microsoft has open-sourced an AI evaluation framework that converts natural-language requirements into executable tests, expanding its push into enterprise AI governance as organizations struggle to systematically validate agent behavior before production deployment. The framework, called ASSERT (Adaptive Spec-driven Scoring for Evaluation and Regression Testing), generates evaluation scenarios, datasets, metrics, and scorecards from written specifications, product requirements, and governan...
