A curated, non-BS library of the best resources for evaluating agents
A curated, non-BS library of the best resources for building and evaluating AI agents: papers, blogs, talks, tools, and benchmarks. Maintained by BenchFlow (benchflow-ai/awesome-evals).