Kareem-Rashed/rubric-eval: Independent framework to test, benchmark, and evaluate LLMs & AI agents locally.
