Kareem-Rashed/rubric-eval: Independent framework to test, benchmark, and evaluate LLMs & AI agents locally.