ULTRASOUND Announced
bluesnews.com·6h
Expanding the WMT24++ Benchmark with Rumantsch Grischun, Sursilvan, Sutsilvan, Surmiran, Puter, and Vallader
arxiv.org·2h
Claude Code vs Developer Skills: How Humans Still Win (And Ship)
pub.towardsai.net·1d
SinhalaMMLU: A Comprehensive Benchmark for Evaluating Multitask Language Understanding in Sinhala
arxiv.org·2h