Show HN: Evaluating Local LLMs as language translators for my app
How good are local LLMs at translation, and do you actually need the cloud? A reproducible benchmark of 24 on-device, self-hosted, and cloud models translating into English, with the low-resource case (Afrikaans) front and centre. The headline: on Afrikaans→English, a local 18 GB model lands in a statistical tie with frontier cloud models. Every model gets the same blinded Tatoeba sentences, the same prompt, and greedy decoding, and is scored multi-reference with COMET (meaning) and chrF++ (surface). Built to pick a translation model f...
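To make "statistical tie" concrete, here is a minimal sketch of one common way to check it: a paired bootstrap over per-sentence scores. This is an assumption on my part, not necessarily the test the benchmark uses; the summary does not say. The function names, the 95% interval, and the resample count are all hypothetical choices for illustration, and the scores would be whatever per-sentence metric (COMET, chrF++) you already have for two models on the same sentences.

```python
import random


def paired_bootstrap(scores_a, scores_b, n_resamples=2000, seed=0):
    """Return (mean_diff, lo, hi) for the per-sentence difference a - b.

    lo and hi bound an approximate 95% percentile bootstrap interval.
    Both lists must score the same sentences in the same order.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")

    rng = random.Random(seed)  # fixed seed so the result is reproducible
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)

    # Resample sentences with replacement and record the mean difference.
    means = []
    for _ in range(n_resamples):
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()

    lo = means[int(0.025 * n_resamples)]
    hi = means[int(0.975 * n_resamples) - 1]
    return sum(diffs) / n, lo, hi


def is_statistical_tie(scores_a, scores_b, **kwargs):
    """Treat two systems as tied when the interval for a - b contains zero."""
    _, lo, hi = paired_bootstrap(scores_a, scores_b, **kwargs)
    return lo <= 0.0 <= hi
```

Called as `is_statistical_tie(local_scores, cloud_scores)`, this returns `True` when the resampled interval for the mean difference straddles zero, which is one reasonable reading of "tie". Pairing by sentence matters here: both models translate the same inputs, so comparing differences per sentence removes the variance that comes from some sentences simply being harder than others.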