Blog - CommonLID: Re-evaluating State-of-the-Art Language Identification Performance on Web Data
We are excited to announce the release of CommonLID, a language identification benchmark for the web, covering 109 languages. CommonLID was developed in collaboration with multiple open-source organizations and language community groups.