Title: ELYADATA & LIA at NADI 2025: ASR and ADI Subtasks
Abstract: This paper describes Elyadata & LIA’s joint submission to the NADI 2025 multi-dialectal Arabic speech processing shared task. We participated in the Spoken Arabic Dialect Identification (ADI) and multi-dialectal Arabic ASR subtasks. Among all participants, our submission ranked first in the ADI subtask and second in the multi-dialectal Arabic ASR subtask. Our ADI system is a fine-tuned Whisper-large-v3 encoder trained with data augmentation. It obtained the highest ADI accuracy, 79.83%, on the official test set. For multi-dialectal Arabic ASR, we fine-tuned SeamlessM4T-v2 Large (Egyptian variant) separately for each of the eight considered dialects. Overall, we obtained an average WER of 38.54% and an average CER of 14.53% on the test set. Our results demonstrate the effectiveness of large pre-trained speech models with targeted fine-tuning for Arabic speech processing.
Comments: Published in Proceedings of the ArabicNLP 2025 Workshop (co-located with EMNLP 2025), Association for Computational Linguistics, 2025
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2511.10090 [cs.CL] (or arXiv:2511.10090v1 [cs.CL] for this version)
DOI: https://doi.org/10.48550/arXiv.2511.10090 (arXiv-issued DOI via DataCite, pending registration)
Related DOI: https://doi.org/10.18653/v1/2025.arabicnlp-sharedtasks.105
Submission history
From: Haroun Elleuch [v1] Thu, 13 Nov 2025 08:44:39 UTC (28 KB)