Authors: Peshawa J. Muhammad Ali (1 and 2), Navin Vincent (3), Saman S. Abdulla (4 and 5), Han N. Mohammed Fadhl (6), Anders Blilie (7 and 8), Kelvin Szolnoky (9), Julia Anna Mielcarz (3), Xiaoyi Ji (9), Nita Mulliqi (3), Abdulbasit K. Al-Talabani (1), Kimmo Kartasalo (3) ((1) Department of Software Engineering, Faculty of Engineering, Koya University, Koya 44023, Kurdistan Region - F.R. Iraq, (2) Department of Mechanical and Manufacturing Engineering, Faculty of Engineering, Koya University, Koya 44023, Kurdistan Region - F.R. Iraq, (3) Department of Medical Epidemiology and Biostatistics, SciLifeLab, Karolinska Institutet, Stockholm, Sweden, (4) College of Dentistry, Hawler Medical University, Erbil, Kurdistan Region, Iraq, (5) PAR Private Hospital, Erbil, Kurdistan Region, Iraq, (6) College of Dentistry, University of Sulaimani, Sulaymaniyah, Kurdistan Region, Iraq, (7) Department of Pathology, Stavanger University Hospital, Stavanger, Norway, (8) Faculty of Health Sciences, University of Stavanger, Stavanger, Norway, (9) Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden)
Abstract: Background: Artificial intelligence (AI) is improving the efficiency and accuracy of cancer diagnostics. The performance of pathology AI systems has been evaluated almost exclusively on European and US cohorts from large centers. For global AI adoption in pathology, validation studies are needed on currently under-represented populations, where the potential gains from AI support may also be greatest. We present the first study with an external validation cohort from the Middle East, focusing on AI-based diagnosis and Gleason grading of prostate cancer. Methods: We collected and digitised 339 prostate biopsy specimens from the Kurdistan region, Iraq, representing a consecutive series of 185 patients spanning the period 2013-2024. We evaluated a task-specific end-to-end AI model and two foundation models in terms of their concordance with pathologists and their consistency across samples digitised on three scanner models (Hamamatsu, Leica, and Grundium). Findings: Grading concordance between AI and pathologists was similar to pathologist-pathologist concordance (Cohen's quadratically weighted kappa 0.801 vs. 0.799, p=0.9824). Cross-scanner concordance was high (quadratically weighted kappa > 0.90) for all AI models and scanner pairs, including a low-cost compact scanner. Interpretation: AI models demonstrated pathologist-level performance in prostate histopathology assessment. Compact scanners can provide a route for validation studies in non-digitalised settings and enable cost-effective adoption of AI in laboratories with limited sample volumes. This first openly available digital pathology dataset from the Middle East supports further research into globally equitable AI pathology. Funding: SciLifeLab and Wallenberg Data Driven Life Science Program, Instrumentarium Science Foundation, Karolinska Institutet Research Foundation.
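The concordance figures in the abstract are Cohen's quadratically weighted kappa values. As a rough illustration of how such a statistic is computed for ordinal grades, here is a minimal pure-Python sketch. The function name `quadratic_weighted_kappa`, the integer label encoding `0..n_categories-1`, and the toy ratings are assumptions made for this example; they are not taken from the study's code or data.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, n_categories):
    """Cohen's kappa with quadratic weights for ordinal labels.

    Labels are assumed to be integers in 0..n_categories-1
    (a hypothetical encoding chosen for this sketch).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if n_categories < 2:
        raise ValueError("need at least two categories")

    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts per (i, j)
    marginal_a = Counter(rater_a)              # per-category counts, rater A
    marginal_b = Counter(rater_b)              # per-category counts, rater B
    max_sq_dist = (n_categories - 1) ** 2

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            # Quadratic penalty: 0 on the diagonal, 1 at maximum distance.
            weight = (i - j) ** 2 / max_sq_dist
            observed_disagreement += weight * observed[(i, j)] / n
            # Expected proportion under independence of the two raters.
            expected_disagreement += (
                weight * marginal_a[i] * marginal_b[j] / (n * n)
            )

    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - observed_disagreement / expected_disagreement


# Toy ratings, purely illustrative.
identical = quadratic_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], 4)
reversed_order = quadratic_weighted_kappa([0, 1, 2], [2, 1, 0], 3)
```

Because of the quadratic weights, a disagreement between adjacent grades is penalised far less than one between distant grades, which suits ordinal scales such as cancer grading.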
| Comments: | 40 pages, 8 figures, 11 tables |
| Subjects: | Computer Vision and Pattern Recognition (cs.CV) |
| Cite as: | arXiv:2512.17499 [cs.CV] (or arXiv:2512.17499v1 [cs.CV] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2512.17499 (arXiv-issued DOI via DataCite, pending registration) |
Submission history
From: Kimmo Kartasalo
[v1] Fri, 19 Dec 2025 12:08:28 UTC (2,195 KB)