MammoClean: Toward Reproducible and Bias-Aware AI in Mammography through Dataset Harmonization
arxiv.org·12h

Abstract: The development of clinically reliable artificial intelligence (AI) systems for mammography is hindered by profound heterogeneity in data quality, metadata standards, and population distributions across public datasets. This heterogeneity introduces dataset-specific biases that severely compromise model generalizability, a fundamental barrier to clinical deployment. We present MammoClean, a public framework for standardization and bias quantification in mammography datasets. MammoClean standardizes case selection and image processing (including laterality and intensity correction) and unifies metadata into a consistent multi-view structure. We provide a com…
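To make the phrase "laterality and intensity correction" concrete, the following is a minimal sketch of what such a step could look like. It is not MammoClean's implementation, which the abstract does not detail. The function names `orient_breast_left` and `rescale_intensity`, the half-image intensity heuristic, and the min-max rescaling are all assumptions chosen for illustration.

```python
import numpy as np


def orient_breast_left(image: np.ndarray) -> np.ndarray:
    """Flip horizontally when most of the signal lies in the right half.

    Hypothetical heuristic: compares summed intensity of the two halves,
    so every returned image has the breast on the left side.
    """
    mid = image.shape[1] // 2
    left_mass = image[:, :mid].sum()
    right_mass = image[:, mid:].sum()
    return image[:, ::-1] if right_mass > left_mass else image


def rescale_intensity(image: np.ndarray) -> np.ndarray:
    """Min-max rescale to [0, 1]; a constant image maps to all zeros."""
    img = image.astype(np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros_like(img)
    return (img - lo) / (hi - lo)


# Toy 4x6 "scan" with the bright region on the right edge.
img = np.zeros((4, 6), dtype=np.uint16)
img[:, 4:] = 1000
out = rescale_intensity(orient_breast_left(img))
```

After both steps, the bright region sits in the leftmost columns and all values lie in [0, 1]. A real pipeline would more likely read orientation and pixel encoding from the image metadata than infer them from pixel sums.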
