BAID: A Benchmark for Bias Assessment of AI Detectors

Abstract: AI-generated text detectors have recently gained adoption in educational and professional contexts. Prior research has uncovered isolated cases of bias, particularly against English Language Learners (ELLs); however, there is a lack of systematic evaluation of such systems across broader sociolinguistic factors. In this work, we propose BAID, a comprehensive evaluation framework for AI detectors across various types of biases. As part of the framework, we introduce over 200k samples spanning 7 major categories: demographics, age, educational grade level, dialect, formality, political leaning, and topic. We also generated synthetic versions of each sample with carefully crafted prompts to preserve the original content while reflecting subgroup-specific writing styles. Using this benchmark, we evaluate four open-source state-of-the-art AI text detectors and find consistent disparities in detection performance, particularly low recall rates for texts from underrepresented groups. Our contributions provide a scalable, transparent approach for auditing AI detectors and emphasize the need for bias-aware evaluation before these tools are deployed for public use.
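The kind of disparity the abstract refers to can be illustrated with a per-subgroup recall comparison. The following is a minimal sketch, not the authors' evaluation code: the record layout, the function names `recall_by_subgroup` and `recall_gap`, and the choice of AI-generated text as the positive class are all assumptions made for illustration.

```python
from collections import defaultdict


def recall_by_subgroup(records):
    """Compute detector recall separately for each subgroup.

    `records` is a hypothetical iterable of
    (subgroup, is_ai_generated, predicted_ai) tuples.
    Assumption: the positive class is AI-generated text.
    """
    hits = defaultdict(int)    # AI-generated samples correctly flagged
    totals = defaultdict(int)  # all AI-generated samples seen
    for subgroup, is_ai_generated, predicted_ai in records:
        if is_ai_generated:
            totals[subgroup] += 1
            if predicted_ai:
                hits[subgroup] += 1
    return {group: hits[group] / totals[group] for group in totals}


def recall_gap(recalls):
    """Difference between the best and worst subgroup recall."""
    return max(recalls.values()) - min(recalls.values())


# Toy data, invented purely to exercise the functions.
toy_records = [
    ("group_a", True, True),
    ("group_a", True, True),
    ("group_b", True, True),
    ("group_b", True, False),
    ("group_b", False, True),  # human-written sample, ignored for recall
]
toy_recalls = recall_by_subgroup(toy_records)
print(toy_recalls)
print(recall_gap(toy_recalls))
```

A large gap between subgroups on otherwise comparable content would be one signal of the uneven detection performance described above; the metrics and thresholds actually used in the paper may differ.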


Comments:	Accepted at the workshop on Agentic AI Benchmarks and Applications for Enterprise Tasks at AAAI 2026
Subjects:	Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2512.11505 [cs.AI]
	(or arXiv:2512.11505v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2512.11505 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Priyam Basu
[v1] Fri, 12 Dec 2025 12:01:42 UTC (29 KB)
