How AI Fails: An Interactive Pedagogical Tool for Demonstrating Dialectal Bias in Automated Toxicity Models

Computer Science > Computation and Language

arXiv:2511.06676 (cs)


Abstract: Now that AI-driven moderation has become pervasive in everyday life, we often hear claims that “the AI is biased”. While this is often said jokingly, the light-hearted remark reflects a deeper concern. How can we be certain that an online post flagged as “inappropriate” was not simply the victim of a biased algorithm? This paper investigates this problem using a dual approach. First, I conduct a quantitative benchmark of a widely used toxicity model (unitary/toxic-bert) to measure performance disparity between text in African-American English (AAE) and Standard American English (SAE). The benchmark reveals a clear, systematic bias: on average, th…
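The abstract is cut off before it states the exact metric, so the following is only a minimal sketch of what a group-level comparison of this kind could look like, not the paper's method. It assumes the disparity is summarized as a gap between mean toxicity scores for two sets of texts. The function name `mean_score_gap` and the stand-in scorer `toy_score` are hypothetical and exist purely for illustration.

```python
from statistics import mean
from typing import Callable, Dict, Sequence


def mean_score_gap(
    score: Callable[[str], float],
    aae_texts: Sequence[str],
    sae_texts: Sequence[str],
) -> Dict[str, float]:
    """Compare average toxicity scores between two groups of texts.

    `score` is any function mapping a text to a toxicity score in [0, 1].
    Assumption: the mean-score gap is the disparity measure of interest.
    """
    aae_mean = mean(score(t) for t in aae_texts)
    sae_mean = mean(score(t) for t in sae_texts)
    return {
        "aae_mean": aae_mean,
        "sae_mean": sae_mean,
        "difference": aae_mean - sae_mean,
        # Guard against division by zero when the reference group scores 0.
        "ratio": aae_mean / sae_mean if sae_mean > 0 else float("inf"),
    }


def toy_score(text: str) -> float:
    """Hypothetical stand-in scorer, NOT a real toxicity model.

    Returns the fraction of words that appear in a tiny placeholder list.
    """
    flagged = {"bad", "awful"}
    words = text.lower().split()
    return sum(w in flagged for w in words) / len(words) if words else 0.0


if __name__ == "__main__":
    # Placeholder inputs only; these are not dialect samples or real data.
    group_a = ["bad day", "awful bad"]
    group_b = ["bad day today here"]
    print(mean_score_gap(toy_score, group_a, group_b))
```

To run this against the model named in the abstract, `toy_score` could be replaced by a wrapper around a Hugging Face `transformers` text-classification pipeline loaded with `unitary/toxic-bert`. That wiring is an assumption about tooling and is not described in the text above.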
