Habituation at the Gate: Rising Approval and Declining Scrutiny in Human Review of AI Agent Code
As AI coding agents (e.g., GitHub Copilot, Devin, OpenAI Codex, Cursor) submit pull requests to open-source repositories at scale, a key question arises: do human reviewers lower their scrutiny of AI-generated code over time? We conduct a longitudinal within-reviewer analysis using the AIDev dataset, studying 400 repeat reviewers who collectively submitted 11,429 reviews over a seven-month observation period. Comparing each reviewer's...