Opus 4.8 barely moved the leaderboard. It moved the one number that decides if your agents can be trusted.
Opus 4.8 shipped on 28 May 2026, 41 days after 4.7. Standard pricing did not move. Five dollars per million tokens in, twenty-five out. SWE-bench Verified nudged from 87.6 to 88.6. SWE-bench Pro climbed from 64.3 to 69.2, about five points. On GDPval-AA it posted 1890, ahead of GPT-5.5. Anthropic's own word for the release is "modest". They are right, and I respect them for saying it plainly. A point of SWE-bench is not why you would move a working setup. If you are deciding whether to upgrad...