New Frontier Red Team blog: Phase 2 of Project Fetch, where we test how well Claude can program a robodog. Opus 4.7, on its own, was ~20x faster than last year's best human team aided by Opus 4.1. (The robodog, alas, still failed to fetch a beach ball.)