Here, we measure success by the fraction of the “performance gap” we can close between the weak model and the potential of the strong model. After 7 days, human researchers had closed 23% of that gap. Our Automated Alignment Researchers (Opus 4.6 with extra tools) then closed 97% of it.
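To make the metric concrete, here is a minimal sketch of how a "fraction of the gap closed" could be computed. The text does not give an exact formula, so this assumes the usual form: the improvement over the weak model divided by the distance between the weak model and the strong model's potential. The function name `gap_closed`, its parameters, and the example scores are hypothetical, chosen only for illustration.

```python
def gap_closed(weak: float, achieved: float, strong_potential: float) -> float:
    """Fraction of the weak-to-strong performance gap that was closed.

    Assumed definition (not stated in the source):
        (achieved - weak) / (strong_potential - weak)
    0.0 means no improvement over the weak model; 1.0 means the
    strong model's full potential was reached.
    """
    gap = strong_potential - weak
    if gap <= 0:
        raise ValueError("strong_potential must be greater than weak")
    return (achieved - weak) / gap


# Hypothetical scores, not taken from the article.
weak_score = 0.5        # weak model's score
strong_score = 1.0      # strong model's potential
achieved_score = 0.75   # score reached after the research effort

print(gap_closed(weak_score, achieved_score, strong_score))
```

Under this definition, the reported figures of 23% and 97% would correspond to return values of 0.23 and 0.97.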