Humans Still Beat AI in the Long Horizon

Covers 2 stories, including Implications of Large-Scale Test-Time Compute (5 minute read). Discussed on Hacker News.

Agents can spend test-time compute by trying, observing, and revising. We derive an Elo reference for repeated sampling, then show that in a 2022 two-week coding marathon, current agents plateau within 24 hours while top humans keep improving.
