not much happened today | AINews
**Anthropic's Mythos/Opus cycle** sparked mixed reactions with praise for **Claude Mythos**'s one-shot workflows and concerns over **Opus 4.8** benchmark regressions. **Opus 4.7** showed strong chemistry task performance, "making Claude a chemist." **Sakana AI** launched an **RSI Lab** focusing on recursive self-improvement under compute constraints, marking RSI as a formal research program. New benchmarks like **Agents' Last Exam (ALE)** and **SWE-Marathon** test agents on long-horizon, econ...