BashArena: A Control Setting for Highly Privileged AI Agents
lesswrong.com

Published on December 18, 2025 6:19 PM GMT

We’ve just released BashArena, a new high-stakes control setting that we think is a major improvement over the settings we’ve used in the past. In this post we’ll discuss the strengths and weaknesses of BashArena and what we’ve learned about how to make settings for high-stakes control research.

BashArena

Overview

BashArena is a control setting: a dataset designed to support the kind of adversarial game we studied in the original AI control paper and Ctrl-Z. In general, control settings require two types of tasks – main tasks and side tasks. Main tasks r…
