Claude Opus blew me away...
imgur.com·1w·
Discuss: r/ClaudeAI

When I get bored, I let an AI play games. Most of the time I have to babysit it all the way through, but it's fun seeing how much of the game it can handle.

One of the games I chose is "Prose & Codes" on Steam. It's just a substitution cipher game that uses public domain books from various categories as the source text for the ciphers.
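
For anyone who hasn't seen one, here is a minimal sketch of a substitution cipher and of the kind of "current game state" a solver looks at. This is my own illustration, not how the game is implemented: the function names (`make_key`, `encrypt`, `show_state`), the random key, and the sample sentence are all assumptions.

```python
import random
import string

def make_key(seed=0):
    """Build a random letter-to-letter substitution key (hypothetical helper)."""
    rng = random.Random(seed)
    letters = list(string.ascii_uppercase)
    shuffled = letters[:]
    rng.shuffle(shuffled)
    return dict(zip(letters, shuffled))

def encrypt(plaintext, key):
    """Swap every letter for its key letter; leave spaces and punctuation alone."""
    return "".join(key.get(ch, ch) for ch in plaintext.upper())

def show_state(ciphertext, guesses):
    """Render the puzzle: guessed letters filled in, unknown letters as '_'.

    `guesses` maps a cipher letter to the plain letter the solver has picked.
    """
    return "".join(
        guesses.get(ch, "_") if ch in string.ascii_uppercase else ch
        for ch in ciphertext
    )

key = make_key(seed=1)
ciphertext = encrypt("Call me Ishmael.", key)
print(ciphertext)                  # the scrambled puzzle text
print(show_state(ciphertext, {}))  # nothing guessed yet, all blanks
```

The output of `show_state` is roughly what has to be shown to the solver again after every guess so it can see which letters are still open.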

I've tested Haiku 4.5 (thinking and non-thinking), Claude Sonnet 3.5, 3.7, 4.0, and 4.5, and Opus 4.1. Haiku 4.5 actually performed better than Sonnet 3.5 or 3.7, and was maybe on par with Sonnet 4.0. Sonnet 4.5 and Opus 4.1 were both better at the game, but all three needed me to manage the game state. If I am not in charge of showing them the current game state, then they eventually...
