Meta-agentic Prisoner's Dilemmas
lesswrong.com

Published on November 5, 2025 4:44 PM GMT

Crosspost from my blog.

In the classic Prisoner's Dilemma (https://www.lesswrong.com/w/prisoner-s-dilemma), there are two agents with the same beliefs and decision theory, but with different values. To get the best available outcome, they have to help each other out (even if they don't intrinsically care about the other's values), and they have to do so even though, if one does not help the other, the other has no way to respond with a punishment afterward.
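To make the payoff structure concrete, here is a minimal sketch in Python. The payoff numbers (3, 0, 5, 1) and the names `PAYOFFS`, `best_reply`, and `mirrored_choice` are illustrative assumptions, not taken from the post. The sketch contrasts two ways of reasoning: holding the other agent's move fixed, versus assuming both agents (who share a decision theory) end up making the same move.

```python
# Illustrative payoffs (assumed, not from the post).
# Keys are (my_move, their_move); values are (my_payoff, their_payoff).
# Higher is better.
COOPERATE, DEFECT = "C", "D"
MOVES = (COOPERATE, DEFECT)

PAYOFFS = {
    (COOPERATE, COOPERATE): (3, 3),
    (COOPERATE, DEFECT): (0, 5),
    (DEFECT, COOPERATE): (5, 0),
    (DEFECT, DEFECT): (1, 1),
}


def best_reply(their_move):
    """Return my payoff-maximizing move if their move is held fixed."""
    return max(MOVES, key=lambda my_move: PAYOFFS[(my_move, their_move)][0])


def mirrored_choice():
    """Return my payoff-maximizing move if both agents are assumed
    to make the same move (only the diagonal outcomes are reachable)."""
    return max(MOVES, key=lambda move: PAYOFFS[(move, move)][0])


if __name__ == "__main__":
    for their_move in MOVES:
        print(f"They play {their_move}: my best reply is {best_reply(their_move)}")
    print(f"If our moves are mirrored, my best move is {mirrored_choice()}")
```

Under these assumed payoffs, defecting is the best reply to either fixed move, yet mutual cooperation beats mutual defection for both agents. That gap is what makes the situation a dilemma.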

A classic line of reasoning, from the perspective of one of the prisoners, goes something li...
