Early stage goal-directedness

Published on October 21, 2025 5:41 PM GMT

A fairly common question is “why should we expect powerful systems to be coherent agents with perfect game theory?”

There was a short comment exchange on "The title is reasonable" that I thought made a decent standalone post.

In the original post, I said:

Goal Directedness is pernicious. Corrigibility is anti-natural.

The way an AI would develop the ability to think extended, useful, creative research thoughts that you might fully outsource to is via becoming perniciously goal-directed. You can't do months or years of open-ended research without fractally noticing subproblems, figuring out new goals,…
