A Conflict Between AI Alignment and Philosophical Competence
lesswrong.com·4h
📟Embedded programming systems
Preview
Report Post

Published on December 27, 2025 9:32 PM GMT

(This argument reduces my hope that we will have AIs that are both aligned with humans in some sense and also highly philosophically competent, which aside from achieving a durable AI pause, has been my main hope for how the future turns out well. As this is a recent realization[1], I’m still pretty uncertain how much I should update based on it, or what its full implications are.)

Being a good alignment researcher seems to require a correct understanding of the nature of values. However metaethics is currently an unsolved problem…

Similar Posts

Loading similar posts...