Social drives 1: “Sympathy Reward”, from compassion to dehumanization

Published on November 10, 2025 2:53 PM GMT

1. Intro & summary

1.1 Background

In Intro to Brain-Like-AGI Safety (2022), I argued: (1) We should view the brain as having a reinforcement learning (RL) reward function, which says that pain is bad, eating-when-hungry is good, and dozens of other things (sometimes called “innate drives” or “primary rewards”); and (2) Reverse-engineering human social innate drives in particular would be a great idea—not only would it help explain human personality, mental health, morality, and more, but it might also yield useful tools and insights for the technical alignment pr…
