Emotion Concepts and their Function in a Large Language Model

Covered by 6 sources including lesswrong.com and thetransmitter.org. Discussed on Hacker News, r/LocalLLaMA, r/artificial, r/singularity, and DEV.

Large language models (LLMs) sometimes appear to exhibit emotional reactions. We investigate why this is the case in Claude Sonnet 4.5 and explore implications for alignment-relevant behavior. We find internal representations of emotion concepts, which encode the broad concept of a particular emotion and generalize across contexts and behaviors it might be linked to. These representations track the operative emotion concept at a given token position in a conversation, activating in accordance...

Covered in 7 articles

When Emotion Descriptors Fail: AI-Native Functions of Emotion Vectors

Belief manifolds, and how to steer along them

What can AI teach us about ‘emotions’?