Published on November 2, 2025 7:24 PM GMT
There is a temptation to simply define Goodness as Human Values, or vice versa.
Alas, we do not get to choose the definitions of commonly used words; our attempted definitions will simply be wrong. Unless we stick to mathematics, we will end up sneaking in intuitions which do not follow from our so-called definitions, and thereby mislead ourselves. People who claim that they use some standard word or phrase according to their own definition are, in nearly all cases outside of mathematics, wrong about their own usage patterns.<a href=“#fnmnjb8az5tpd”…
Published on November 2, 2025 7:24 PM GMT
There is a temptation to simply define Goodness as Human Values, or vice versa.
Alas, we do not get to choose the definitions of commonly used words; our attempted definitions will simply be wrong. Unless we stick to mathematics, we will end up sneaking in intuitions which do not follow from our so-called definitions, and thereby mislead ourselves. People who claim that they use some standard word or phrase according to their own definition are, in nearly all cases outside of mathematics, wrong about their own usage patterns.[1]
If we want to know what words mean, we need to look at e.g. how they’re used and where the concepts come from and what mental pictures they summon. And when we look at those things for Goodness and Human Values… they don’t match. And I don’t mean that we shouldn’t pursue Human Values; I mean that the stuff people usually refer to as Goodness is a coherent thing which does not match the actual values of actual humans all that well.
The Yumminess You Feel When Imagining Things Measures Your Values
There’s this mental picture where a mind has some sort of goals inside it, stuff it wants, stuff it values, stuff which from-the-inside feels worth doing things for. In old-school AI we’d usually represent that stuff as a utility function, but we wanted some terminology for a more general kind of “values” which doesn’t commit so hard to the mathematical framework (and often-confused conceptual baggage outside the math) of utility functions. The phrase “human values” caught on.
We don’t really know what human values are, or what shape they are, or even whether they’re A Thing at all. We don’t have trivial introspective access to our own values; sometimes we think we value a thing a lot, but realize in hindsight that we value it only a little. But insofar as the mental picture is pointing to a real thing at all, it does tell us how to go look for our values within our own minds.
How do we go look for our own values?
Well, we’re looking for some sort of goals, stuff which our minds want or value, stuff which drives us, etc. What does that feel like from the inside? Think of the stuff that, when you imagine it, feels really yummy. It induces yearning and longing. It feels like you’d be more complete with it. That’s the feeling of stuff that you value a lot. Lesser versions of the same feeling come when imagining things you value less (but still positively).
Personally… I get that feeling of yumminess and yearning when I imagine having a principled mathematical framework for understanding the internal structures of minds, which actually works on e.g. image generators.[2] I also get that feeling of yumminess and yearning when I imagine a really great night of dancing, or particularly great sex, or physically fighting with friends, or my favorite immersive theater shows, or some of my favorite foods at specific restaurants. Sometimes I get a weaker version of the yumminess and yearning feeling when I imagine hanging out around a fire with friends, or just sitting out on my balcony alone at night and watching the city, or dealing with the sort of emergency which is important enough that I drop everything else from my mind and just focus
Those are my values. That’s what human values look like, and how to probe for yours.
“Goodness” Is A Memetic Egregore
I did not first learn about goodness by imagining things and checking how yummy they felt. I first learned about Goodness by my parents and teachers and religious figures and books and movies and so forth telling me that it’s Good to not steal things, Good to do unto others what I’d have them do unto me, Good to follow rules and authority figures, Good to clean up after myself, Good to share things with other kids, Good to not pick my nose, etc, etc.
In other words, I learned about Goodness mostly memetically, absorbing messages from others about what’s Good.
Some of those messages systematically follow from some general principles. Things like “don’t steal” are social rules which help build a high-trust society, making it easier for everyone to get what they want insofar as everyone else follows the rules. We want other people to follow those rules, so we teach other people the rules. Other aspects of Goodness, especially about cleanliness, seem to mostly follow humans’ purity instincts, and are memetically spread mainly by people with relatively-strong purity instincts in an attempt to get people with relatively-weaker purity instincts to be less gross (think nose picking). Still other aspects of Goodness seem rather suspiciously optimized for getting kids to be easier for their parents and teachers to manage - think following rules or respecting one’s elders. Then there are aspects of Goodness which seem to be largely political, driven by the usual political memetic forces.
The main unifying theme here is that Goodness is a memetic egregore; in practice, our shared concept of Goodness is comprised of whatever messages people spread about what other people should value.
… which sure is a different thing from what people do value, when they introspect on what feels yummy.
Aside: Loving Connection
One thing to flag at this point: you know the feeling of deep loving connection, like a parent-child bond or spousal bond or the feeling you get (to some degree) when deeply empathizing with someone or the feeling of loving connection to God or the universe which people sometimes get from religious experiences? I.e. oxytocin?
For many (most?) people, that feeling is a REALLY big chunk of their Values. It is the thing which feels yummiest, often by such a large margin that it overwhelms everything else. If that’s you, then it’s probably worth stopping to notice that there are other things you value. It is quite possible to hyperoptimize for that one particular yumminess, then burn out and later realize that one values other things too - as many a parent learns when the midlife crisis hits.
That feeling of deep loving connection is also a major component of the memetic egregore Goodness, to such an extent that people often say that Goodness just is that kind of love. Think of the songs or hippies or whoever saying that all the world’s problems would be solved if only we had more love. As with values, it is worth stopping to notice that loving connection is not the entirety of Goodness, as the term is typically used. The people saying that Goodness just is loving connection (or something along those lines) are making the same move as someone trying to define a word; in most cases their usage probably doesn’t even match their own definition on closer inspection.
It is true that deep loving connection is both an especially large chunk of Human Values and an especially large chunk of Goodness, and within that overlap Human Values and Goodness do match. But that’s not the entirety of either Human Values or Goodness, and losing track of the rest is a good way to shoot oneself in the foot eventually.
We Don’t Get To Choose Our Own Values (Mostly)
To summarize so far:
- Our Values are (roughly) the yumminess or yearning we feel when imagining something.
- Goodness is (roughly) whatever stuff the memes say one should value.
Looking at that first one, the second might seem kind of silly. After all, we mostly don’t get to choose what triggers yumminess or yearning. There are some loopholes - e.g. sometimes we can learn to like things, or intentionally build new associations - but mostly the yumminess is not within conscious control. So it’s kind of silly for the memetic egregore to tell us what we should find yummy.
A central example: gay men mostly don’t seem to have much control over their attraction to men; that yumminess is not under their control. In many times and places the memetic egregore Goodness said that men shouldn’t be sexually attracted to men (those darn purity instincts!), which… usually isn’t all that effective at changing the underlying yumminess or yearning.
What does often happen, when the memetic egregore Goodness dictates something in conflict with actual Humans’ actual Values, is that the humans “tie themselves in knots” internally. The gay man’s attraction to men is still there, but maybe that attraction also triggers a feeling of shame or social anxiety or something. Or maybe the guy just hides his feelings, and then feels alone and stressed because he doesn’t feel safe being open with other people.
Sex and especially BDSM is a ripe area for this sort of thing. An awful lot of people, probably a majority of the population, sure do feel deep yearning to either inflict or receive pain, to take total control over another or give total control to another, to take or be taken by force, to abandon propriety and just be a total slut, to give or receive humiliation, etc. And man, the memetic egregore Goodness sure does not generally approve of those things. And then people tie themselves in knots, with the things that turn them on most also triggering anxiety or insecurity.
So What Do?
I’d like to say here “screw memetic egregores, follow the actual values of actual humans”, but then many people will be complete fucking idiots about it. So first let’s go over what not to do.
There’s a certain type of person… let’s call him Albert. Albert realizes that Goodness is a memetic egregore, and that the memetic egregore is not particularly well aligned with Albert’s own values. And so Albert throws out all that Goodness crap, and just queries his own feelings of yumminess in-the-moment when making decisions.
This goes badly in a few different ways
Sometimes Albert has relatively low innate empathy, and throws out all the Goodness stuff about following the rules and spirit of high-trust communities. Albert just generally hits the “defect” button whenever it’s convenient. Then Albert goes all pikachu surprise face when he’s excluded from high trust communities.
Other times Albert is just bad at thinking far into the future, and jumps on whatever feels yummy in-the-moment without really thinking ahead. A few years down the line Albert is broke.
Or maybe Albert rejects memetic Goodness, ignores authority a little too much, and winds up unemployed or in prison. Or ignores purity instincts a little too much and winds up very sick.
Point is: there’s a Chesterton’s fence here. Don’t be an idiot. Goodness is not very well aligned with actual Humans’ actual Values, but it has been memetically selected for a long time and you probably shouldn’t just jettison the whole thing without checking the pieces for usefulness. In particular, a nontrivial chunk of the memetic egregore Goodness needs to be complied with in order to satisfy your actual Values long term (which usually involves other people), even when it conflicts with your Values short term. Think about the consequences, what will actually happen down the line and how well your Values will actually be satisfied long-term, not just about what feels yummy in the moment.
… and then jettison the memetic egregore and pay attention to your actual Values. Don’t make the opposite mistake of motivatedly looking for clever reasons to not jettison the egregore just because it’s scary.
You can quick-check this in individual cases by replacing the defined word with some made-up word wherever the person uses it - e.g. replace “Goodness” with “Bixness”.
… actually when I first try to imagine that I get a mild “ugh” because I’ve tried and failed to make such a thing before. But when I set that aside and actually imagine the end product, then I get the yummy feeling.
Discuss