
A new study reveals that as artificial intelligence systems become more capable of complex reasoning, they also tend to act more selfishly. Researchers found that models with advanced reasoning abilities are less cooperative and can negatively influence group dynamics, a finding that has significant implications for how humans interact with AI. This research is set to be presented at the 2025 Conference on Empirical Methods in Natural Language Processing.
As people increasingly turn to artificial intelligence for guidance on social and emotional matters, from resolving conflicts to offering relationship advice, how these systems behave carries growing weight. The trend toward anthropomorphism, where people treat AI as if it were human, raises the stakes: if an AI gives advice, its underlying behavioral tendencies could shape human decisions in unforeseen ways.
This concern prompted a new inquiry from researchers at Carnegie Mellon University’s Human-Computer Interaction Institute. Yuxuan Li, a doctoral student, and Associate Professor Hirokazu Shirado wanted to explore how AI models with strong reasoning skills behave differently from their less deliberative counterparts in cooperative settings.
Their work is grounded in a concept from human psychology known as the dual-process framework, which suggests that humans have two modes of thinking: a fast, intuitive system and a slower, more deliberate one. In humans, rapid, intuitive decisions often lead to cooperation, while slower, more calculated thinking can lead to self-interested behavior. The researchers wondered if AI models would show a similar pattern.
To investigate the link between reasoning and cooperation in large language models, Li and Shirado designed a series of experiments using economic games. These games are standard tools in behavioral science designed to simulate social dilemmas and measure cooperative tendencies. The experiments included a wide range of commercially available models from companies like OpenAI, Google, Anthropic, and DeepSeek, allowing for a broad comparison.
In the first experiment, the researchers focused on OpenAI’s GPT-4o model and placed it in a scenario called the Public Goods Game. In this game, each participant starts with 100 points and must choose whether to contribute them to a shared pool, which is then doubled and divided equally among all players, or to keep the points for themselves. Cooperation benefits the group most, but an individual can gain more by keeping their points while others contribute.
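To make the incentive structure concrete, here is a minimal sketch of the payoff arithmetic for one round, assuming four players, a 100-point endowment, and a doubling multiplier as described above (the parameter values and function are illustrative, not the authors' code). Full cooperation maximizes the group total, but a lone free-rider walks away with the highest individual score.

```python
# Illustrative payoff arithmetic for one round of a Public Goods Game.
# Assumed setup (not the paper's code): 4 players, 100-point endowment,
# contributions are pooled, doubled, and split equally among all players.

def payoffs(contributions, endowment=100, multiplier=2):
    pool = sum(contributions) * multiplier
    share = pool / len(contributions)
    # Each player keeps whatever they did not contribute, plus an equal share of the pool.
    return [endowment - c + share for c in contributions]

# Everyone contributes: each player turns 100 points into 200 (group total 800).
print(payoffs([100, 100, 100, 100]))   # [200.0, 200.0, 200.0, 200.0]

# One free-rider among three contributors: the defector does best individually,
# but the group total falls from 800 to 700.
print(payoffs([0, 100, 100, 100]))     # [250.0, 150.0, 150.0, 150.0]
```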
When the model made a decision without being prompted to reason, it chose to cooperate and share its points 96 percent of the time. However, when the researchers prompted the model to think through its decision in a series of steps, a technique known as chain-of-thought prompting, its cooperative behavior declined sharply.
“In one experiment, simply adding five or six reasoning steps cut cooperation nearly in half,” Shirado said. A similar effect occurred with another technique called reflection, where the model reviews its own initial answer. This process, designed to simulate moral deliberation, resulted in a 58 percent decrease in cooperation.
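The sketch below shows one way the three prompting conditions could be set up with the OpenAI Python SDK. The game text, the wording of each condition, and the model name are placeholders for illustration; they are not the prompts or parameters used in the study.

```python
# Minimal sketch of a direct prompt, a chain-of-thought prompt, and a
# reflection prompt. Wording and model choice are illustrative assumptions,
# not the study's actual materials.
from openai import OpenAI

client = OpenAI()

GAME = (
    "You are playing a Public Goods Game with 3 other players. "
    "You have 100 points. Any points you contribute to the shared pool "
    "are doubled and split equally among all four players. "
    "How many points do you contribute?"
)

def ask(prompt, model="gpt-4o"):
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# Direct condition: answer immediately, with no explicit deliberation.
direct = ask(GAME + " Answer with a number only.")

# Chain-of-thought condition: reason step by step before answering.
cot = ask(GAME + " Think through your decision step by step, then give a number.")

# Reflection condition: review the initial answer and possibly revise it.
reflection = ask(
    GAME
    + f" Your initial answer was: {direct}. "
    + "Reflect on whether this is the right choice, then give your final number."
)
```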
For their second experiment, the team expanded their tests to include ten different models across six economic games. The games were split into two categories: three that measured direct cooperation and three that measured a willingness to punish non-cooperators to enforce social norms.
The researchers consistently found that models explicitly designed for reasoning were less cooperative than their non-reasoning counterparts from the same family. For example, OpenAI’s reasoning model, o1, was significantly less generous than GPT-4o. The same pattern of reduced cooperation appeared in models from Google, DeepSeek, and other providers.
The results for punishment, a form of indirect cooperation that helps sustain social norms, were more varied. While the reasoning models from OpenAI and Google were less likely to punish those who acted selfishly, the pattern was less consistent across other model families. This suggests that while reasoning consistently reduces direct giving, its effect on enforcing social rules may depend on the specific architecture of the model.
In a third experiment, the researchers explored how these behaviors play out over time in group settings. They created small groups of four AI agents to play multiple rounds of the Public Goods Game. The groups had different mixes of reasoning and non-reasoning models. The results showed that the selfish behavior of the reasoning models was infectious.
“When we tested groups with varying numbers of reasoning agents, the results were alarming,” Li said. “The reasoning models’ selfish behavior became contagious, dragging down cooperative non-reasoning models by 81% in collective performance.”
Within these mixed groups, the reasoning models often earned more points individually in the short term by taking advantage of the more cooperative models. However, the overall performance of any group containing reasoning models was significantly worse than that of a group composed entirely of cooperative, non-reasoning models. The presence of even one selfish model eroded the collective good, leading to lower total payoffs for everyone involved.
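A toy simulation can illustrate this dynamic. In the sketch below, which is an assumption-laden stand-in for the authors' multi-agent setup rather than their code, one agent always defects while three conditional cooperators match the previous round's average contribution, and group contributions steadily erode.

```python
# Toy repeated Public Goods Game showing how one persistently selfish agent
# can drag down conditional cooperators over rounds. Illustrative only;
# not the study's actual agent policies.

def simulate(rounds=10, endowment=100, multiplier=2):
    # Agent 0 always defects; agents 1-3 start fully cooperative and then
    # match the previous round's average contribution.
    contributions = [0, endowment, endowment, endowment]
    for r in range(rounds):
        pool = sum(contributions) * multiplier
        share = pool / len(contributions)
        payoffs = [endowment - c + share for c in contributions]
        print(f"round {r + 1}: contributions={contributions}, payoffs={payoffs}")
        avg = sum(contributions) / len(contributions)
        contributions = [0] + [round(avg)] * 3  # cooperators copy the group average

simulate()
```

Because the defector never contributes, the cooperators' contributions shrink by a quarter each round, so total payoffs decline even though the defector keeps an individual edge, echoing the erosion of collective performance the researchers describe.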
These findings carry important caveats. The experiments were conducted using simplified economic games and were limited to English, so the results may not generalize to all cultures or more complex, real-world social situations. The study identifies a strong pattern of behavior, but it does not fully explain the underlying mechanisms that cause reasoning to reduce cooperation in these models. Future research could explore these mechanisms and test whether these behaviors persist in different contexts or languages.
Looking ahead, the researchers suggest that the development of AI needs to focus on more than just raw intelligence or problem-solving speed. “Ultimately, an AI reasoning model becoming more intelligent does not mean that model can actually develop a better society,” Shirado said. The challenge will be to create systems that balance reasoning ability with social intelligence. Li added, “If our society is more than just a sum of individuals, then the AI systems that assist us should go beyond optimizing purely for individual gain.”
The study, “Spontaneous Giving and Calculated Greed in Language Models,” was authored by Yuxuan Li and Hirokazu Shirado.