Credit: Pixabay/CC0 Public Domain
Scientists at HSE University have found that current AI models, including ChatGPT and Claude, tend to overestimate the rationality of their human opponents, whether first-year undergraduate students or experienced scientists, in strategic thinking games such as the Keynesian beauty contest. While these models attempt to predict human behavior, they often end up playing "too smart" and losing because they assume a higher level of logic in people than is actually present.
The study has been published in the Journal of Economic Behavior & Organization.
Explaining the Keynesian beauty contest
In the 1930s, British economist John Maynard Keynes developed the theoretical concept of a metaphorical beauty contest. A classic example involves newspaper readers being asked to select the six most attractive faces from a set of 100 photos. The prize is awarded to the participant whose choices are closest to the most popular selection, that is, to the average of everyone else's picks.
Typically, people tend to choose the photos they personally find most attractive. However, they often lose, because the actual task is to predict which faces the majority of respondents will consider attractive. A rational participant, therefore, should base their choices on other people's perceptions of beauty. Such experiments test the ability to reason across multiple levels: how others think, how rational they are, and how deeply they are likely to anticipate others' reasoning.
How the AI experiment was conducted
Dmitry Dagaev, Head of the Laboratory of Sports Studies at the Faculty of Economic Sciences, together with colleagues Sofia Paklina and Petr Parshakov from HSE University-Perm and Iuliia Alekseenko from the University of Lausanne, Switzerland, set out to investigate how five of the most popular AI models, including ChatGPT-4o and Claude-Sonnet-4, would perform in such an experiment. The chatbots were instructed to play Guess the Number, one of the most well-known variations of the Keynesian beauty contest.
According to the rules, all participants simultaneously and independently choose a number between 0 and 100. The winner is the one whose number is closest to half (or two-thirds, depending on the experiment) of the average of all participants' choices.
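The rules above can be sketched in a few lines of Python. This is only an illustration of the scoring rule as the article describes it; the function name `find_winners`, the tie handling (all tied players are returned), and the sample numbers are assumptions, not details taken from the study.

```python
def find_winners(choices, multiplier=2 / 3):
    """Score one round of Guess the Number.

    choices    -- numbers between 0 and 100, one per participant
    multiplier -- 1/2 or 2/3, depending on the experiment
    Returns the target number and the indices of the closest players.
    """
    # Target = multiplier times the average of all participants' choices.
    target = multiplier * sum(choices) / len(choices)
    # Smallest distance from any choice to the target.
    best = min(abs(c - target) for c in choices)
    # Assumption: if several players are equally close, all of them win.
    winners = [i for i, c in enumerate(choices) if abs(c - target) == best]
    return target, winners


# Hypothetical round with four players and the "half of the average" rule.
target, winners = find_winners([50, 33, 22, 0], multiplier=0.5)
print(target, winners)
```

In this made-up round the player who chose 0, the fully "rational" answer, does not win, which is the kind of outcome the article describes when a player assumes more sophistication than the other participants actually show.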
In this contest, more experienced players attempt to anticipate the behavior of others in order to select the optimal number. To investigate how a large language model (LLM) would perform in the game, the authors replicated the results of 16 classic Guess the Number experiments previously conducted with human participants by other researchers.
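One common way to formalize "anticipating the behavior of others" is the level-k model from behavioral game theory. The article itself does not describe this model, so the sketch below is an assumption for illustration: a level-0 player is taken to guess 50, and each further level best-responds to the level beneath it by multiplying that guess by the game's multiplier.

```python
def level_k_choice(depth, multiplier=2 / 3, level0_guess=50.0):
    """Guess of a player who reasons `depth` steps ahead.

    Assumption: a level-0 player picks level0_guess without strategic
    reasoning; a level-k player expects everyone else to be level k-1
    and therefore picks multiplier * (the level k-1 guess).
    """
    guess = level0_guess
    for _ in range(depth):
        guess = multiplier * guess
    return guess


# Deeper reasoning pushes the guess toward 0.
for depth in range(5):
    print(depth, round(level_k_choice(depth), 2))
```

Under these assumptions, a player who credits opponents with many reasoning steps picks a number close to 0, and such a player undershoots the target whenever the real participants stop after one or two steps.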
For each round, the LLMs were given a prompt explaining the rules of the game and a description of their opponents, ranging from first-year economics undergraduates and academic conference participants to individuals with analytical or intuitive thinking, as well as those experiencing emotions such as anger or sadness. The LLM was then asked to choose a number and explain its reasoning.
Findings on AI strategic thinking
The study found that LLMs adjusted their choices based on the social, professional, and age characteristics of their opponents, as well as the latter's knowledge of game theory and cognitive abilities. For example, when playing against participants of game theory conferences, the LLM tended to choose a number close to 0, reflecting the choices that typically win in such a setting. In contrast, when playing against first-year undergraduates, the LLM expected less experienced players and selected a significantly higher number.
The authors found that LLMs are able to adapt effectively to opponents with varying levels of sophistication, and their responses also displayed elements of strategic thinking. However, the LLMs were unable to identify a dominant strategy in a two-player game.
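The article does not say which strategy is dominant in the two-player game. A standard game-theory observation is that, with only two players and a multiplier below 1, the lower number is always closer to the target, so choosing 0 is weakly dominant. The brute-force check below illustrates this under stated assumptions: integer choices from 0 to 100, a multiplier of 2/3, and a hypothetical payoff of 1 for a win, 0.5 for a tie, and 0 for a loss.

```python
from fractions import Fraction


def payoff(mine, theirs, multiplier=Fraction(2, 3)):
    """Payoff to `mine` in a two-player round (assumed 1 / 0.5 / 0)."""
    # Exact arithmetic avoids spurious ties from floating-point rounding.
    target = multiplier * Fraction(mine + theirs, 2)
    my_dist = abs(mine - target)
    their_dist = abs(theirs - target)
    if my_dist < their_dist:
        return 1.0
    if my_dist == their_dist:
        return 0.5
    return 0.0


def is_weakly_dominant(candidate, choices=range(101)):
    """True if `candidate` is never worse than any alternative and is
    strictly better than each alternative against some opponent choice."""
    for alt in choices:
        if alt == candidate:
            continue
        diffs = [payoff(candidate, b) - payoff(alt, b) for b in choices]
        if min(diffs) < 0 or max(diffs) <= 0:
            return False
    return True


print(is_weakly_dominant(0))   # checks the choice 0
print(is_weakly_dominant(50))  # checks the choice 50
```

Unlike the many-player game, this two-player case requires no guess about how sophisticated the opponent is, which is why failing to find the dominant strategy is a notable limitation.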
Implications for economics and AI research
The Keynesian beauty contest has long been used to explain price fluctuations in financial markets: brokers do not base their decisions on what they personally would buy, but on how they expect other market participants to value a stock. The same principle applies here: success depends on the ability to anticipate the preferences of others.
"We are now at a stage where AI models are beginning to replace humans in many operations, enabling greater economic efficiency in business processes. However, in decision-making tasks, it is often important to ensure that LLMs behave in a human-like manner. As a result, there is a growing number of contexts in which AI behavior is compared with human behavior. This area of research is expected to develop rapidly in the near future," Dagaev emphasized.
More information: Iuliia Alekseenko et al, Strategizing with AI: Insights from a beauty contest experiment, Journal of Economic Behavior & Organization (2025). DOI: 10.1016/j.jebo.2025.107330
Provided by National Research University Higher School of Economics
Citation: AI overestimates how smart people are, according to economists (2025, December 24) retrieved 24 December 2025 from https://techxplore.com/news/2025-12-ai-overestimates-smart-people-economists.html