Too much social media gives AI chatbots ‘brain rot’

Artificial intelligence (AI) chatbots are worse at retrieving accurate information and reasoning when trained on large amounts of low-quality content, particularly if the content is popular on social media [1], finds a preprint posted on arXiv on 15 October.

In data science, good-quality data need to meet certain criteria, such as being grammatically correct and understandable, says co-author Zhangyang Wang, who studies generative AI at the University of Texas at Austin. But these criteria fail to capture differences in content quality, he says.

Wang and his colleagues wanted to see the effects of training large language models (LLMs) on low-quality data, defined as short, popular social-media posts or those containing superficial or sensationalist content. They looked at how these data affected model reasoning, retrieval of information from long inputs, the ethics of responses and model personality traits.

The team reports that models given low-quality data skip steps in their reasoning process, or do not use reasoning at all. As a result, the models provide incorrect information about a topic or, when the authors posed a multiple-choice question, pick the wrong answer. In data sets with a mix of junk and high-quality data, the negative effect on reasoning increased as the proportion of junk data increased. The work has not been peer-reviewed.
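The idea of varying the proportion of junk data can be illustrated with a minimal sketch. It is not the authors' code: the function name `mix_corpus`, the placeholder posts and the junk fractions below are all assumptions made for illustration. It shows only how training sets of a fixed size might be assembled with different shares of junk content.

```python
import random


def mix_corpus(junk, clean, junk_fraction, total, seed=0):
    """Return `total` posts, of which roughly `junk_fraction` are junk.

    Hypothetical helper for illustration; not taken from the preprint.
    """
    if not 0.0 <= junk_fraction <= 1.0:
        raise ValueError("junk_fraction must be between 0 and 1")
    rng = random.Random(seed)
    n_junk = round(total * junk_fraction)
    n_clean = total - n_junk
    if n_junk > len(junk) or n_clean > len(clean):
        raise ValueError("not enough posts in one of the pools")
    # Sample without replacement from each pool, then shuffle the result.
    mixed = rng.sample(junk, n_junk) + rng.sample(clean, n_clean)
    rng.shuffle(mixed)
    return mixed


# Placeholder pools standing in for real social-media posts.
junk_pool = [f"junk post {i}" for i in range(100)]
clean_pool = [f"clean post {i}" for i in range(100)]

# Illustrative fractions only; the article does not state the ratios used.
for fraction in (0.0, 0.2, 0.5, 0.8, 1.0):
    corpus = mix_corpus(junk_pool, clean_pool, fraction, total=50)
    n_junk = sum(post.startswith("junk") for post in corpus)
    print(f"junk fraction {fraction:.1f}: {n_junk} junk, {len(corpus) - n_junk} clean")
```

Holding the total size fixed while changing only the junk share means that any difference between the resulting models can be attributed to data quality rather than data volume.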

The findings support a long-held tenet of AI: the importance of data quality, says Mehwish Nasim, an AI researcher at the University of Western Australia in Perth. “Even before people started to work on large language models, we used to say that, if you give garbage to an AI model, it’s going to produce garbage,” she adds.

Garbage in, garbage out

Wang and his colleagues used one million public posts on the social-media platform X from an existing database to train open-source models: Llama 3, an LLM from tech firm Meta in Menlo Park, California, and three versions of Qwen, developed by Alibaba in Hangzhou, China. Qwen is a reasoning model, like DeepSeek’s R1 model and OpenAI’s o1, meaning it is designed to produce reasoning steps to arrive at an answer to a user query. Llama, however, is an instruction-tuned language model and its reasoning ability is less advanced.

To determine the model’s personality traits, the team used psychology questionnaires. Before training on junk data, Llama exhibited agreeableness, extroversion, conscientiousness, openness and a bit of narcissism, say the authors. But as Llama was fed more junk data, its negative traits were amplified, and psychopathy emerged, according to one of the questionnaires.

To adapt and improve models over time, researchers can adjust the prompt instructions. When the team tried doing this for a Llama model trained exclusively on junk data, they found that it only partially improved performance, as did increasing the amount of non-junk data used for training. The model also continued to skip steps when the team tried to encourage it to reflect on and fix failures in its reasoning, suggesting that different methods to mitigate the effect of junk data might be needed.
