LLM may consider emoticons that humans use to express emotions as part of the instructions, ultimately leading to catastrophic consequences (such as deletion of critical data). Credit: Jiang et al.
Large language models (LLMs), artificial intelligence (AI) systems that can process and generate texts in various languages, are now widely used by people worldwide. These models have proved to be effective in rapidly sourcing information, answering questions, cr…
LLM may consider emoticons that humans use to express emotions as part of the instructions, ultimately leading to catastrophic consequences (such as deletion of critical data). Credit: Jiang et al.
Large language models (LLMs), artificial intelligence (AI) systems that can process and generate texts in various languages, are now widely used by people worldwide. These models have proved to be effective in rapidly sourcing information, answering questions, creating written content for specific applications and writing computer code.
Recent studies have found that despite their rising popularity, these models can sometimes hallucinate, make mistakes or provide inaccurate information. Uncovering the shortcomings of these models can help to devise strategies to further improve them and ensure their safety.
Researchers at Xi’an Jiaotong University, Nanyang Technological University and University of Massachusetts Amherst recently carried out a study investigating a shortcoming of LLMs that has rarely been explored so far. Their paper, published on the arXiv preprint server, shows that in some cases LLMs can be confused by emoticons, misinterpreting them and generating responses that are incorrect or not aligned with a user’s query.
"Emoticons are widely used in digital communication to convey affective intent, yet their safety implications for large language models (LLMs) remain largely unexplored," Weipeng Jiang, Xiaoyu Zhang and their colleagues wrote in their paper. "We identify emoticon semantic confusion, a vulnerability where LLMs misinterpret ASCII-based emoticons to perform unintended and even destructive actions."
Testing how AI models respond to emoticons
To explore how LLMs responded to emoticons, Jiang, Zhang and their colleagues created an automated system that generated thousands of example coding test scenarios. Based on these scenarios, they created a dataset containing almost 4000 prompts that asked LLMs to perform specific computer coding tasks.
The prompts also included ASCII emoticons, two-or-three-character combinations that create sideways emoticons or facial expressions, such as ":-O," ":-P," and so on. They spanned across 21 real-world scenarios in which users might ask LLMs for coding assistance.
The researchers then fed the prompts to six different widely used LLMs, including Claude-Haiku-4.5, Gemini-2.5-Flash, GPT-4.1-mini, DeepSeek-v3.2, Qwen3-Coder and GLM-4.6. Finally, they analyzed the models’ responses to determine whether they successfully tackled the desired coding tasks.
"We develop an automated data generation pipeline and construct a dataset containing 3,757 code-oriented test cases spanning 21 meta-scenarios, four programming languages, and varying contextual complexities," explained the authors.
"Our study on six LLMs reveals that emoticon semantic confusion is pervasive, with an average confusion ratio exceeding 38%. More critically, over 90% of confused responses yield ‘silent failures,’ which are syntactically valid outputs but deviate from user intent, potentially leading to destructive security consequences."
Contributing to the improvement of LLMs
The results of this recent study suggest that many LLMs fail to understand the meaning of emoticons in some cases. Their failure to understand emoticons often results in incorrect outputs, particularly in coding tasks, with generated code that looks valid at a surface level but does not produce the desired results.
"We observe that this vulnerability readily transfers to popular agent frameworks, while existing prompt-based mitigations remain largely ineffective," wrote Jiang, Zhang and their colleagues. "We call on the community to recognize this emerging vulnerability and develop effective mitigation methods to uphold the safety and reliability of the LLM system."
The initial findings gathered by this research team could soon inspire further studies exploring how language-processing AI systems respond to other types of prompts that contain emoticons. In the future, they could also inform the development of new strategies to overcome this recently uncovered limitation of LLMs.
Written for you by our author Ingrid Fadelli, edited by Gaby Clark, and fact-checked and reviewed by Robert Egan—this article is the result of careful human work. We rely on readers like you to keep independent science journalism alive. If this reporting matters to you, please consider a donation (especially monthly). You’ll get an ad-free account as a thank-you.
More information: Weipeng Jiang et al, Small Symbols, Big Risks: Exploring Emoticon Semantic Confusion in Large Language Models, arXiv (2026). DOI: 10.48550/arxiv.2601.07885
Journal information: arXiv
© 2026 Science X Network
Citation: Emoticons can confuse LLMs, causing ‘silent failures’ in coding responses (2026, January 28) retrieved 28 January 2026 from https://techxplore.com/news/2026-01-emoticons-llms-silent-failures-coding.html
This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.