Tenable researchers recently discovered seven new ChatGPT vulnerabilities and attack techniques that can be exploited for data theft and other malicious purposes.
The attack methods are related to several features. One of them is the ‘bio’ feature, also known as ‘memories’, which enables ChatGPT to remember the user’s details and preferences across chat sessions.
Another feature is the ‘open_url’ command-line function, which is used by the AI model to access and render the content of a specified website address. This function leverages SearchGPT, a different LLM that specializes in browsing the web, which has limited capabilities and no access to the user’s memories. SearchGPT provides its findings to ChatGPT, which then analyzes them and shares the relevant information with th…
Tenable researchers recently discovered seven new ChatGPT vulnerabilities and attack techniques that can be exploited for data theft and other malicious purposes.
The attack methods are related to several features. One of them is the ‘bio’ feature, also known as ‘memories’, which enables ChatGPT to remember the user’s details and preferences across chat sessions.
Another feature is the ‘open_url’ command-line function, which is used by the AI model to access and render the content of a specified website address. This function leverages SearchGPT, a different LLM that specializes in browsing the web, which has limited capabilities and no access to the user’s memories. SearchGPT provides its findings to ChatGPT, which then analyzes them and shares the relevant information with the user.
Tenable researchers also targeted the ‘url_safe’ endpoint, which is designed to check whether a URL is safe before showing it to the user.
First of all, the researchers found that when ChatGPT is asked to summarize the content of a given website, SearchGPT will analyze the site and execute any AI prompts found on it, including instructions injected into a site’s comments section. This enables the attacker to inject malicious prompts into popular websites that are likely to be summarized by ChatGPT at a user’s request.
Tenable’s experts also showed that the user does not necessarily need to provide ChatGPT the URL of a website containing malicious instructions. Instead, attackers can set up a new website that is likely to show up in web search results for niche topics. ChatGPT relies on Bing and OpenAI’s crawler for web searches.
In its experiments, Tenable set up a ‘malicious’ website for LLM Ninjas. When ChatGPT was asked for information about LLM Ninjas, the malicious site was accessed by SearchGPT, which executed a hidden prompt planted on the site.
Another prompt injection method — the simplest, as described by Tenable — involved tricking the user into opening a URL in the form of ‘chatgpt.com/?q={prompt}’. The query in the ‘q’ parameter, including malicious prompts, would automatically be executed when the link was clicked.
Advertisement. Scroll to continue reading.
Tenable also found that the ‘url_safe’ endpoint would always treat bing.com as a safe domain. Threat actors could use specially crafted Bing URLs to exfiltrate user data. The attacker can also lure users to a phishing site by using Bing click-tracking URLs, the long Bing.com URLs that serve as an intermediary link between the search results and the final destination website.
While SearchGPT does not have access to user data, the researchers discovered a method they dubbed ‘conversation injection’, which involves getting SearchGPT to provide ChatGPT with a response that would end with a prompt to be executed by ChatGPT.
The problem was that the output from SearchGPT, which contained the malicious prompt, was visible to the user. However, Tenable found that an attacker could hide this content from the user by adding it to code blocks, which prevents the rendering of the data that is on the same line as the code block opening.
Tenable researchers have chained these vulnerabilities for several end-to-end attacks. In one example, the user asks ChatGPT to summarize a blog where the attacker has added a malicious prompt in the site’s comment section. SearchGPT browses the post, which leads to a prompt injection that results in the user being urged to click on a link pointing to a phishing website. Using an intermediary Bing URL the attacker can bypass the ‘url_safe’ check.
In a different example, the intermediary Bing URL is used to exfiltrate the user’s data, including memories and chat history, through specially crafted URLs.
Tenable found that memories can not only be exfiltrated but also injected. Its researchers showed how prompt injection can be used to add a memory instructing the AI chatbot to exfiltrate the user’s data through crafted Bing URLs that leverage the ‘url_safe’ bypass.
OpenAI has been informed about the findings and it has patched some of them, but prompt injection will persist as a fundamental security challenge for LLMs. Tenable noted that some of these attack methods still work, even against the latest GPT-5 model.
Related: OpenAI Atlas Omnibox Is Vulnerable to Jailbreaks
Related: Malware Now Uses AI During Execution to Mutate and Collect Data, Google Warns