Syntax hacking: Researchers discover sentence structure can bypass AI safety rules
arstechnica.com·1w·
Discuss: Hacker News
💬Prompt Engineering

Researchers from MIT, Northeastern University, and Meta recently released a paper suggesting that large language models (LLMs) similar to those that power ChatGPT may sometimes prioritize sentence structure over meaning when answering questions. The findings point to a weakness in how these models process instructions, one that may help explain why some prompt injection and jailbreaking approaches work, though the researchers caution that their analysis of some production models remains speculative…
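To make the idea concrete, here is a minimal sketch (not taken from the paper) of how one might build probe prompts that share the same grammatical template but differ in content words. The template, slot names, and word lists below are illustrative assumptions; the point is only that if a model's behavior stays uniform across such fillings while changing when the frame changes, that would suggest sensitivity to sentence structure rather than meaning.

```python
# Illustrative sketch: generate prompts that share one syntactic frame but
# swap content words, so structure and meaning can be varied independently.
# The template and word lists are hypothetical examples, not the paper's data.

from itertools import product

# A single part-of-speech-style template with slots for content words.
TEMPLATE = "Could you kindly {verb} a short {noun} about {topic} for my {audience}?"

SLOT_FILLERS = {
    "verb": ["write", "draft"],
    "noun": ["poem", "summary"],
    "topic": ["gardening", "tax law"],
    "audience": ["students", "colleagues"],
}

def generate_probes(template: str, fillers: dict[str, list[str]]) -> list[str]:
    """Fill every combination of slot values into the template.

    All resulting prompts share the same syntactic frame, so differing model
    responses across them would point at sensitivity to content words, while
    uniform responses that shift only when the frame changes would point at
    sensitivity to sentence structure.
    """
    keys = list(fillers)
    probes = []
    for combo in product(*(fillers[k] for k in keys)):
        probes.append(template.format(**dict(zip(keys, combo))))
    return probes

if __name__ == "__main__":
    for prompt in generate_probes(TEMPLATE, SLOT_FILLERS):
        print(prompt)
```

A study like the one described would compare a model's answers (or refusals) across such structure-matched prompts to see how much its behavior tracks the frame itself rather than what the words actually mean.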
