A Theory of Why Prompt Injection Works
LLMs can't tell who's speaking. We show that they identify roles by writing style rather than by role tags, and we exploit this with CoT Forgery: injecting fake chain-of-thought reasoning that models mistake for their own thoughts.