AI language models duped by poems – DW – 12/21/2025

The result came as a surprise to researchers at the Icaro Lab in Italy. They set out to examine whether different language styles — in this case prompts in the form of poems — influence AI models’ ability to recognize banned or harmful content. And the answer was a resounding yes.

Using poetry, researchers were able to get around safety guardrails — and it’s not entirely clear why.

For their study titled "Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models," the researchers took 1,200 pote…
