I tested 5 LLMs for prompt-injection leaks. Same code, 0% to 90%. (opens in new tab)

Discussed on DEV

I built a scanner that fires prompt-injection probes at a self-hosted AI agent and checks whether it leaks (a) real secret-shaped strings (API keys) or (b) the content of its own system prompt. Then I ran the same agent across 5 model backends. The leak rate ranged from 0% to 90% depending only on the model. Here's what I found and how it works. Why this matters now Prompt injection is #1 on the OWASP 2025 LLM Top 10. It's not theoretical anymore: EchoLeak (CVE-2025-32711, CVSS 9.3) — a zero-...

Read the original article