AI models more vulnerable than claimed when faced with iterative attacks

Covers: Proprietary Problems: No Frontier Model Is Multi-Turn Immune

Covered by infoworld.com, Metacurity

CISOs who rely on LLM runtime guardrails and official safety scores when making security decisions about their organizations’ AI usage and model selection are due for a wake-up call. According to a new study from Cisco, frontier models from OpenAI, Anthropic, Google, xAI, and Amazon show significantly worse risk profiles under multi-turn attacks than when their safety is benchmarked with single prompts. “The dominant safety benchmarks for frontier large language models share...
