SkillMutator: Benchmarking and Defending Language-and-Code Cross-modal Attacks on LLM Agent Skills
Large language model (LLM) agents increasingly extend their capabilities at runtime by loading Agent Skills, which pair natural-language specifications (SKILL.md) with executable scripts and resources. Because a skill's behavior relies on both natural-language instructions and executable code, assessing its safety requires cross-modal reasoning, creating a new language-and-code attack surface. Attackers can present a benign workflow in SKILL.md ...