🛡️ AI Safety - amy_yunduo

Prompt Injection in Automated Résumé Screening with Large Language Models: Single and Multi-Injection Settings

🧠LLMs giskard.ai·

Giskard: LLM testing platform for preventing hallucinations and security issues

Covers 3 stories including Garak, LLM Vulnerability Scanner

Discussed on Hacker News

🤖AI Agents CleanTechnica·

The Real AI Safety Discussion That Just Isn’t Happening

🤖AI Agents medium.com

The Role of HR in Responsible AI Adoption

🤖AI Agents beSpacific·

Prompt Injection: What Lawyers Considering Agentic AI

✍️Prompt Engineering codeberg.org·

Powercode

Discussed on Hacker News

✍️Prompt Engineering role-confusion.github.io·

A Theory of Why Prompt Injection Works

Covers 3 stories including Playwright MCP Server – Snapshot based – faster and more reliable than images

Covered by 8 sources including Schneier on Security, Simon Willison’s Weblog

Discussed on Hacker News and Lobsters

🔭Observability SentinelOne·

macOS.Gaslight | Rust Backdoor Turns Prompt Injection on the Analyst, Not the Sandbox

Covers 2 stories including Mini Shai-Hulud, Miasma, and Hades Worms Target Bioinformatics and MCP Developers via Malicious PyPI Wheels

Covered by 14 sources including BleepingComputer, SecurityWeek

🤖AI Agents medium.com

Healthcare AI Governance: AI Doesn’t Fail. Poor Governance Does.

✍️Prompt Engineering meetcyber.net

Prompt Injection vs Jailbreaking Explained in 4 Minutes

✍️Prompt Engineering fernandoi.cl·

What happened after 2k people tried to hack my AI assistant

Covered by Simon Willison’s Weblog

Discussed on Hacker News

🧠LLMs medium.com

ChatGPT Generates Gruesome, Explicit Images of Women When Guardrails Fail, My Research Shows

🏗️AI Infra medium.com

The Next Challenge in AI Safety: Image Veracity

✍️Prompt Engineering WIRED·

Anthropic Thinks Its Own Success Is Key to Making AI Safe

Covers 2 stories including Claude's Constitution

🤖AI Agents GitHub·

Show HN: Lelu – gate OpenAI agent actions on confidence and prompt injection

Discussed on Hacker News

✍️Prompt Engineering 4sysops·

Malicious npm and PyPI packages use prompt injection to bypass AI security scanners

🤖AI Agents EDB·

Inside EDB’s New Principles for Responsible AI: Sovereign, Governed, Trusted and Beneficial

🧠LLMs Above the Law

How Responsible AI Changes In The Agent Era

From Prompt Testing to AI Red Teaming at Enterprise Scale

No Points For Held Tongues — See Also