🛡️ AI Safety - joshwonghc

sinewaveai/prooflayer-rules: Open-source runtime security rules engine for MCP servers and AI agents. Detects prompt injection, command injection, jailbreaks, and data exfiltration.

🤖AI Agents Code

github.com · Hacker News

AI Pentesting Roadmap: Labs, Challenges, Writeups & Research

✍️Prompt Engineering Blog

osintteam.blog

The Ghost of Alignment — Why AI Should Never Fully Obey Humanity

🤖AI Agents Blog

medium.com

Malware uses fake nuclear weapon prompts to bypass AI security scanners

✍️Prompt Engineering

4sysops.com

Infosecurity Europe: Prompt Injection Remains Unsolved, OWASP Researcher Warns

✍️Prompt Engineering News

infosecurity-magazine.com · Cited by 1 article

WebMCP Can Be Used To Hijack AI Agents, Chrome Warns via @sejournal, @martinibuster

✍️Prompt Engineering

searchenginejournal.com

Statement on the US government directive to suspend access to Fable 5 and Mythos 5

✍️Prompt Engineering

anthropic.com · DEV, Lobsters, Hacker News, r/LocalLLaMA · Cited by 19 articles

Prompt injection still drives most agentic AI security failures in production

🤖AI Agents

helpnetsecurity.com

ChatGPT's new Lockdown Mode lets you disable web access and more to protect sensitive data from prompt injection

✍️Prompt Engineering

the-decoder.com

Claude Powered Code Review that scales!

✍️Prompt Engineering Blog

medium.com

Why OpenAI is disabling ChatGPT web access to fight prompt injection attacks

✍️Prompt Engineering News

livemint.com

Security Flaw in Claude Code Illustrates the Risk of AI in Developer Workflows

✍️Prompt Engineering

devops.com

Anthropic's Claude Fable 5 and Mythos 5 AI suspended over security fears

✍️Prompt Engineering News

bbc.com

OpenAI unveils Lockdown Mode to protect sensitive data from prompt injection attacks

✍️Prompt Engineering

techcrunch.com · Hacker News · Cited by 6 articles

Detecting AI-specific threats in Claude Enterprise from the Compliance API: a prefilter + LLM-as-judge pipeline with Sigma rules

✍️Prompt Engineering

papermtn.co.uk · r/netsec

Prompt injection breaks today’s AI agents, study warns

PI-Hunter: Automated Red-Teaming for Exposing and Localizing Prompt Injections

AI Agent Security Guide: How to Prevent Prompt Injection Attack

Compromise OpenClaw with Prompt Injections in Message Objects | Imperva
