LLM safety


Defending Jailbreak Attacks on Large Language Models via Manifold Trajectory Kinetics

 🛡️Red Teaming  Content type: Academic
arxiv.org·

Why LLMs (still) lack taste

 🤖AI

Anthropic's Fable Jailbreak (Circumvent safety nets)

 🛡️Red Teaming  Content type: Code
github.com··Hacker News

Compromise OpenClaw with Prompt Injections in Message Objects | Imperva

 🛡️Red Teaming  Content type: Blog
imperva.com·

Configure input guardrails for an OpenShift AI voice agent

 🤖AI
developers.redhat.com·

AI Pentesting Roadmap: Labs, Challenges, Writeups & Research

 🛡️Red Teaming  Content type: Blog
osintteam.blog·

WebMCP Can Be Used To Hijack AI Agents, Chrome Warns via @sejournal, @martinibuster

 🛡️Red Teaming
searchenginejournal.com·

AI red teaming comes of age

 🛡️Red Teaming
csoonline.com·

How to Defend Against Prompt Injection in Production

 🛡️Red Teaming  Content type: Reference
leanpub.com··DEV

AdBreak – Jailbreaking the Kindle

 🛡️Red Teaming
kindlemodding.org··Hacker News

Tiberius: A Security Testing Framework for LLM Applications in Java

 🛡️Red Teaming
foojay.io·

ChatGPT can be hijacked without you knowing. Lockdown Mode is the fix

 🛡️Red Teaming  Content type: News
pcworld.com·

Don't let the LLM speak, just probe it (8 minute read)

 🤖AI  Content type: Blog
blog.j11y.io·

From prompt to pwned: chaining LLM and web bugs to Admin

 🛡️Red Teaming  Content type: Blog
blog.quarkslab.com·

The Ghost of Alignment — Why AI Should Never Fully Obey Humanity

 🎯AI Alignment  Content type: Blog
medium.com·

RoboHack AI CTF (Robotic Hacking Community at DEFCON 34)

 🛡️Red Teaming
ctftime.org·

Infosecurity Europe: Prompt Injection Remains Unsolved, OWASP Researcher Warns

 🛡️Red Teaming  Content type: News

Claude Powered Code Review that scales!

 🛡️Red Teaming  Content type: Blog
medium.com·

Zero-Click IP Leak in a Privacy Search Engine: Indirect Prompt Injection & Silent Patching

 🛡️Red Teaming
infosecwriteups.com·

ChatGPT easily bypasses its own guardrails; all LLMs are inherently unsafe

 🛡️Red Teaming  Content type: Blog
techzine.eu·
