AI Safety

Alignment Research, Model Robustness, Adversarial Examples, Risk Assessment

Advanced AI Safety Addendum

 🔤Type Systems

AdBreak – Jailbreaking the Kindle

 🔤Type Systems

teia-igo-vs-claude-opus-4.8/README.en.md at main · joseteiadirector/teia-igo-vs-claude-opus-4.8

 🔤Type Systems  Content type: Code
github.com··Hacker News

Securing AI Systems: Red Teaming, Prompt Injection, and Adversarial Testing

 🏛️Software Architecture  Content type: Blog
dev.to··DEV

Claude Fable 5 and new AI safety fables

 Algebraic Effects  Content type: News

Making Claude a chemist

 Algebraic Effects

Anthropic Walks Back Policy That Could Have ‘Sabotaged’ AI Researchers Using Claude

 Algebraic Effects  Content type: News

Matador-og/huntbot: AI offensive security harness for bug bounty, pentesting, red teaming.

 🌐WebAssembly  Content type: Code
github.com··Hacker News

The Meta hack shows there’s more to AI security than Mythos

 🔤Type Systems  Content type: News

Testing Camouflage Against the Real Adversary: an AI

 🏛️Software Architecture  Content type: Blog
dev.to··DEV

Anthropic urges ‘temporary pause’ on AI development to discuss risks

 Algebraic Effects  Content type: News

Why LLMs (still) lack taste

 🌐WebAssembly

Less-relevant results

Canada proposes teen social media ban - with workaround for tech firms

 💻programming  Content type: Video  Content type: News
bbc.com··Hacker News

Drift Protocol $285M Exploit - North Korean APT Attack on Solana

 🔤Type Systems
qanzhi111.github.io··DEV

Meta’s AI Support Hack Is a Warning for Every Team Automating User Access

 Algebraic Effects  Content type: Discussion
langprotect.com··DEV

Is the Space Pope Reptilian?

 💿Operating Systems  Content type: News
tearsinrain.ai··Hacker News

Mankirat47/Dao-Heart-v3.14: Dao Heart v3.14 : a bounded symbolic AI value governance research scaffold for studying value drift, oversight, warmth preservation, and identity stability under pressure.

 🐍python  Content type: Code
github.com··Hacker News

Stack Overflow didn't just help AI learn to code

 🔤Type Systems

Paving the way for agents in biology

 💻programming
anthropic.com··Hacker News

Data retention practices for Mythos-class models | Claude Help Center

 📁File Systems
