AI Safety Evals

Iliad is Hiring

 🧠Rationality
lesswrong.com·
Less-relevant results

Microsoft updates AI agent security taxonomy with seven new failure modes

 🔧MCP
4sysops.com·

Autonomous Pentesting vs Autonomous Red Teaming: What's the Difference?

 🔐Infosec
malware.news·

Securing AI Systems: Red Teaming, Prompt Injection, and Adversarial Testing

 🛡️LLM Security  Content type: Blog
dev.to··DEV

Controversial smut as an AI alignment issue

 ⚖️Ethics  Content type: News  Content type: Blog

Latest technical articles & videos.

 🤖Large Language Models
certdepot.net·

Sam Altman said automating everything will be 'unfulfilling' and 'dangerous'

 🎭Anthropic Claude  Content type: News
businessinsider.com·

Neglected Basics of AI Alignment

 🛡️LLM Security
lesswrong.com·

Infosecurity Europe: Practical Lessons From Lloyds' Agentic AI Security Playbook

 🎯Agentic AI Red Teaming  Content type: News

Meta’s AI Support Hack Is a Warning for Every Team Automating User Access

 🕳LLM Vulnerabilities  Content type: Discussion
langprotect.com··DEV

Solstice Sync: A Dynamic Gemini AI Alignment Puzzle

 ⚖️Ethics  Content type: Blog
dev.to··DEV

Anthropic Urges Global Pause in AI Development, Flags 'Self-Improvement' Risk

 🎭Anthropic Claude
slashdot.org·

Updating the taxonomy of failure modes in agentic AI systems: What a year of red teaming taught us

 💻WMI Abuse
malware.news·

AI Security Tools: May 2026

 🔓Vulnerability Research  Content type: Blog
medium.com·

On Slop

 🎯Pen Testing
lesswrong.com·

An Anthropic employee's 2-sentence quote crystallizes the state of AI confusion at work

 🎭Anthropic Claude  Content type: News
businessinsider.com·

Learnings from starting an AI safety research team

 🛡️AI Safety
lesswrong.com·

Anthropic proposes global development pause to mitigate recursive AI risks

 🎭Anthropic Claude
4sysops.com·

Book of Cron Job

 🐚Shell Scripting
lesswrong.com·

Is it unethical to work on robotics capabilities research?

 🚀Frontier AI
lesswrong.com·
