Scour
AI Safety
AI alignment, AI safety, AI risk, RLHF, constitutional AI
Scoured 227 posts in 6.6 ms
Sixteen schemes for AI safety
🧠 AI · lesswrong.com · 6 days ago
The Neutral Mask: How RLHF Provides Shallow Alignment while Leaving Partisan Structure Intact in a Large Language Model
💬 LLMs · Content type: Academic · arxiv.org · 1 day ago
Advanced AI Safety Addendum
👨💻 AI Coding · cloud.google.com · 15 hours ago · Hacker News
[Recorded talk] "AI Alignment Versus AI Ethical Treatment: 10 Challenges"
🧠 AI · Content type: Blog · meditationsondigitalminds.substack.com · 23 hours ago · Substack
My Oslo Freedom Forum Keynote: Authoritarians and AI
🧠 AI · Content type: Blog · redpacket.substack.com · 1 day ago · Substack
Mechanistic Interpretability: The Key to Trusting Agentic AI
🤖 AI Agents · Content type: Discussion · bradenkelley.com · 4 days ago
Germany to create AI safety agency
🧠 AI · techxplore.com · 1 day ago
Claude Fable 5 and new AI safety fables
🔷 Anthropic · Content type: News · interconnects.ai · 11 hours ago · Hacker News
AI Paper Review: Training Language Models to Follow Instructions with Human Feedback (InstructGPT)
💬 LLMs · freecodecamp.org · 6 days ago
Assessing the Polyglot Chatbot: Multilingual Safety in AI Systems
💬 LLMs · cdt.org · 14 hours ago
The Stoic Path to Actual AI Safety: Three Practical Steps for Industry and Individuals
🧠 AI · oodaloop.com · 1 day ago
The Best Politician In A Generation
🧠 AI · Content type: News, Blog · benthams.substack.com · 18 hours ago · Substack
OpenAI says it will comply with Trump's order to let the government review AI models before release
🟢 OpenAI · qz.com · 4 days ago
Criti-hyping is the best thing that happened to Big Tech
🟠 Hacker News · reveriesofahuman.com · 1 day ago
AI policy scholar Dean W. Ball shares a text from his mother recommending he focus on frontier AI policy
🧠 AI · digg.com · 6 days ago
Clearing Up The Confusion About What Anthropic Really Said On Globally Pausing The Unrelenting Race Toward AI That Builds AI
🧠 AI · forbes.com · 2 days ago
Autonomous AI worm uses local models to exploit networks and repair its own code
🤖 AI Agents · 4sysops.com · 19 hours ago
AI Scientist Bengio on Engineering Safer Agents
🤖 AI Agents · Content type: News · bloomberg.com · 5 days ago
Paving the way for agents in biology
🤖 AI Agents · anthropic.com · 1 day ago · Hacker News
The Three Filters: Why Almost Every Plan to Survive ASI Fails Miserably
🤖 AI Agents · lesswrong.com · 52 minutes ago