Scour
AI Safety
Specific
AI alignment, safety, responsible AI, AGI risk
Scoured 304 posts in 13.2 ms
Assessing the Polyglot Chatbot: Multilingual Safety in AI Systems
🤖 AI · cdt.org · 22 hours ago
Cheap Reward Hacking Detection
🕵️ Vulnerability Research · Content type: Academic · arxiv.org · 1 day ago · Hacker News
From oversight to coercion: How authoritarian governments are twisting AI safety to get tech companies to fall in line
🤖 AI · theconversation.com · 6 days ago
The Stoic Path to Actual AI Safety: Three Practical Steps for Industry and Individuals
🤖 AI · oodaloop.com · 2 days ago
The technical community can't be the main character in AI safety anymore
🔐 Security · substackcdn.com · 3 days ago · Substack
Germany's National Security Council greenlights an AI Safety Institute modeled after the UK's AISI
🔐 Security · the-decoder.com · 7 hours ago
Import AI 460: Reward hacking society, RSI data from Anthropic; and RL-based quadcopter racing
🤖 AI · Content type: News, Blog · importai.substack.com · 2 days ago · Substack
AI Scientist Bengio on Engineering Safer Agents
🤖 AI · Content type: News · bloomberg.com · 5 days ago
Clearing Up The Confusion About What Anthropic Really Said On Globally Pausing The Unrelenting Race Toward AI That Builds AI
🤖 AI · forbes.com · 2 days ago
new mantra just dropped
🤖 AI · aphie.xyz · 7 hours ago
Complex Objects: Why AI Safety Can’t Just Think in Posts
🤖 AI · Content type: Blog · medium.com · 5 days ago
Paving the way for agents in biology
🧬 Biohacking · anthropic.com · 2 days ago · Hacker News
AI Scientist Bengio: Building Systems We Don't Know How to Control
🤖 AI · Content type: News · bloomberg.com · 5 days ago
Anthropic's Model Naming, Extrapolated
🤖 AI · samwilkinson.io · 22 hours ago · Hacker News
I Started an AI Safety Research Org and Think These 7 Things Matter
🚀 Startup Strategy · lesswrong.com · 4 hours ago
Proxy Reward Internalization and Mechanistic Exploitation: A Learned Precursor to Reward Hacking and Its Generalization
✍️ Prompt Engineering · Content type: Academic · arxiv.org · 1 day ago
What Will Canada’s AI Strategy Mean for Jobs and Safety?
🚀 Startup Strategy · Content type: News · thetyee.ca · 5 days ago
KiloBench - Because Your Benchmark Score Doesn't Pay the Bill
✍️ Prompt Engineering · Content type: News, Blog · blog.kilo.ai · 2 days ago
AI Safety — Genuine or Performative?
🤖 AI · Content type: Blog · medium.com · 4 days ago
In policy paper, OpenAI diverges from White House on AI safety
🤖 AI · siliconangle.com · 6 days ago