🛡️ AI Safety
AI alignment, safety, responsible AI, AGI risk
Scoured 292 posts in 11.5 ms
Sixteen schemes for AI safety
🧠 AI Research
lesswrong.com · 6 days ago
My Oslo Freedom Forum Keynote: Authoritarians and AI
🤖 AI
Content type: Blog
redpacket.substack.com · 1 day ago · Substack
Advanced AI Safety Addendum
🤖 AI
cloud.google.com · 15 hours ago · Hacker News
[Recorded talk] "AI Alignment Versus AI Ethical Treatment: 10 Challenges"
🤖 AI
Content type: Blog
meditationsondigitalminds.substack.com · 23 hours ago · Substack
Reward Hacking, The Loophole Lesson: Winning the Signal, Losing the Reason
🧬 Biohacking
Content type: Blog
medium.com · 2 days ago
Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning
🧬 Biohacking
Content type: Academic
arxiv.org · 6 days ago
Germany to create AI safety agency
🔐 Security
techxplore.com · 1 day ago
Less-relevant results
The Best Politician In A Generation
🤖 AI
Content type: News, Blog
benthams.substack.com · 18 hours ago · Substack
Mechanistic Interpretability: The Key to Trusting Agentic AI
🤖 AI
Content type: Discussion
bradenkelley.com · 4 days ago
Claude Fable 5 and new AI safety fables
🔐 Security
Content type: News
interconnects.ai · 11 hours ago · Hacker News
Criti-hyping is the best thing that happened to Big Tech
☁️ SaaS
reveriesofahuman.com · 1 day ago
Model Evaluations: Prove Your Routing Policy Actually Works
✍️ Prompt Engineering
Content type: Blog
digitalocean.com · 5 days ago
Assessing the Polyglot Chatbot: Multilingual Safety in AI Systems
🤖 AI
cdt.org · 14 hours ago
Leaderboard Integrity Update at terminal-bench
🕵️ Vulnerability Research
tbench.ai · 6 days ago · Hacker News
The Stoic Path to Actual AI Safety: Three Practical Steps for Industry and Individuals
🤖 AI
oodaloop.com · 1 day ago
OpenAI says it will comply with Trump's order to let the government review AI models before release
🤖 AI
qz.com · 4 days ago
Import AI 460: Reward hacking society, RSI data from Anthropic; and RL-based quadcopter racing
🤖 AI
Content type: News, Blog
importai.substack.com · 1 day ago · Substack
Iliad is Hiring
✍️ Prompt Engineering
lesswrong.com · 2 days ago
Aaronontheweb/dotnet-slopwatch: Catch naughty LLM reward-hacking and bad behavior for .NET coding
⚙️ MLOps
Content type: Code
github.com · 6 days ago · Hacker News
AI policy scholar Dean W. Ball shares a text from his mother recommending he focus on frontier AI policy
🤖 AI
digg.com · 6 days ago