🛡️ AI Safety
AI alignment, AI risk, existential risk, responsible AI
Scoured 226 posts in 23.2 ms
🔬 Science · lesswrong.com · 2 days ago
Thoughts on Likelihood of Existential Risks by Misaligned AIs
🤖 LLMs · arxiv.org · 6 days ago
Reward Hacking in Language Model Agents: Revisiting AI Safety Gridworlds
🎮 Gamification · medium.com · 17 hours ago
Reward hacking in Reinforcement learning
⚖️ AI Regulation · E-International Relations · 9 hours ago
Interview – Andrea Miotti
Covers 2 stories, including: When AI Builds Itself
🔬 ROM Hacking · medium.com · 2 days ago
What I Learned Studying Whether Fine-Tuning Breaks a Transformer’s “Copy Mechanism”
⚖️ AI Governance · science.org · 3 days ago
Researchers caught in the crossfire as companies and government grapple over AI safety
🤖 AI · GitHub · 9 hours ago
Open source AI projects from Banco Santander
Discussed on Hacker News
⚖️ AI Governance · tehnologijaviews.medium.com · 2 days ago
Is the US Government’s Anthropic Ban Actually Helping the Brand? A Surprising Turn in AI Regulation
⚖️ AI Ethics · stevekinney.com · 3 days ago
Some Thoughts on AI Safety
Covers 10 stories, including: Goodhart's Law
Discussed on Hacker News
⚖️ AI Ethics · medium.com · 3 days ago
Ninety Percent of Physicians Trust Their Clinical AI. They Catch a Third of Its Dangerous Errors.
⚖️ AI Regulation · lesswrong.com · 16 hours ago
On revolutionary love in AI safety
🤖 AI · TechRadar · 6 days ago
'AI will probably most likely lead to the end of the world, but in the meantime, there’ll be great companies' - quote of the day by OpenAI CEO Sam Altman
Covers: Sam Altman May Control Our Future—Can He Be Trusted?
⚖️ AI Regulation · ft.com · 3 days ago
Letter: Argentina’s AI fix widens the gap it is meant to close
⚖️ AI Governance · CNBC · 4 days ago
Synthesia CEO: Creating a coalition around an AI code of conduct will help build the AI future we all want
⚖️ AI Governance · highcapacity.org · 6 days ago
Mythos, China, and a New Era of AI Regulation
🔭 Futurism · lesswrong.com · 1 day ago
The Cookie Monster Explains AI Safety
Covers 10 stories, including: Anthropic confidentially submits draft S-1 to the SEC
⚖️ AI Regulation · BetaKit · 5 days ago
AI experts want a middle-power coalition to set safety guardrails. Some think Canada could lead it
⚖️ AI Regulation · lesswrong.com · 9 hours ago
We made a map of the doom debate. Here's how the breakdown works.
⚖️ Tech Policy · tehnologijaviews.medium.com · 4 days ago
The Trump Administration’s Push to Block AI Jailbreaks: A Safety Measure or Political Theater?
📊 Statistics · arxiv.org · 6 days ago
Circuit Tracing in Autoregressive Protein Language Models