AI Safety
Specific
alignment, AI risk, RLHF, model safety
Scoured 75 posts in 6.5 ms
Multilingual Sentiment Aware Text Summarization A Reinforcement Learning Approach for Consistency Maintenance
🧠
LLMs
Content type:
Academic
arxiv.org
·
1d
1 day ago
Less-relevant results
scMTG reconstructs single-cell temporal dynamics with Markov transition generators
🧠
LLMs
Content type:
Academic
biorxiv.org
·
3d
3 days ago
Neglected Basics of AI Alignment
🧠
LLMs
lesswrong.com
·
3d
3 days ago
Designer babies. Self-improving AI. Are we ready for either?
🔭
Tech Research
Content type:
News
vox.com
·
7h
7 hours ago
Op Ed: Consultant Tony O’Connor On The Agentic Trojan Horse
🤖
AI Agents
thecompanydime.com
·
2d
2 days ago
Who Elected Anthropic?
✍️
Prompt Engineering
Content type:
Blog
vizierprime.substack.com
·
6d
6 days ago
·
Substack
A Regret Minimization Framework on Preference Learning in Large Language Models
🧠
LLMs
Content type:
Academic
arxiv.org
·
1d
1 day ago
Coelho Mollo and Millière: The Vector Grounding Problem
✍️
Prompt Engineering
philosophyofbrains.com
·
5d
5 days ago
OpenClaw Won: How Big Tech Adopted the AI Agent
🤖
AI Agents
thelettertwo.com
·
2d
2 days ago
Representation-Aware Advantage Estimation: Your Reward Model Provides More Than A Scalar Output
🧠
LLMs
Content type:
Academic
arxiv.org
·
15h
15 hours ago
Iliad is Hiring
✍️
Prompt Engineering
lesswrong.com
·
3d
3 days ago
High Dynamic Range DIY Air Testing
✍️
Prompt Engineering
jefftk.com
·
1d
1 day ago
SecureBio Detection is Hiring Software Engineers
⚙️
Backend Dev
jefftk.com
·
5d
5 days ago
SLUUG Talk: Demystifying Large Language Models on Linux
🧠
LLMs
Content type:
Code
github.com
·
3d
3 days ago
·
DEV
VFUSE: Virulent Feature Understanding with Sparse autoEncoders
🧠
LLMs
Content type:
Academic
arxiv.org
·
15h
15 hours ago
Alignment Defends LLMs from Property Inference Attacks
🧠
LLMs
Content type:
Academic
arxiv.org
·
15h
15 hours ago
Trajectory Geometry of Transformer Representations Across Layers
🧠
LLMs
Content type:
Academic
arxiv.org
·
1d
1 day ago
Beyond Safety Through Filtering: Toward Responsible Training on Human Distress
🛠️
Developer Tools
Content type:
Blog
compliancearchitecture.substack.com
·
6d
6 days ago
·
r/OpenAI
When Attribution Patching Lies: Diagnosis and a Second-Order Correction
🧠
LLMs
Content type:
Academic
arxiv.org
·
15h
15 hours ago
Sixteen schemes for AI safety
🤖
AI Agents
lesswrong.com
·
6d
6 days ago