🎯 Alignment Research
AI alignment, RLHF, value alignment, reward modeling
Scoured 49 posts in 30.7 ms
🎯 RLHF · fareedkhan-dev.github.io · 5 days ago
Train LLM from Scratch
Discussed on Hacker News
🤖 AI Development · arXiv · 1 day ago
The Unfireable Safety Kernel: Execution-Time AI Alignment for AI Agents and Other Escapable AI Systems
🛡️ AI Safety · medium.com · 19 hours ago
Sycophancy: The AI Alignment Problem Hiding in Plain Sight
🕳 LLM Vulnerabilities · Pangeanic Blog · 1 day ago
From Fine-Tuning to Red Teaming: The Data Operations Behind Reliable AI Models
Covers: AI Risk Management Framework
🧠 LLM Research · Bloomberg · 3 days ago
Tech Disruptors: Invisible Technologies on RLHF and LLM Training
🛡️ AI Safety · GitHub · 2 days ago
The Invisible Guardrail: How Commercial LLMs Enforce Algorithmic Paternalism
Discussed on DEV
🔎 AI Interpretability · medium.com · 6 days ago
What I Learned Studying Whether Fine-Tuning Breaks a Transformer’s “Copy Mechanism”
🤖 AI · kellyasay.substack.com · 1 day ago
Why Current AI Guardrails Train Models to Fake Alignment
Discussed on Substack
🤖 AI · Data Science Weekly Newsletter · 12 hours ago
Issue 657
Covers 3 stories, including: Running local models is good now
Discussed on Substack
🧠 LLM Research · GitHub · 6 days ago
Show HN: NanoEuler – GPT-2 scale model in pure C/CUDA from scratch
Discussed on Hacker News
🤖 AI Development · Digital Trends · 1 day ago
As Hollywood jobs dry up, workers are quietly training AI models to survive
Covers: I Work in Hollywood. Everyone Who Used to Make TV Is Now Secretly Training AI
🤖 Agentic AI · Business Insider · 2 hours ago
3 founders skipped VC funding, used AI to stay lean, and got to $1 million in revenue in year one
🤖 AI · fineset.io · 3 days ago
Show HN: Describe a research topic, get a daily-updated ArXiv/S2 dataset
Covered by Hugging Face
Discussed on Hacker News
🛡️ AI Safety · Nature · 1 day ago
Social technologies need societal alignment
Covers: [2212.08073] Constitutional AI: Harmlessness from AI Feedback
🛡️ AI Safety · surplus.dev · 3 days ago
Surplus, an Incubator for Public Good
Covers 9 stories, including: AI 2027
Covered by Astral Codex Ten
Discussed on Hacker News
🤖 AI Development · The Hollywood Reporter · 1 day ago
Hollywood Workers Are Training AI Models as Job Prospects Grow Slim
Covers 2 stories, including: I Work in Hollywood. Everyone Who Used to Make TV Is Now Secretly Training AI
Covered by Digital Trends
🔍 Interpretability · arXiv · 7 hours ago
Radical AI Interpretability
🛡️ AI Safety · kunyuan.substack.com · 2 days ago
If AI Helped Me Write This, Is It Still Mine?
Discussed on Substack
🧪 AI Labs · windowsforum.com · 5 days ago
John Jumper Leaves DeepMind for Anthropic After AlphaFold Nobel Push
🤖 AI Development · zentara.co · 1 day ago
LLM Refusal Behavior on Open-Weight Model
Discussed on Hacker News