🎮 Reinforcement Learning
Q-Learning, Policy Gradient, RL Agents, Game AI
Scoured 31 posts in 25.9 ms
🤖 AI · musicallyut.xyz · 6 days ago
The Mote in AI's Eye: software engineering with agents
Discussed on Hacker News
🤖 AI · fareedkhan-dev.github.io · 1 day ago
Train LLM from Scratch
Discussed on Hacker News
🤖 AI · sakana.ai · 3 hours ago
Sakana Fugu
Discussed on Hacker News
🤖 AI · theregister · 1 day ago
Why Amazon hates 'human-in-the-loop' AI governance
Covered by naked capitalism, TNW | Data-Security
Discussed on Hacker News
🤖 AI · brightray.ai · 4 days ago
Built Uber aggregator that tracks top AI researchers and leaders
Discussed on Hacker News
🤖 Machine Learning · alignment.openai.com · 3 days ago
Reinforcement learning towards broadly and persistently beneficial models
Covers "Introducing ChatGPT Health"
Covered by 6 sources, including The Decoder, tldr.tech
Discussed on Hacker News
🤖 Machine Learning · technotes.substack.com · 2 days ago
Taste and judgement are lies we tell ourselves
Discussed on Substack
🤖 Machine Learning · GitHub · 5 days ago
GoLongRL: Capability-Oriented Long Context RL with Multitask Alignment
Discussed on Hacker News
🤖 AI · arxiv.org · 6 days ago
Greed Is Learned: Visible Incentives as Reward-Hacking Triggers
Discussed on Hacker News
🤖 Machine Learning · mukulsingh105.github.io · 2 days ago
Knowledge workers don't need frontier models
Covers 5 stories, including "Building a hill-climbing machine: Launching seven new MAI models"
Discussed on Hacker News
🔥 PyTorch · runtimewire.com · 5 days ago
Cursor Says 1.5T Parameter Coding Model Is Training on 100k GPUs
Covers 3 stories, including "Do you respect 'Vibe Coders'? Can you actually call them devs?"
Discussed on Hacker News
🤖 AI · people.idsia.ch · 2 days ago
Munich 1991: The Roots of the Current AI Boom
Covers 2 stories, including "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning"
Discussed on Hacker News
🤖 AI · shanethegamer.com · 5 days ago
They made a Pokemon TCG AI Battle Challenge with a $290k prize pool
Discussed on Hacker News
🤖 AI · day1training.com · 4 days ago
Distributed AI on AWS
Discussed on Hacker News
🤖 AI · castform.com · 4 days ago
I post-trained a model to reliably roll a die
Discussed on Hacker News
🤖 AI · adaptivesoftware.substack.com · 3 days ago
The Artificial Life Lesson: Forty Years of Digital Evolution Research
Discussed on Substack
🤖 AI · blog.cloudflare.com · 6 days ago
Growing the Cloudflare AI Team with Talent from Ensemble AI
Discussed on Hacker News
🤖 AI · notas.grod.es · 6 days ago
The Rain Spell
Covers 2 stories, including "Opencode – open-source alternative to Claude Code"
Covered by blog.grod.es
Discussed on Hacker News
🤖 AI · Odyssey · 4 days ago
Odyssey $310M Fundraise to Accelerate World Simulation
Covered by 5 sources, including therundown.ai, lebigdata.fr
Discussed on Hacker News
⚙ Mechanical Engineering · IEEE Spectrum · 4 days ago
Smarter Charging, An AI controller treats batteries differently as they age
Discussed on Hacker News