Reinforcement Learning
Q-Learning, Policy Gradient, RL Agents, Game AI
Scoured 33 posts in 22.2 ms
🤖 AI · musicallyut.xyz · 6 days ago
The Mote in AI's Eye: software engineering with agents
Discussed on Hacker News
🤖 Machine Learning · alignment.openai.com · 2 days ago
Reinforcement learning towards broadly and persistently beneficial models
Covers: Introducing ChatGPT Health
Covered by 6 sources, including The Decoder, tldr.tech
Discussed on Hacker News
🤖 AI · fareedkhan-dev.github.io · 11 hours ago
Train LLM from Scratch
Discussed on Hacker News
🤖 AI · theregister · 1 day ago
Why Amazon hates 'human-in-the-loop' AI governance
Covered by TNW | Data-Security
Discussed on Hacker News
🤖 AI · huggingface.co · 6 days ago
FastContext-1.0-4B-SFT: lightweight repository-exploration subagent
Discussed on Hacker News
🤖 AI · brightray.ai · 3 days ago
Built Uber aggregator that tracks top AI researchers and leaders
Discussed on Hacker News
🔥 PyTorch · humanoidsdata.com · 2 days ago
Comparison of simulation environments for robot training data
Discussed on Hacker News
🤖 Machine Learning · GitHub · 5 days ago
GoLongRL: Capability-Oriented Long Context RL with Multitask Alignment
Discussed on Hacker News
🤖 AI · arxiv.org · 5 days ago
Greed Is Learned: Visible Incentives as Reward-Hacking Triggers
Discussed on Hacker News
🤖 Machine Learning · technotes.substack.com · 1 day ago
Taste and judgement are lies we tell ourselves
Discussed on Substack
🔥 PyTorch · runtimewire.com · 4 days ago
Cursor Says 1.5T Parameter Coding Model Is Training on 100k GPUs
Covers 3 stories, including: Do you respect 'Vibe Coders'? Can you actually call them devs?
Discussed on Hacker News
🤖 Machine Learning · mukulsingh105.github.io · 1 day ago
Knowledge workers don't need frontier models
Covers 5 stories, including: Building a hill-climbing machine: Launching seven new MAI models
Discussed on Hacker News
🤖 AI · scalingintelligence.stanford.edu · 6 days ago
Toward Better Hip Kernel Generation for AMD GPUs
Covers: KernelBench: Can LLMs Write Efficient GPU Kernels?
Discussed on Hacker News
🤖 AI · people.idsia.ch · 1 day ago
Munich 1991: The Roots of the Current AI Boom
Covers 2 stories, including: DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Discussed on Hacker News
🤖 AI · shanethegamer.com · 5 days ago
They made a Pokemon TCG AI Battle Challenge with a $290k prize pool
Discussed on Hacker News
🤖 AI · day1training.com · 4 days ago
Distributed AI on AWS
Discussed on Hacker News
🤖 AI · castform.com · 3 days ago
I post-trained a model to reliably roll a die
Discussed on Hacker News
🤖 AI · adaptivesoftware.substack.com · 3 days ago
The Artificial Life Lesson: Forty Years of Digital Evolution Research
Discussed on Substack
🤖 AI · blog.cloudflare.com · 5 days ago
Growing the Cloudflare AI Team with Talent from Ensemble AI
Discussed on Hacker News
🤖 AI · notas.grod.es · 5 days ago
The Rain Spell
Covers 2 stories, including: Opencode – open-source alternative to Claude Code
Covered by blog.grod.es
Discussed on Hacker News