Reinforcement Learning
Scoured 42 posts in 11.5 ms
See, Act, Correct: three levers for working with a code agent
🤖 AI agents · Content type: Blog · blog.owulveryck.info · 6 days ago · Hacker News
Agents Need Work Data: A Primer on RLWD, or Reinforcement Learning on Work Data
🤖 AI agents · anjalishriva.com · 1 day ago · Hacker News
Agentic RL: Token-In, Token-Out Done Right
🤖 AI agents · qgallouedec-tito.hf.space · 22 hours ago · Hacker News
Researchers trained an open source AI search agent, Harness-1, that outperforms GPT-5.4 on recalling relevant information
🤖 AI agents · venturebeat.com · 1 day ago · Hacker News
How to Train Your Goblin
✍️ Prompt Engineering · goblins.mchen.workers.dev · 3 days ago · Hacker News
OpenEnv is now owned by HF, Torch, Prime Intellect, Unsloth, Modal, Mercor, and more! Use it for training agents.
🤖 AI agents · Content type: Blog · huggingface.co · 2 days ago · Hacker News, r/LocalLLaMA
Good teachers don’t cheat
📡 Information Theory · Content type: Blog · jasonkena.github.io · 6 days ago · Hacker News
I got so mad at poke(rogue)like that I trained a RL agent to beat it for me
📱 Edge AI · thiagolira.blot.im · 2 days ago · Hacker News
NVIDIA Nemotron 3 Ultra Powers Faster, More Efficient Reasoning for Long-Running Agents
🤖 AI agents · Content type: Blog · developer.nvidia.com · 6 days ago · Hacker News
Of Termites & Tokens
🤖 AI agents · tomcritchlow.com · 2 days ago · Hacker News
Reinforcement Learning and Optimal Control Book (RIP Dimitri Bertsekas)
📊 Algorithms · Content type: Academic · web.mit.edu · 5 days ago · Hacker News
Why LLMs (still) lack taste
🤖 Automation · beyondtheprior.com · 1 day ago · Hacker News
How to Stop Shipping Low-Quality RL Environments (with Examples)
🧬 biology · Content type: News · latent.space · 4 days ago · Hacker News
Alpha-RTL: Test-Time Training for RTL Hardware Optimization
🔌 FPGA · Content type: Academic · arxiv.org · 5 days ago
Memoirs of a Learning Machine: Autobiographical Self-Training and the Self-Training Gap
🤖 AI agents · zenodo.org · 3 days ago · Hacker News
The Exploit Always Wins
🤖 AI agents · Content type: Blog · abhishek-shankar.com · 5 days ago
A Functional Taxonomy of World Models – Fei Fei Li
🧬 biology · Content type: Blog · drfeifei.substack.com · 6 days ago · Substack
KJLdefeated/RL.cu: RLVR training for LLM in CUDA/C++
🔥 PyTorch · Content type: Code · github.com · 3 days ago · Hacker News
Why Robotics Is a Pre-Paradigm Field
🤖 Swarm Robotics · Content type: News · whattotelltherobot.com · 3 days ago · Hacker News
I got so mad at poke(rogue)like that I trained a RL agent to beat it for me
📱 Edge AI · Content type: Blog · blog.thiagolira.com.br · 6 days ago · Hacker News