reinforcement learning
Reinforcement Learning and Optimal Control Book (RIP Dimitri Bertsekas)
operations research · Content type: Academic · web.mit.edu · 5 days ago · Hacker News
Agents Need Work Data: A Primer on RLWD, or Reinforcement Learning on Work Data
operations research · anjalishriva.com · 1 day ago · Hacker News
Measuring Embedding Drift: Why Hybrid Search Saves Stale Models.
linear programming · pub.towardsai.net · 20 hours ago
Propel: Breaking the Solver Bottleneck in Task-Generator RL
linear programming · vmax.ai · 3 hours ago · Hacker News
Why LLMs (still) lack taste
operations research · beyondtheprior.com · 2 days ago · Hacker News
How to Train Your Goblin
linear programming · goblins.mchen.workers.dev · 3 days ago · Hacker News
Researchers trained an open source AI search agent, Harness-1, that outperforms GPT-5.4 on recalling relevant information
operations research · venturebeat.com · 2 days ago · Hacker News
See, Act, Correct: three levers for working with a code agent
operations research · Content type: Blog · blog.owulveryck.info · 6 days ago · Hacker News
Agentic RL: Token-In, Token-Out Done Right
linear programming · qgallouedec-tito.hf.space · 1 day ago · Hacker News
NVIDIA Nemotron 3 Ultra Powers Faster, More Efficient Reasoning for Long-Running Agents
operations research · Content type: Blog · developer.nvidia.com · 6 days ago · Hacker News
AI-powered living business intelligence network
operations research · atlasforgex.com · 12 hours ago · Hacker News
I got so mad at poke(rogue)like that I trained a RL agent to beat it for me
linear programming · thiagolira.blot.im · 3 days ago · Hacker News
Beyond Dexterity: Why Contact May Define the Next Era of Robotics
operations research · Content type: Video, News · spectrum.ieee.org · 1 day ago · Hacker News
Memoirs of a Learning Machine: Autobiographical Self-Training and the Self-Training Gap
operations research · zenodo.org · 4 days ago · Hacker News
Stack Overflow didn't just help AI learn to code
Rust · zozo123.github.io · 3 days ago · Hacker News
Vibe Diaries: Training Nanochat
Rust · vibediary.dev · 2 days ago · Hacker News
The Effective Sample Size
operations research · alex.smola.org · 6 days ago · Hacker News
Nvidia Nemotron 3 Ultra
Rust · research.nvidia.com · 6 days ago · Hacker News
Apple's New AI Models Contain 'None' of Google's Gemini Assistant
linear programming · Content type: News · macrumors.com · 1 day ago · Hacker News
Arithmetic Pedagogy for Language Models
linear programming · Content type: Academic · arxiv.org · 6 days ago · Hacker News