🎮 RL
Specific
reinforcement learning, RLHF, reward model, policy gradient
Scoured 452 posts in 7.4 ms
Reasoning RL in 2026: GRPO, DPO, RLVR, Agentic PO & Beyond
💡 AI Reasoning
turingpost.com · 4 days ago
A Unifying Lens on Reward Uncertainty in RLHF
🧠 LLMs
Content type: Academic
arxiv.org · 2 days ago
How to Implement a Model-Free RL Algorithm: A Step-by-Step Guide
🕵️ AI Agents
Content type: Blog
ujangriswanto08.medium.com · 11 hours ago
Q-Learning (Reinforcement learning): Bellman Equation, Markov Decision Processes, Q-Values, and…
🦾 Robotics
Content type: Blog
medium.com · 2 days ago
I Got Tired of Rebuilding My Retro RL Projects
🦾 Robotics
Content type: Blog
medium.com · 8 hours ago
Reinforcement Learning and Optimal Control Book (RIP Dimitri Bertsekas)
🦾 Robotics
Content type: Academic
web.mit.edu · 6 days ago · Hacker News
Tracing Eval-Awareness Emergence Through Training of OLMo 3
💡 AI Reasoning
lesswrong.com · 1 day ago
Researchers develop AI-powered railway control system for efficient urban train operation
🕵️ AI Agents
techxplore.com · 1 day ago
Time-slip in AI sepsis models may inflate results, risking under- or overtreatment
🕵️ AI Agents
medicalxpress.com · 5 days ago
Less-relevant results
The week AI infrastructure crossed from a technology story to a financial one
🧠 LLMs
Content type: News
mlwhiz.com · 17 hours ago
Reinforcement-learning signals support dynamic adaptive control during language switching
🎭 Multimodal AI
Content type: Academic
nature.com · 1 day ago
SimarcLabs/pybullet-swarm-sim: Python framework for simulating drone swarms with PyBullet in seconds.
🕵️ AI Agents
Content type: Code
github.com · 4 days ago · r/opensource
Agents Need Work Data: A Primer on RLWD, or Reinforcement Learning on Work Data
🕵️ AI Agents
anjalishriva.com · 2 days ago · Hacker News
2026 FIVB Volleyball Women's Nations League in Nanjing: Poland beats Czech Republic 3-0
🕵️ AI Agents
ecns.cn · 6 days ago
Scale Robot Reinforcement Learning with NVIDIA Isaac Lab on Amazon SageMaker AI
🦾 Robotics
Content type: Blog
aws.amazon.com · 1 day ago
Deterministic Policy Gradient for Learning Equilibrium in Time-Inconsistent Control Problems
💹 AI in Finance
Content type: Academic
arxiv.org · 13 hours ago
Protest against ballot paper shortages enters 2nd day, demanding new election
🧠 LLMs
Content type: News
koreatimes.co.kr · 5 days ago · r/news
China women’s volleyball team finish Nations League leg on a high after opening defeat
👁️ VLMs
Content type: News
scmp.com · 2 days ago · r/SCMPauto
[NEW MODEL] SupraLabs just released Supra1.5-50M Base (Experimental)!
🔧 Tool Use
huggingface.co · 5 hours ago · r/LocalLLaMA
Semi-finalists confirmed in Secondary Schools Volleyball Competition
🦾 Robotics
cbc.bb · 1 day ago