🎯 Reinforcement Learning
Q-learning, Policy Gradient, Reward Functions, TD Learning
Scoured 136 posts in 7.3 ms
Policy Gradient for Continuous-Time Robust Markov Decision Processes
🎯 Predictive Coding
Content type: Academic
arxiv.org · 6 days ago
Q-Learning (Reinforcement learning): Bellman Equation, Markov Decision Processes, Q-Values, and…
🔄 Meta-Learning
Content type: Blog
medium.com · 1 day ago
Researchers develop AI-powered railway control system for efficient urban train operation
🤖 Machine Learning
techxplore.com · 7 hours ago
OpenEnv is now owned by HF, Torch, Prime Intellect, Unsloth, Modal, Mercor, and more! Use it for training agents.
🤖 Machine Learning
Content type: Blog
huggingface.co · 2 days ago · Hacker News, r/LocalLLaMA
SimarcLabs/pybullet-swarm-sim: Python framework for simulating drone swarms with PyBullet in seconds.
🦾 Robotics
Content type: Code
github.com · 3 days ago · r/opensource
Scale Robot Reinforcement Learning with NVIDIA Isaac Lab on Amazon SageMaker AI
🔄 Meta-Learning
Content type: Blog
aws.amazon.com · 1 day ago
Semi-finalists confirmed in Secondary Schools Volleyball Competition
📡 Wnt Signaling
cbc.bb · 22 hours ago
See, Act, Correct: three levers for working with a code agent
🧠 Axon Guidance
Content type: Blog
blog.owulveryck.info · 6 days ago · Hacker News
Import AI 460: Reward hacking society, RSI data from Anthropic; and RL-based quadcopter racing
🧠 Neuromorphic Hardware
jack-clark.net · 2 days ago
Researchers trained an open source AI search agent, Harness-1, that outperforms GPT-5.4 on recalling relevant information
🤖 Machine Learning
venturebeat.com · 1 day ago · Hacker News
Reasoning RL in 2026: GRPO, DPO, RLVR, Agentic PO & Beyond
🧠 Neuromorphic Hardware
turingpost.com · 3 days ago
Geometrically Averaged Hard Target Updates for Linear Q-Learning
🔄 Meta-Learning
Content type: Academic
arxiv.org · 16 hours ago
Breaking free of a single datacenter: Practical geo-distributed AI operations with the k0smos platforms
🤖 Machine Learning
Content type: Blog
cncf.io · 2 days ago
How to Train Your Goblin
🔄 Meta-Learning
goblins.mchen.workers.dev · 3 days ago · Hacker News
Some Interesting Papers on RLVR
🎯 Predictive Coding
lesswrong.com · 1 day ago
How to Stop Shipping Low-Quality RL Environments (with Examples)
🧠 Neuromorphic Hardware
Content type: News
latent.space · 5 days ago · Hacker News
DQN Tutorial - RL Summer School 2026
🔄 Meta-Learning
araffin.github.io · 1 day ago
Reinforcement Learning and Optimal Control Book (RIP Dimitri Bertsekas)
🤖 Machine Learning
Content type: Academic
web.mit.edu · 5 days ago · Hacker News
Fast and Highly Expressive Policy Learning for Offline Reinforcement Learning via Bootstrapped Flow Q-Learning
🔄 Meta-Learning
Content type: Academic
arxiv.org · 16 hours ago
Good teachers don’t cheat
🔄 Meta-Learning
Content type: Blog
jasonkena.github.io · 6 days ago · Hacker News