🎮 Reinforcement Learning
Specific
RL, reward functions, policy gradient, RLHF
Scoured 171822 posts in 27.1 ms
Reinforcement Learning From Scratch (Part 1) — Understanding the Agent–Environment Loop
medium.com · 1d
🤖 AI Agents
Discounted Beta-Bernoulli Reward Estimation for Sample-Efficient Reinforcement Learning with Verifiable Rewards
arxiv.org · 10h
♟️ Game Theory
Balancing the Reasoning Load: Difficulty-Differentiated Policy Optimization with Length Redistribution for Efficient and Robust Reinforcement Learning
arxiv.org · 10h
💬 LLMs
Exploring reinforcement learning for a self-balancing robot
blog.adafruit.com · 1d
🤖 AI Agents
State of RL for reasoning LLMs
aweers.de · 4d
📐 ML Theory
Goldilocks RL: Tuning Task Difficulty to Escape Sparse Rewards for Reasoning
machinelearning.apple.com · 1d
💬 LLMs
Explainable Causal Reinforcement Learning for precision oncology clinical workflows under real-time policy constraints
dev.to · 2d · Discuss: DEV
📐 ML Theory
LangGraph vs Temporal for AI Agents: Durable Execution Architecture Beyond For Loops
pub.towardsai.net · 22h
🤖 AI Agents
How AI Learned to Design Reward Functions Without Examples
vinitpahwa.medium.com · 1d
🤖 AI Agents
The Controller Problem in Bio-Hybrid Computing: Simulation & Protocol Update
medium.com · 19h
🤖 AI Agents
Dopamine GPS: Visual Guidance Beyond Reward
neurosciencenews.com · 18h
🧠 Cognitive Science
Modeling ballistic magnetization reversals via spin-orbit torques by reinforcement learning
link.aps.org · 2d
🤖 Machine Learning
ayushdnb/Neural-Abyss: Experimental platform for studying emergent behavior in large-scale multi-agent reinforcement learning environments with evolutionary dynamics, PPO training.
github.com · 2d · Discuss: Hacker News
🤖 AI Agents
AI Agents for Data Scientists: The Agent Loop - the Core Pattern Behind AI Agents
datascienceweekly.substack.com · 2d · Discuss: Substack
🤖 AI Agents
Reinforcement Learning for Robotics: A Comprehensive 2025 Guide | Abhishek Nair - Fractional CTO for Deep Tech & AI
padawanabhi.de · 4d · Discuss: DEV
🤖 AI Agents
Preventing Memory and Context Poisoning in AI Agents
levelup.gitconnected.com · 6h
🤖 AI Agents
Powering the agents: Workers AI now runs large models, starting with Kimi K2.5
blog.cloudflare.com · 14h · Discuss: Hacker News
🤖 AI Agents
What should we think about shard theory in light of chain-of-thought agents?
lesswrong.com · 1d
🤖 AI Agents
Claudeception: Inside the Mind of an Analytics Agent
motherduck.com · 13h
🤖 AI Agents
Understanding How AI Agents Work
dev.to · 22h · Discuss: DEV
🤖 AI Agents