Scour
🎮 Reinforcement Learning
Specific
RL, reward function, policy gradient, Q-learning, OpenAI Gym
Scoured 184078 posts in 18.7 ms
Dynamical Priors as a Training Objective in Reinforcement Learning
AI · arxiv.org · 6d
How to build custom reasoning agents with a fraction of the compute
LLMs · venturebeat.com · 1d
The Data Layer Tax for Robot Learning
LLMs · rerun.io · 5h · Hacker News
Ask HN: Anyone using AI agents for active learning sprints? Here's my setup
AI · news.ycombinator.com · 11h · Hacker News
Boiler combustion optimization via offline reinforcement learning with an ensemble high-dimensional environment
AI · sciencedirect.com · 2d
There Will Be a Scientific Theory of Deep Learning
AI · mail.bycloud.ai · 23h
How does Reinforcement Learning Affect Models
LLMs · lesswrong.com · 3d
Effective Personalized AI Tutors via LLM-Guided Reinforcement Learning by Angel Tsai-Hsuan Chung, Botong Zhang, Ling-Chieh Kung, Hamsa Bastani, Osbert Bastani :...
LLMs · papers.ssrn.com · 20h
Learning diverse natural behaviors for enhancing the agility of quadrupedal robots
LLMs · nature.com · 1d
Deep Learning Weekly: Issue 453
LLMs · deeplearningweekly.com · 3h
context-labs/HALO: Hierarchal Agent Loop Optimizer
LLMs · github.com · 19h · Hacker News
Constraints That Compute: A Unified Framework for Efficient Intelligence from Prime Harmonics to Latent Reasoning
AI · zenodo.org · 1h · Hacker News
DEEP Robotics
AI · youtube.com · 2d · r/singularity
On-Policy vs Off-Policy RL: PPO vs SAC on 5 Gymnasium Tasks
AI · tildalice.io · 4d
Best Cheap Open Source Models for Hermes Agent in 2026
AI · bitdoze.com · 18h
Wild parrots exhibit age-dependent conformity when learning about novel food
AI · journals.plos.org · 4h
Jaxpot: Train self-play RL agents FAST by parallelizing environments on GPU
LLMs · bardsai.substack.com · 2d · Substack
Inside Claude Code, OpenAI Codex, and HuggingFace's ML Engineer Agent
LLMs · newsletter.artofsaience.com · 5h
RL, in pictures and videos
AI · suriya.cc · 5d
The Policy Picks the Policy
LLMs · noise2signal.bearblog.dev · 2d