🎯 Reinforcement Learning
Q-learning, Policy Gradient, Reward Functions, TD Learning
Scoured 135077 posts in 43.2 ms
Using Reinforcement Learning to Solve Real-World Problems
pub.towardsai.net · 13h
📊 Dynamic Programming
Operator-Theoretic Foundations and Policy Gradient Methods for General MDPs with Unbounded Costs
arxiv.org · 3d
📊 Dynamic Programming
How Weak Agents Make Strong Agents Stronger
kooexperience.com · 4h · Discuss: DEV
💬 Prompt Engineering
Discounted Beta-Bernoulli Reward Estimation for Sample-Efficient Reinforcement Learning with Verifiable Rewards
arxiv.org · 2d
🎲 Deterministic Simulation
Exploration and Exploitation: The Simple Yet Profound Logic at the Heart of Reinforcement Learning
pub.towardsai.net · 1d
📊 Dynamic Programming
The BMAD Framework: Advanced AI Agents for Software Development and Beyond
youtube.com · 9h
🎭 Program Synthesis
Precise Manipulation with Efficient Online RL
pi.website · 3d · Discuss: Hacker News
🤖 Robotics
Best practices for building declarative agents in Microsoft 365 Copilot
learn.microsoft.com · 10h
⏱️ Temporal Workflow
Exploring reinforcement learning for a self-balancing robot
blog.adafruit.com · 4d
🤖 Robotics
Goldilocks RL: Tuning Task Difficulty to Escape Sparse Rewards for Reasoning
machinelearning.apple.com · 3d
📱 Edge AI
Explainable Causal Reinforcement Learning for circular manufacturing supply chains in carbon-negative infrastructure
dev.to · 22h · Discuss: DEV
⚡ LMAX Disruptor
The Future of Aligning Deep Learning systems will probably look like "training on interp"
lesswrong.com · 1d
📱 Edge AI
Scalable learning of macroscopic stochastic dynamics
link.aps.org · 1d
⚡ Incremental Computation
Agents (or Humans) in Goal-Directed and Goalless Environments
changkun.substack.com · 20h · Discuss: Substack
⚓ Anchors 🔗
Effective context engineering for AI agents
yellowduck.be · 10h
💬 Prompt Engineering
Why AI Systems Don't Learn — And What Agents Should Do About It
bemiagent.com · 7h
💬 Prompt Engineering
Modeling ballistic magnetization reversals via spin-orbit torques by reinforcement learning
link.aps.org · 5d
🔲 Cellular Automata
CTI-REALM: A new benchmark for end-to-end detection rule generation with AI agents
microsoft.com · 2d
🛡️ AI Security
Jensen Huang proposes a compensation model where engineers receive an AI token budget on top of their base salary, to deploy agents as productivity multipliers ...
techmeme.com · 1d
💬 Prompt Engineering
Defining AI Safety Paradigms: Constitutional AI and RLHF
adiyogiarts.com · 1d · Discuss: DEV
💬 Prompt Engineering