Scour
🎮 Reinforcement Learning
RL, Agents, Policy Optimization, Reward Functions
Scoured 82005 posts in 205.6 ms
RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System
arxiv.org · 1d · 💬 LLM
**Abstract:** This paper presents a novel approach to dynamic task allocation in multi-robot systems leveraging multi-objective reinforcement learning (MORL)...
freederia.com · 20h · 🤖 AI
Distributed Reinforcement Learning for Scalable High-Performance Policy Optimization
towardsdatascience.com · 2d · 🔥 PyTorch
Adaptive Rollout Allocation for Online Reinforcement Learning with Verifiable Rewards
arxiv.org · 1d · 💬 LLM
Agentic AI - Building Intelligent Agents
dev.to · 11h · Discuss: DEV · 🤖 AI
Massively Parallel Methods for Deep Reinforcement Learning
dev.to · 2d · Discuss: DEV · 🔥 PyTorch
Self-Optimizing Football Chatbot Guided by Domain Experts on Databricks
databricks.com · 11h · 💬 LLM
**Abstract:** This paper proposes a novel approach for optimizing compiler performance in resource-constrained embedded systems using Reinforcement Learning ...
freederia.com · 1d · 🤖 Machine Learning
Thoughts on Toby Ords' AI Scaling Series
lesswrong.com · 4h · 🤖 Machine Learning
Swe-Replay Achieves 17.4% Performance Gain With Efficient Test-Time Scaling For Agents
quantumzeitgeist.com · 17h · 💬 LLM
SDPO: Reinforcement Learning via Self-Distillation
self-distillation.github.io · 2d · Discuss: r/LocalLLaMA · 🤖 Machine Learning
Axiomeer – An open marketplace for AI agents
news.ycombinator.com · 1d · Discuss: Hacker News · 🤖 AI
Specification-Guided Reinforcement Learning
cacm.acm.org · 6d · 🤖 AI
Context Engineering & Agent Memory Platform for AI Agents
getzep.com · 5h · 🤖 AI
The Dual Pillars of Embodied Autonomy: A Technical Deep Dive into Language-Action Models and…
pub.towardsai.net · 1h · 💬 LLM
The 3Cs: A Framework for AI Agent Security
docker.com · 3h · 🤖 AI
The Gumbel-Max Trick
blog.quipu-strands.com · 11h · Discuss: Hacker News · 🤖 AI
Selection Rather Than Prediction
voratiq.com · 1d · Discuss: Hacker News · 🤖 AI
Routing in a Sparse Graph: a Distributed Q-Learning Approach
towardsdatascience.com · 12h · 🤖 AI
Stop building systems for agents, build systems for human
blog.xiangpeng.systems · 1d · Discuss: Hacker News · 💬 LLM