🎮 Reinforcement Learning
RL, AI Agents, Game Playing, Policy Optimization
Scoured 381 posts in 7.8 ms
Top AI Papers of the Week
LLM · Content type: News · nlp.elvissaravia.com · 3 days ago
2026 FIVB Volleyball Women's Nations League in Nanjing: Poland beats Czech Republic 3-0
AI · ecns.cn · 6 days ago
Model predictive task sampling for efficient and robust adaptation
✍️ Prompt Engineering · Content type: Academic · nature.com · 2 days ago
Training Deliberative Monitors for Black-Box Scheming Detection
🎓 RLHF · lesswrong.com · 6 days ago
Core Automation co-founder Jerry Tworek jokes that Nvidia's CUDA translates to miracles in Polish
💻 Cursor · digg.com · 6 days ago
Protest against ballot paper shortages enters 2nd day, demanding new election
🔍 RAG · Content type: News · koreatimes.co.kr · 5 days ago · r/news
Bridging Multi-Vector and Learned-Sparse Retrieval, A Diagnostic Framework for Robust Semantic IDs, and More!
LLM · Content type: News, Blog · recsys.substack.com · 5 days ago · Substack
Self-Paced Curriculum Reinforcement Learning for Autonomous Superbike Racing in Simulation
Agent · Content type: Academic · arxiv.org · 2 days ago
What is MBPO? A Beginner’s Guide to Efficient Reinforcement Learning
🎓 RLHF · Content type: Blog · ujangriswanto08.medium.com · 5 days ago
Comp.compilers: Paper: MileStone: A Multi-Objective Compiler Phase Ordering Framework for Graph-based IR-Level Optimization
✍️ Prompt Engineering · compilers.iecc.com · 5 days ago
Value representation in youth psychopathology: evidence of a transdiagnostic risk mechanism for psychosis
LLM · Content type: Academic · nature.com · 1 day ago
Discovering Interpretable Multi-Parameter Control Policies for Evolutionary Algorithms Using Deep Reinforcement Learning
🎓 RLHF · Content type: Academic · arxiv.org · 1 day ago
Google DeepMind's Susan Zhang argues abundant AI content shifts the premium from raw intelligence to human relationships and social dynamics
🎭 Anthropic Claude · Content type: News · digg.com · 3 days ago
LLM Research Papers: The 2026 List (January to May)
LLM · Content type: News · magazine.sebastianraschka.com · 4 days ago · Hacker News
A wild idea: Abstract reality using ontology
LLM · Content type: Discussion · news.ycombinator.com · 4 days ago · Hacker News
Combermere and Harrison College reach Under-15 basketball final
LLM · cbc.bb · 4 days ago
Towards End to End Motion Planning and Execution for Autonomous Underwater Vehicles Using Reinforcement Learning
🎓 RLHF · Content type: Academic · arxiv.org · 2 days ago
NAVER Expands AI Infrastructure With NVIDIA to Serve Surging Global AI Demand
Agent · nvidianews.nvidia.com · 3 days ago
Why Robotics Is a Pre-Paradigm Field
✍️ Prompt Engineering · Content type: News · whattotelltherobot.com · 4 days ago · Hacker News
Fast and Highly Expressive Policy Learning for Offline Reinforcement Learning via Bootstrapped Flow Q-Learning
🎓 RLHF · Content type: Academic · arxiv.org · 1 day ago