World Models
world models, model-based AI, environment simulation
Scoured 355 posts in 7.2 ms
Deterministic Policy Gradient for Learning Equilibrium in Time-Inconsistent Control Problems
RL · Academic · arxiv.org · 12 hours ago
Event-Driven Reinforcement Learning Enables Long-Horizon Control in Semiconductor Fabrication
RL · Academic · arxiv.org · 1 day ago
Critic Architecture Matters: Dual vs. Unified Critics for Humanoid Loco-Manipulation
RL · Academic · arxiv.org · 12 hours ago
Architecture-Aware Reinforcement Learning Makes Sliding-Window Attention Competitive in Math Reasoning
Post-training · Academic · arxiv.org · 12 hours ago
On-sky demonstration of reinforcement learning for adaptive optics control
RL · Academic · arxiv.org · 1 day ago
Reasoning or Memorization? Direction-Aware Diversity Exploration in LLM Reinforcement Learning
RL · Academic · arxiv.org · 1 day ago
Representation Learning Enables Scalable Multitask Deep Reinforcement Learning
RL · Academic · arxiv.org · 6 days ago
QnRL: Quantum-Native Reinforcement Learning
RL · Academic · arxiv.org · 2 days ago
EEGDancer: Dynamic Emotion Latent Space Masked Modeling with Reinforcement Learning for EEG Continuous Emotion Prediction
RL · Academic · arxiv.org · 6 days ago
Cooperative Long Rope Skipping via Multi-Agent Reinforcement Learning
RL · Academic · arxiv.org · 2 days ago
RePAIR: Predictive Self-Supervised Representation Learning in Chess
RL · Academic · arxiv.org · 12 hours ago
Reinforcement Learning for Flow-Matching Policies with Density Transport
ML · Academic · arxiv.org · 2 days ago
PAWS: Preference Learning with Advantage-Weighted Segments
RL · Academic · arxiv.org · 12 hours ago
Self-Paced Curriculum Reinforcement Learning for Autonomous Superbike Racing in Simulation
RL · Academic · arxiv.org · 2 days ago
Progress-SQL: Improving Reinforcement Learning for Text-to-SQL via Progressive Rewards
RL · Academic · arxiv.org · 3 days ago
GIFT: LLM-Guided State-Reward Interface for Financial Reinforcement Learning
RL · Academic · arxiv.org · 2 days ago
Improving Generalization and Data Efficiency with Diffusion in Offline Multi-agent RL
RL · Academic · arxiv.org · 12 hours ago
Offline Reinforcement Learning for Plasma Control in Nuclear Fusion: Codebase and Benchmark
RL · Academic · arxiv.org · 2 days ago
Performance Variation in Deep Reinforcement Learning
RL · Academic · arxiv.org · 3 days ago
HERO: Hindsight-Enhanced Reflection from Environment Observations for Agentic Self-Distillation
RL · Academic · arxiv.org · 12 hours ago