World Models

Deterministic Policy Gradient for Learning Equilibrium in Time-Inconsistent Control Problems

🎮 RL
Content type: Academic
arxiv.org

Event-Driven Reinforcement Learning Enables Long-Horizon Control in Semiconductor Fabrication

🎮 RL
Content type: Academic
arxiv.org

Critic Architecture Matters: Dual vs. Unified Critics for Humanoid Loco-Manipulation

🎮 RL
Content type: Academic
arxiv.org

Architecture-Aware Reinforcement Learning Makes Sliding-Window Attention Competitive in Math Reasoning

🎯 Post-training
Content type: Academic
arxiv.org

On-sky demonstration of reinforcement learning for adaptive optics control

🎮 RL
Content type: Academic
arxiv.org

Reasoning or Memorization? Direction-Aware Diversity Exploration in LLM Reinforcement Learning

🎮 RL
Content type: Academic
arxiv.org

Representation Learning Enables Scalable Multitask Deep Reinforcement Learning

🎮 RL
Content type: Academic
arxiv.org

QnRL: Quantum-Native Reinforcement Learning

🎮 RL
Content type: Academic
arxiv.org

EEGDancer: Dynamic Emotion Latent Space Masked Modeling with Reinforcement Learning for EEG Continuous Emotion Prediction

🎮 RL
Content type: Academic
arxiv.org

Cooperative Long Rope Skipping via Multi-Agent Reinforcement Learning

🎮 RL
Content type: Academic
arxiv.org

RePAIR: Predictive Self-Supervised Representation Learning in Chess

🎮 RL
Content type: Academic
arxiv.org

Reinforcement Learning for Flow-Matching Policies with Density Transport

📊 ML
Content type: Academic
arxiv.org

PAWS: Preference Learning with Advantage-Weighted Segments

🎮 RL
Content type: Academic
arxiv.org

Self-Paced Curriculum Reinforcement Learning for Autonomous Superbike Racing in Simulation

🎮 RL
Content type: Academic
arxiv.org

Progress-SQL: Improving Reinforcement Learning for Text-to-SQL via Progressive Rewards

🎮 RL
Content type: Academic
arxiv.org

GIFT: LLM-Guided State-Reward Interface for Financial Reinforcement Learning

🎮 RL
Content type: Academic
arxiv.org

Improving Generalization and Data Efficiency with Diffusion in Offline Multi-agent RL

🎮 RL
Content type: Academic
arxiv.org

Offline Reinforcement Learning for Plasma Control in Nuclear Fusion: Codebase and Benchmark

🎮 RL
Content type: Academic
arxiv.org

Performance Variation in Deep Reinforcement Learning

🎮 RL
Content type: Academic
arxiv.org

HERO: Hindsight-Enhanced Reflection from Environment Observations for Agentic Self-Distillation

🎮 RL
Content type: Academic
arxiv.org