Reinforcement Learning

Beyond Dexterity: Why Contact May Define the Next Era of Robotics

Robotics | Content type: Video, News

AI model predicts building fire spread, redirecting evacuees to safer exits in real time

AI Agents
techxplore.com | Hacker News

Improving Generalization and Data Efficiency with Diffusion in Offline Multi-agent RL

Deep Learning | Content type: Academic
arxiv.org

Agentic RL: Token-In, Token-Out Done Right

Tokenization

Why Robotics Is a Pre-Paradigm Field

Machine Learning | Content type: News

Reinforcement Learning Disrupts Gradient-Based Adversarial Optimization

PyTorch | Content type: Academic
arxiv.org

CCKS: Consensus-based Communication and Knowledge Sharing

Knowledge Management | Content type: Academic
arxiv.org

Stack Overflow didn't just help AI learn to code

LLM

Fast and Highly Expressive Policy Learning for Offline Reinforcement Learning via Bootstrapped Flow Q-Learning

Bootstrapping | Content type: Academic
arxiv.org

OpenEnv is now owned by HF, Torch, Prime Intellect, Unsloth, Modal, Mercor, and more! Use it for training agents.

Machine Learning | Content type: Blog

Apple's New AI Models Contain 'None' of Google's Gemini Assistant

Obsidian | Content type: News
macrumors.com | Hacker News

Space-sampled Value Decay: Forgetting Mechanisms for Non-stationary Deep Reinforcement Learning

PyTorch | Content type: Academic
arxiv.org

Geometrically Averaged Hard Target Updates for Linear Q-Learning

Optimization | Content type: Academic
arxiv.org

Inside soccer’s data renaissance

Data science | Content type: News
technologyreview.com | Hacker News

Dmsh: A Multi-Agent Reinforcement Learning Framework for All-Quad Mesh Generation

Optimization | Content type: Academic
arxiv.org

Vibe Diaries: Training Nanochat

Tokenization
vibediary.dev | Hacker News

INFRAMIND: Infrastructure-Aware Multi-Agent Orchestration

LLM Inference | Content type: Academic
arxiv.org

Geometry-Aware Reinforcement Learning for 2D Irregular Nesting

PyTorch | Content type: Academic
arxiv.org

gaelazzo/python_chess: Chess trainer

Fine-tuning | Content type: Code
github.com | Hacker News

IAPO: Input Attribution-Aware Policy Optimization for Tool Use in Small Multimodal Agents

AI Agents | Content type: Academic
arxiv.org
