Reinforcement Learning

Self-Evolving Scientific Agent Discovers Generalizable Physically-Reasoned Fluid Control

 🤖AI  Content type: Academic
arxiv.org·

Drag reduction or reward hacking? Recurrent multi-agent reinforcement learning that earns its reward

 Automatic Differentiation  Content type: Academic
arxiv.org·

Reinforcement Learning for Flow-Matching Policies with Density Transport

 🤖AI  Content type: Academic
arxiv.org·

Agentic Monte Carlo: Simulating Reinforcement Learning for Black-Box Agents

 🎲Probability Theory  Content type: Academic
arxiv.org·

AliyunConsoleAgent: Training Web Agents in Real-World Cloud Environments via Distillation and Reinforcement Learning

 Automatic Differentiation  Content type: Academic
arxiv.org·

AgentJet: A Flexible Swarm Training Framework for Agentic Reinforcement Learning

 🔶TensorFlow  Content type: Academic
arxiv.org·

HARBOR: A Harness Framework for Agentic Robot Reinforcement Learning

 📊Optimization  Content type: Academic
arxiv.org·

Belief-Space Quantum-Inspired Reinforcement Learning for Partially Observable Autonomous Cyber Defense in the Internet of Vehicles

 🎲Probability Theory  Content type: Academic
arxiv.org·

Self-Distilled Policy Gradient

 📡Information Theory  Content type: Academic
arxiv.org·

MacArena: Benchmarking Computer Use Agents on an Online macOS Environment

 🗣️Large Language Models  Content type: Academic
arxiv.org·

On Advantage Estimates for Max@K Policy Gradients

 Automatic Differentiation  Content type: Academic
arxiv.org·

Reformulate LLM Reinforcement Learning for Efficient Training under Black-box Discrepancy

 🗣️Large Language Models  Content type: Academic
arxiv.org·

Towards On-Policy Data Evolution for Visual-Native Multimodal Deep Search Agents

 🧠Deep Learning  Content type: Academic
arxiv.org·

Self-Optimizing Control of Continuous Processes Based on Reinforcement Learning

 📊Optimization  Content type: Academic
arxiv.org·

TT-DAC-PS: Twin-Target Deterministic Actor-Critic with Policy Smoothing for Optimal Trade Execution

 Automatic Differentiation  Content type: Academic
arxiv.org·

Exact Unlearning in Reinforcement Learning

 🤖AI  Content type: Academic
arxiv.org·

StainFlow: Entity-Stain Tracking and Evidence Linking for Process Rewards in GUI Agents

 🕸️Graph Theory  Content type: Academic
arxiv.org·

Co-Evolving Skill Generation and Policy Optimization

 📊Optimization  Content type: Academic
arxiv.org·

Policy-Conditioned Counterfactual Credit for Verifiable Reinforcement Learning of Long-Horizon Language Agents

 Automatic Differentiation  Content type: Academic
arxiv.org·

Shape Formation for the Cooperative Transportation of Arbitrary Objects Using Multi-Agent Reinforcement Learning

 🌐Distributed Systems  Content type: Academic
arxiv.org·
