Reinforcement Learning

Deep Reinforcement Learning for Adaptive Power Allocation in ISAC Systems with Mobile Target

 🛡️AI Security  Content type: Academic
arxiv.org·

Geometrically Averaged Hard Target Updates for Linear Q-Learning

 🛡️AI Security  Content type: Academic
arxiv.org·

Reinforcement learning in linear embedding space unlocks generalizable control across soft robot configurations

 🛡️AI Security  Content type: Academic
nature.com·

Phi-Actor-Critic: Steering General-Sum Games to Pareto-Efficient Correlated Equilibria

 🎮Game Theory  Content type: Academic
arxiv.org·

Fast and Highly Expressive Policy Learning for Offline Reinforcement Learning via Bootstrapped Flow Q-Learning

 🛡️AI Security  Content type: Academic
arxiv.org·

Improving Generalization and Data Efficiency with Diffusion in Offline Multi-agent RL

 🛡️AI Security  Content type: Academic
arxiv.org·

Reinforcement Learning Disrupts Gradient-Based Adversarial Optimization

 🛡️AI Security  Content type: Academic
arxiv.org·

Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

 🛡️AI Security  Content type: Academic
arxiv.org·

CCKS: Consensus-based Communication and Knowledge Sharing

 🎮Game Theory  Content type: Academic
arxiv.org·

Space-sampled Value Decay: Forgetting Mechanisms for Non-stationary Deep Reinforcement Learning

 🎮Game Theory  Content type: Academic
arxiv.org·

Variational Proximal Policy Optimization

 🛡️AI Security  Content type: Academic
arxiv.org·

IAPO: Input Attribution-Aware Policy Optimization for Tool Use in Small Multimodal Agents

 🛡️AI Security  Content type: Academic
arxiv.org·

The Neutral Mask: How RLHF Provides Shallow Alignment while Leaving Partisan Structure Intact in a Large Language Model

 🛡️AI Security  Content type: Academic
arxiv.org·

CFCamo: A Counterfactual Detect-or-Abstain Framework for Camouflaged Object Detection

 🛡️AI Security  Content type: Academic
arxiv.org·

UNIQ: Conformal Calibration for Adaptive Conservatism in Offline Reinforcement Learning

 🛡️AI Security  Content type: Academic
arxiv.org·

Organize then Retrieve: Hierarchical Memory Navigation for Efficient Agents

 🛡️AI Security  Content type: Academic
arxiv.org·

Path Planning Using Deep Deterministic Policy Gradient: A Reinforcement Learning Approach

 🛡️AI Security  Content type: Academic
arxiv.org·

INFRAMIND: Infrastructure-Aware Multi-Agent Orchestration

 🔧DevOps  Content type: Academic
arxiv.org·

A Unifying Lens on Reward Uncertainty in RLHF

 🎮Game Theory  Content type: Academic
arxiv.org·

DrivingAgent: Design and Scheduling Agents for Autonomous Driving Systems

 🏗️Systems Design  Content type: Academic
arxiv.org·
