Scalable Offline Model-Based RL with Action Chunks
arxiv.org·1h



Abstract: In this paper, we study whether model-based reinforcement learning (RL), in particular model-based value expansion, can provide a scalable recipe for tackling complex, long-horizon tasks in offline RL. Model-based value expansion fits an on-policy value function using length-n imaginary rollouts generated by the current policy and a learned dynamics model. While larger n reduces bias in value bootstrapping, it amplifies accumulated model errors over long horizons, degrading future predictions. We address this trade-off with an action-chunk model that predicts a future state from a sequence of actions (an "action chunk") instead of a single action, which re…
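
To make the trade-off concrete, here is a minimal sketch of an n-step value-expansion target, together with a comparison of a single-step model rollout and an action-chunk model. It is an illustration under stated assumptions, not the paper's implementation: the toy dynamics (s' = s + a), the reward list, and all function names (`n_step_value_target`, `rollout_one_step`, `rollout_chunk`, `toy_step_model`, `toy_chunk_model`) are hypothetical and do not come from the abstract.

```python
from typing import Callable, Sequence, Tuple

State = float
Action = float


def n_step_value_target(rewards: Sequence[float],
                        bootstrap_value: float,
                        gamma: float) -> float:
    """Return sum_i gamma^i * r_i + gamma^n * V(s_n) for an n-step rollout."""
    target = 0.0
    for i, reward in enumerate(rewards):
        target += (gamma ** i) * reward
    # Bootstrap from the value estimate at the final imagined state.
    return target + (gamma ** len(rewards)) * bootstrap_value


def rollout_one_step(step_model: Callable[[State, Action], State],
                     state: State,
                     actions: Sequence[Action]) -> Tuple[State, int]:
    """Reach s_n by calling a single-step model once per action.

    Each call feeds the previous prediction back in, so any per-step
    model error is carried into the next call.
    """
    calls = 0
    for action in actions:
        state = step_model(state, action)
        calls += 1
    return state, calls


def rollout_chunk(chunk_model: Callable[[State, Tuple[Action, ...]], State],
                  state: State,
                  actions: Sequence[Action]) -> Tuple[State, int]:
    """Reach s_n with one call to a model conditioned on the whole chunk."""
    return chunk_model(state, tuple(actions)), 1


# Hypothetical toy dynamics standing in for learned models: s' = s + a.
def toy_step_model(state: State, action: Action) -> State:
    return state + action


def toy_chunk_model(state: State, chunk: Tuple[Action, ...]) -> State:
    return state + sum(chunk)


actions = [1.0, 2.0, 3.0]
s_n_step, step_calls = rollout_one_step(toy_step_model, 0.0, actions)
s_n_chunk, chunk_calls = rollout_chunk(toy_chunk_model, 0.0, actions)

# Rewards and the bootstrap value are made-up numbers for illustration.
target = n_step_value_target([1.0, 1.0, 1.0], bootstrap_value=10.0, gamma=0.5)
```

In this toy setting both rollouts reach the same final state, but the single-step rollout needs one model call per action while the chunk model needs one call in total. With a learned, imperfect model, each chained call is a place where prediction error can enter and be fed forward.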
