Symmetry-Aware Steering of Equivariant Diffusion Policies: Benefits and Limits

Abstract: Equivariant diffusion policies (EDPs) combine the generative expressivity of diffusion models with the strong generalization and sample efficiency afforded by geometric symmetries. While steering these policies with reinforcement learning (RL) offers a promising mechanism for fine-tuning beyond demonstration data, directly applying standard (non-equivariant) RL can be sample-inefficient and unstable, as it ignores the symmetries that EDPs are designed to exploit. In this paper, we theoretically establish that the diffusion process of an EDP is equivariant, which in turn induces a group-invariant latent-noise MDP that is well-suited for equivariant diffusion steering. Building on this theory, we introduce a principled symmetry-aware steering framework and compare standard, equivariant, and approximately equivariant RL strategies through comprehensive experiments across tasks with varying degrees of symmetry. While we identify the practical boundaries of strict equivariance under symmetry breaking, we show that exploiting symmetry during the steering process yields substantial benefits: enhancing sample efficiency, preventing value divergence, and achieving strong policy improvements even when EDPs are trained from extremely limited demonstrations.
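
For readers unfamiliar with the terminology, the following is a sketch of the standard textbook definitions behind "equivariant" and "group-invariant" as used above. The notation is an assumption for illustration and is not taken from the paper: a symmetry group G acts on states s and actions a, ε_θ denotes a noise-prediction network at diffusion step k, and (R, P) are the reward and transition functions of an MDP. The paper's own statements and conditions may differ.

```latex
% Illustrative notation only (assumed, not the paper's).
% A denoiser is G-equivariant if transforming its inputs transforms its output:
\[
  \epsilon_\theta(g \cdot a^{k},\, g \cdot s,\, k)
    \;=\; g \cdot \epsilon_\theta(a^{k},\, s,\, k)
  \qquad \forall g \in G .
\]
% A policy is G-equivariant if it assigns equal probability to transformed pairs:
\[
  \pi(g \cdot a \mid g \cdot s) \;=\; \pi(a \mid s)
  \qquad \forall g \in G .
\]
% An MDP is G-invariant if reward and dynamics are unchanged by the group action:
\[
  R(g \cdot s,\, g \cdot a) = R(s, a),
  \qquad
  P(g \cdot s' \mid g \cdot s,\, g \cdot a) = P(s' \mid s, a)
  \qquad \forall g \in G .
\]
```

Under these definitions, a planar rotation g applied to both the observed scene and the sampled noise would rotate the resulting action by the same g, which is the kind of structure a symmetry-aware steering method can exploit.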


Subjects:	Machine Learning (cs.LG); Robotics (cs.RO)
Cite as:	arXiv:2512.11345 [cs.LG]
	(or arXiv:2512.11345v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2512.11345 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Minwoo Park
[v1] Fri, 12 Dec 2025 07:42:01 UTC (786 KB)
