DisentangleFormer: Spatial-Channel Decoupling for Multi-Channel Vision
arxiv.org·2d
📊Rate-Distortion Theory

Abstract: Vision Transformers face a fundamental limitation: standard self-attention jointly processes spatial and channel dimensions, leading to entangled representations that prevent independent modeling of structural and semantic dependencies. This problem is especially pronounced in hyperspectral imaging, from satellite hyperspectral remote sensing to infrared pathology imaging, where channels capture distinct biophysical or biochemical cues. We propose DisentangleFormer, an architecture that achieves robust multi-channel vision representation through principled spatial-channel decoupling. Motivat…
