DGAT: Dynamic Gaussian Attenuate Transformer for Remote Sensing Image Change Captioning
Remote sensing image change captioning (RSICC) aims to enhance geospatial analysis by generating semantic descriptions of the differences observed in bi-temporal remote sensing imagery (RSI). Although Transformer-based methods have achieved significant advances in this field, their standard global attention mechanism allows each pixel to attend equally to all spatial positions. It therefore lacks explicit prior knowledge of spatial proximity correlation, making the model unable...
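To make the contrast concrete, the following is a minimal sketch of how a spatial proximity prior can be added to attention. It is not the DGAT method itself, which this excerpt does not describe: the fixed `sigma`, the helper names, and the 4×4 grid are all illustrative assumptions. The sketch subtracts a squared-distance penalty from the attention logits, which is equivalent to multiplying the unnormalized weights by a Gaussian of the distance between query and key positions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def grid_positions(height, width):
    """Row-major (row, col) coordinates of a height x width feature map."""
    return [(r, c) for r in range(height) for c in range(width)]


def attention_weights(query_index, scores, positions, sigma=None):
    """Attention weights of one query position over all key positions.

    sigma=None -> plain global attention (no spatial prior).
    sigma=float -> hypothetical Gaussian prior: each logit is reduced by
    d^2 / (2 * sigma^2), where d is the distance to the query position.
    """
    if sigma is None:
        return softmax(scores)
    qr, qc = positions[query_index]
    biased = []
    for s, (r, c) in zip(scores, positions):
        dist_sq = (r - qr) ** 2 + (c - qc) ** 2
        biased.append(s - dist_sq / (2.0 * sigma ** 2))
    return softmax(biased)


positions = grid_positions(4, 4)
# Assumption for illustration: identical content similarity at every position,
# so any difference between the two results comes from the spatial prior alone.
uniform_scores = [0.0] * len(positions)
center = positions.index((1, 1))

plain = attention_weights(center, uniform_scores, positions)
local = attention_weights(center, uniform_scores, positions, sigma=1.0)
```

With identical content scores, `plain` spreads weight evenly over all 16 positions, whereas `local` concentrates weight on the query position and its neighbours and attenuates distant positions. A smaller `sigma` narrows the neighbourhood, and a very large `sigma` recovers plain global attention.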