Hybridizing Cross-Level Contextual and Attentive Representations for Remote Sensing Imagery Semantic Segmentation

Abstract
Semantic segmentation of remote sensing imagery is a fundamental task in intelligent interpretation. Since deep convolutional neural networks (DCNNs) have shown considerable ability to learn implicit representations from data, numerous recent works have transferred DCNN-based models to remote sensing data analysis. However, wide observation areas, complex and diverse objects, and varying illumination and imaging angles make pixels easily confusable, leading to undesirable results. Therefore, a remote sensing imagery semantic segmentation network, named HCANet, is proposed to generate representative and discriminative features for dense prediction. HCANet hybridizes cross-level contextual and attentive representations to strengthen the distinguishability of the learned features. First, a cross-level contextual representation module (CCRM) is devised to exploit and harness superpixel contextual information. Second, a hybrid representation enhancement module (HREM) is designed to fuse cross-level contextual and self-attentive representations flexibly. Finally, the decoder incorporates the DUpsampling operation to improve decoding efficiency with minimal information loss. Extensive experiments are conducted on the Vaihingen and Potsdam benchmarks, where HCANet achieves excellent performance in overall accuracy and mean intersection over union. An ablation study further verifies the contribution of CCRM.
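To make the DUpsampling step referenced in the abstract concrete, below is a minimal PyTorch sketch of the data-dependent upsampling idea: a learned 1x1 projection maps each low-resolution feature vector to an r x r patch of class scores, which is then rearranged into a full-resolution prediction. This is an illustrative sketch, not the authors' exact decoder; the channel size (256), class count (6), and ratio (16) are assumptions chosen for the example.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DUpsampling(nn.Module):
    """Data-dependent upsampling: a 1x1 convolution projects each
    low-resolution feature vector to an (r x r) block of class scores,
    and pixel_shuffle rearranges the blocks into a high-resolution map."""
    def __init__(self, in_channels: int, num_classes: int, ratio: int):
        super().__init__()
        self.ratio = ratio
        # Learned linear reconstruction, expressed as a 1x1 convolution.
        self.proj = nn.Conv2d(in_channels, num_classes * ratio * ratio, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.proj(x)                        # (N, C*r*r, H, W)
        return F.pixel_shuffle(x, self.ratio)   # (N, C, H*r, W*r)

# Usage: upsample a hypothetical 1/16-resolution backbone feature map.
feats = torch.randn(2, 256, 32, 32)
dup = DUpsampling(in_channels=256, num_classes=6, ratio=16)
logits = dup(feats)                             # shape: (2, 6, 512, 512)

Because the upsampling is a single learned linear map rather than repeated bilinear interpolation plus convolutions, the decoder can operate on coarse features and still produce dense predictions, which is the efficiency benefit the abstract alludes to.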
Funding Information
  • National Key Research and Development Program of China (2018YFC0407105, 2018YFC0407905, 2017YFC0405505)
  • Technology Project of China Huaneng Group (MW 2017/P28, 51779100, 51679103, HKY-JBYW-2020-21, HKY-JBYW-2020-07)
