Designing deep CNN models based on sparse coding for aerial imagery: a deep-features reduction approach

Abstract
Traditional methods focus on low-level, handcrafted feature representations, making it difficult to design a comprehensive classification algorithm for remote sensing scene classification. Recently, convolutional neural networks (CNNs) have achieved remarkable performance, setting new state-of-the-art results on several remote sensing benchmarks. However, applying deep convolutional networks directly to UAV remote sensing images is extremely challenging given the high dimensionality of the input data and the relatively small amount of available labelled data. We therefore propose a CNN approach to scene classification that incorporates a sparse coding (SC) technique for dimension reduction to minimize overfitting. Outcomes were compared with principal component analysis (PCA) and global average pooling (GAP) alternatives that use fully connected layer(s) in pre-trained CNN architecture(s) to minimize overfitting. SC was used to encode deep features extracted from different feature maps of the last convolutional layer of pre-trained CNN models, converting the deep features into low-dimensional SC features. These sparse-coded features were then concatenated by means of different pooling techniques to obtain global image features for scene classification. The proposed algorithm outperformed current state-of-the-art algorithms based on handcrafted features. On our own UAV-based dataset and on existing datasets, it was also exceptionally efficient computationally when learning data representations, achieving a 93.64% accuracy rate.
