A full convolutional network based on DenseNet for remote sensing scene classification

Top Cited Papers
Open Access
Abstract
The convolutional neural networks (CNN) applied in remote sensing scene classification have two common problems. One is that these models have large number of parameters, which causes over-fitting easily. The other is that the network is not deep enough, thus more abstract semantic information cannot be extracted. To solve these two problems, we propose a simple and efficient full convolutional network based on DenseNet for remote sensing scene classification. We construct a small number of convolutional kernels to generate a large number of reusable feature maps by dense connections, which makes the network deeper, but does not increase the number of parameters significantly. Our network is so deep that it has more than 100 layers. However, it has only about 7 million parameters, which is far less than the number of VGGos parameters. Then we incorporate an adaptive average 3D pooling operation in our network. This operation fixes feature maps of size 7 × 7 from the last DenseBlock to 1 × 1 and decreases the number of channels from 1024 to 512, thus the whole network can accept input images with different sizes. Furthermore, we design the convolutional layer instead of the fully connected layer that is used as a classifier usually, so that the output features of the network can be classified without flattening operation, which simplifies the classification operation. Finally, a good model is trained by exploiting pre-trained weights and data augmentation technology. Compared with several state-of-the-art algorithms, our algorithm improves classification performance significantly on UCM dataset, AID dataset, OPTIMAL-31 dataset and NWPU-RESISC45 dataset.