An end-to-end deep context gate convolutional visual odometry system based on lightweight attention mechanism

Abstract
Purpose: Conventional learning-based visual odometry (VO) systems usually use convolutional neural networks (CNNs) to extract features, so some important context-related and attention-holding global features might be ignored. Without these essential global features, a VO system is sensitive to various environmental perturbations. The purpose of this paper is to design a novel learning-based framework that improves the accuracy of learning-based VO without decreasing its generalization ability.

Design/methodology/approach: Instead of a conventional CNN, context-gated convolution is adopted to build an end-to-end learning framework, which enables the convolutional layers to dynamically capture representative local patterns and compose local features of interest under the guidance of global context. In addition, an attention mechanism module is introduced to further improve the learning ability and enhance the robustness and generalization ability of the VO system.

Findings: The proposed system is evaluated on the public KITTI data set and on self-collected data sets of our college building, where it shows competitive performance compared with some classical and state-of-the-art learning-based methods. Quantitative experimental results on KITTI show that, compared with CNN-based VO methods, the average translational and rotational errors over all test sequences are reduced by 45.63% and 37.22%, respectively.

Originality/value: The main contribution of this paper is an end-to-end deep context gate convolutional VO system based on a lightweight attention mechanism, which effectively improves accuracy compared with other learning-based methods.
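The idea of composing local features "under the guidance of global context" can be illustrated with a deliberately small sketch. The code below is not the architecture from the paper: it is a toy 1-D example in plain Python, and the function name `context_gated_conv1d` and the parameters `gate_scale` and `gate_bias` are hypothetical, introduced only for illustration. It assumes the global context is a simple average over the input and that each kernel tap is scaled by a sigmoid gate computed from that context.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def context_gated_conv1d(signal, kernel, gate_scale, gate_bias):
    """Toy 1-D 'valid' convolution whose kernel is modulated by global context.

    gate_scale and gate_bias are hypothetical per-tap gate parameters
    (one pair per kernel tap); in a trained network they would be learned.
    """
    if not (len(kernel) == len(gate_scale) == len(gate_bias)):
        raise ValueError("kernel and gate parameters must have equal length")
    if len(signal) < len(kernel):
        raise ValueError("signal must be at least as long as the kernel")

    # Global context: one summary value for the whole input (global average pooling).
    context = sum(signal) / len(signal)

    # One gate per kernel tap, each in (0, 1), computed from the global context.
    gates = [sigmoid(s * context + b) for s, b in zip(gate_scale, gate_bias)]

    # Modulate the kernel, so the same weights respond differently to different inputs.
    gated_kernel = [w * g for w, g in zip(kernel, gates)]

    # Ordinary sliding-window correlation with the modulated kernel.
    k = len(kernel)
    output = [
        sum(gated_kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]
    return output, gates
```

The point of the sketch is the contrast with a plain convolution, where the kernel is fixed after training: here the effective kernel depends on a summary of the entire input, so a local filter can be strengthened or suppressed according to the scene as a whole.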
