GLAD: Global-Local-Alignment Descriptor for Scalable Person Re-Identification

Abstract
The large variance of human pose and the misalignment of detected human images significantly increase the difficulty of pedestrian image matching in person Re-Identification (Re-ID). Moreover, the massive volume of visual data produced by surveillance cameras demands highly efficient person Re-ID systems. To address the first problem, this work proposes a robust and discriminative pedestrian image descriptor, the Global-Local-Alignment Descriptor (GLAD). For the second problem, it treats person Re-ID as an image retrieval task and proposes an efficient indexing and retrieval framework. GLAD explicitly leverages local and global cues in the human body to generate a discriminative and robust representation. It consists of a part extraction module and a descriptor learning module: several part regions are first detected, and deep neural networks are then trained to learn representations on both the local regions and the global image. A hierarchical indexing and retrieval framework performs off-line relevance mining to eliminate the substantial person-ID redundancy in the gallery set, thereby accelerating the online Re-ID procedure. Extensive experiments on widely used public benchmark datasets show that GLAD achieves accuracy competitive with state-of-the-art methods. On a large-scale person Re-ID dataset containing more than 520K images, the proposed retrieval framework significantly accelerates the online Re-ID procedure while also improving Re-ID accuracy. This work therefore has the potential to perform well on person Re-ID tasks in real-world scenarios.
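To make the two-module design concrete, the following is a minimal PyTorch sketch of the descriptor-learning step. It assumes the part extraction module has already cropped the part regions (e.g., head, upper body, lower body); the class name GLADDescriptor, the tiny branch architecture, and the feature dimension are illustrative assumptions, not the paper's exact networks. The essential pattern is one CNN branch per region, with the global and local features concatenated into a single descriptor.

import torch
import torch.nn as nn

class GLADDescriptor(nn.Module):
    """Sketch of a GLAD-style descriptor: one CNN branch per region
    (global image plus head, upper-body, and lower-body crops), with the
    four branch features concatenated into one pedestrian descriptor."""

    def __init__(self, feat_dim=256):
        super().__init__()
        # Four lightweight stand-in branches; the paper uses deeper CNNs.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(64, feat_dim),
            )
            for _ in range(4)
        ])

    def forward(self, global_img, head, upper, lower):
        regions = [global_img, head, upper, lower]
        feats = [branch(x) for branch, x in zip(self.branches, regions)]
        return torch.cat(feats, dim=1)  # shape: (batch, 4 * feat_dim)

model = GLADDescriptor()
crops = [torch.randn(1, 3, 96, 96) for _ in range(4)]  # global + 3 parts
descriptor = model(*crops)  # shape: (1, 1024)

Query and gallery descriptors produced this way can then be compared with a simple distance metric, e.g., Euclidean distance.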
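The hierarchical indexing and retrieval idea can likewise be sketched in a few lines. Here k-means clustering stands in for the paper's off-line relevance mining, and the function names build_index and search are hypothetical; the point is the coarse-to-fine pattern: match the query against group centers first, then rank only the images inside the closest groups rather than the full gallery.

import numpy as np
from sklearn.cluster import KMeans

def build_index(gallery_feats, n_groups=100, seed=0):
    """Off-line step (sketch): group gallery descriptors so the many
    near-duplicate images of each person collapse into a few groups.
    Requires the gallery to contain at least n_groups descriptors."""
    km = KMeans(n_clusters=n_groups, random_state=seed, n_init=10)
    km.fit(gallery_feats)
    return km.cluster_centers_, km.labels_

def search(query_feat, gallery_feats, centers, labels, top_groups=5):
    """On-line step (sketch): coarse match against group centers, then
    rank only the images inside the closest groups."""
    coarse = np.linalg.norm(centers - query_feat, axis=1).argsort()[:top_groups]
    cand = np.where(np.isin(labels, coarse))[0]
    fine = np.linalg.norm(gallery_feats[cand] - query_feat, axis=1).argsort()
    return cand[fine]  # gallery indices, best match first

Because the coarse stage touches only the group centers, the number of on-line distance computations scales with the group count plus the candidate set, not with the full gallery size, which is what makes the framework attractive for galleries with hundreds of thousands of images.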
Funding Information
  • National Natural Science Foundation of China (61572050, 91538111, 61620106009, 61429201)
  • Beijing Major Science and Technology Project (Z171100000117008)
  • ARO (W911NF-15-1-0290)
  • Faculty Research Gift Awards by NEC Laboratories of America and Blippar
