Learning Social Spatio-Temporal Relation Graph in the Wild and a Video Benchmark
- 14 September 2021
- journal article
- research article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Neural Networks and Learning Systems
- Vol. 34 (6), 2951-2964
- https://doi.org/10.1109/tnnls.2021.3110682
Abstract
Social relations are ubiquitous and form the basis of social structure in our daily life. However, existing studies mainly focus on recognizing social relations from still images and movie clips, which differ from real-world scenarios. For example, movie-based datasets define the task as video classification, recognizing only one relation per scene. In this article, we aim to study the problem of social relation recognition in an open environment. To close the gap, we provide the first video dataset collected from real-life scenarios, named social relation in the wild (SRIW), where the number of people can be large and variable, and the relation of each pair needs to be recognized. To overcome these new challenges, we propose a spatio-temporal relation graph convolutional network (STRGCN) architecture, which uses correlative visual features to recognize social relations intuitively. Our method decouples the task into two classification tasks: person-level and pair-level relation recognition. Specifically, we propose a person behavior and character module to encode moving and static features in two explicit ways. We then take them as node features to build a relation graph with meaningful edges in a scene. Based on the relation graph, we introduce the graph convolutional network (GCN) and local GCN to encode social relation features, which are used for both recognition tasks. Experimental results demonstrate the effectiveness of the proposed framework, which achieves 83.1% and 40.8% mAP in person-level and pair-level classification, respectively. Moreover, the study also contributes to the practicality of this field.
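The abstract describes taking per-person features as graph nodes, applying a GCN, and then classifying at both the person level and the pair level. The following is a minimal sketch of that general idea, not the authors' STRGCN implementation: it uses a single standard GCN layer with symmetric normalization, and the pairing by feature concatenation, the function names, the feature sizes, and the class counts are all assumptions made for illustration.

```python
import numpy as np


def normalize_adjacency(adj):
    """Standard GCN normalization: D^-1/2 (A + I) D^-1/2."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # degree >= 1 because of self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, x, w):
    """One graph convolution followed by ReLU."""
    return np.maximum(a_norm @ x @ w, 0.0)


def relation_scores(node_feats, adj, w_gcn, w_person, w_pair):
    """Return person-level logits and pair-level logits (hypothetical heads)."""
    h = gcn_layer(normalize_adjacency(adj), node_feats, w_gcn)
    person_logits = h @ w_person            # one row per person
    n = h.shape[0]
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    # Assumption: a pair is represented by concatenating its two node embeddings.
    pair_feats = np.stack([np.concatenate([h[i], h[j]]) for i, j in pairs])
    pair_logits = pair_feats @ w_pair       # one row per unordered pair
    return person_logits, pairs, pair_logits


# Toy scene: 4 people, 8-dim input features, 16-dim hidden features,
# 3 person-level classes and 5 pair-level classes (all sizes hypothetical).
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 8))
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
w_gcn = rng.normal(size=(8, 16))
w_person = rng.normal(size=(16, 3))
w_pair = rng.normal(size=(32, 5))

person_logits, pairs, pair_logits = relation_scores(feats, adj, w_gcn, w_person, w_pair)
print(person_logits.shape, len(pairs), pair_logits.shape)
```

Scoring every unordered pair reflects the abstract's requirement that the relation of each pair be recognized, so the number of pair-level predictions grows quadratically with the number of people in the scene.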
Funding Information
- Key Scientific Technological Innovation Research Project
- Ministry of Education
- State Key Program and the Foundation for Innovative Research Groups of the National Natural Science Foundation of China (61836009, 61621005)
- Key Research and Development Program in Shaanxi Province of China (2019ZDLGY03-06)
- Major Research Plan of the National Natural Science Foundation of China (91438201, 91438103, 61801124)
- National Natural Science Foundation of China (U1701267, 62006177, 61871310, 61902298, 61573267, 91838303, 61906150)
- Fund for Foreign Scholars in University Research and Teaching Programs 111 Project (B07048)
- Program for Cheung Kong Scholars and Innovative Research Team in University (IRT 15R53)
- Science and Technology (ST) Innovation Project from the Chinese Ministry of Education
- National Science Basic Research Plan in Shaanxi Province of China (2019JQ-659)
- Scientific Research Project of Education Department in Shaanxi Province of China (20JY023)
- Fundamental Research Funds for the Central Universities (XJS201901, XJS201903, JBF201905, JB211908)
- Chinese Association for Artificial Intelligence (CAAI)-Huawei MindSpore Open Fund