Classroom Behavior Detection Based on Improved YOLOv5 Algorithm Combining Multi-Scale Feature Fusion and Attention Mechanism
Open Access
- 5 July 2022
- journal article
- research article
- Published by MDPI AG in Applied Sciences
- Vol. 12 (13), 6790
- https://doi.org/10.3390/app12136790
Abstract
Detecting students’ behaviors in the classroom can provide a guideline for assessing the effectiveness of classroom teaching. This study proposes a classroom behavior detection algorithm based on an improved object detection model (YOLOv5). First, the feature pyramid structure (FPN+PAN) in the neck network of the original YOLOv5 model is combined with a weighted bidirectional feature pyramid network (BiFPN), and features of the object at different scales are then fused to mine the fine-grained features of different behaviors. Second, a convolutional block attention module (CBAM), which applies channel and spatial attention, is added between the neck network and the prediction network so that the model focuses on object information, improving detection accuracy. Finally, the original non-maximum suppression is improved using the distance intersection over union (DIoU) to better discriminate occluded objects. A series of experiments was conducted on our newly established dataset, which includes four types of behaviors: listening, looking down, lying down, and standing. The results demonstrate that the proposed algorithm can accurately detect various student behaviors, with higher accuracy than the original YOLOv5 model. Across student behavior detection in different scenarios, the improved algorithm achieved an average accuracy of 89.8% and a recall of 90.4%, both better than those of the compared detection algorithms.
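The DIoU-based non-maximum suppression step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `diou` and `diou_nms`, the `(x1, y1, x2, y2)` box format, and the 0.5 threshold are assumptions made for illustration. It uses the standard DIoU definition, IoU minus the squared center distance divided by the squared diagonal of the smallest enclosing box, so that overlapping boxes with distant centers (as with occluded students) are less likely to be suppressed.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed format: (x1, y1, x2, y2)


def diou(a: Box, b: Box) -> float:
    """Distance-IoU: IoU minus the normalized squared center distance."""
    # Intersection area (zero if the boxes do not overlap).
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centers.
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 + (
        (a[1] + a[3]) / 2 - (b[1] + b[3]) / 2
    ) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    c2 = cw ** 2 + ch ** 2

    return iou - d2 / c2 if c2 > 0 else iou


def diou_nms(
    boxes: Sequence[Box], scores: Sequence[float], threshold: float = 0.5
) -> List[int]:
    """Greedy NMS that suppresses a box only if its DIoU with a kept box exceeds the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)  # highest-scoring remaining box
        keep.append(best)
        order = [i for i in order if diou(boxes[best], boxes[i]) <= threshold]
    return keep


# Hypothetical detections: two near-duplicates and one distant box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = diou_nms(boxes, scores)
```

In this toy input the second box nearly coincides with the first and is suppressed, while the third box is far away and is kept. Compared with plain IoU, the center-distance penalty lowers the overlap score of boxes whose centers differ, which is the property the abstract relies on for occluded objects.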
Funding Information
- Research and practice of mobile academic management platform based on ubiquitous learning (2017-GX-268, KJQN201800534)