Using Video to Automatically Detect Learner Affect in Computer-Enabled Classrooms

Abstract
Affect detection is a key component in intelligent educational interfaces that respond to students’ affective states. We use computer vision and machine-learning techniques to detect students’ affect from facial expressions (primary channel) and gross body movements (secondary channel) during interactions with an educational physics game. We collected data in the real-world environment of a school computer lab with up to 30 students simultaneously playing the game while moving around, gesturing, and talking to each other. The results were cross-validated at the student level to ensure generalization to new students. Classification accuracies, quantified as area under the receiver operating characteristic curve (AUC), were above chance (AUC of 0.5) for all the affective states observed, namely, boredom (AUC = .610), confusion (AUC = .649), delight (AUC = .867), engagement (AUC = .679), frustration (AUC = .631), and for off-task behavior (AUC = .816). Furthermore, the detectors showed temporal generalizability in that there was less than a 2% decrease in accuracy when tested on data collected from different times of the day and from different days. There was also some evidence of generalizability across ethnicity (as perceived by human coders) and gender, although with a higher degree of variability attributable to differences in affect base rates across subpopulations. In summary, our results demonstrate the feasibility of generalizable video-based detectors of naturalistic affect in a real-world setting, suggesting that the time is ripe for affect-sensitive interventions in educational games and other intelligent interfaces.
Funding Information
  • Bill & Melinda Gates Foundation
  • National Science Foundation (DRL 1235958)

This publication has 50 references indexed in Scilit: