Student Performance Prediction Using A Cascaded Bi-level Feature Selection Approach

Abstract
Features in educational data are ambiguous which leads to noisy features and curse of dimensionality problems. These problems are solved via feature selection. There are existing models for features selection. These models were created using either a single-level embedded, wrapperbased or filter-based methods. However single-level filter-based methods ignore feature dependencies and ignore the interaction with the classifier. The embedded and wrapper based feature selection methods interact with the classifier, but they can only select the optimal subset for a particular classifier. So their selected features may be worse for other classifiers. Hence this research proposes a robust Cascade Bi-Level (CBL) feature selection technique for student performance prediction that will minimize the limitations of using a single-level technique. The proposed CBL feature selection technique consists of the Relief technique at first-level and the Particle Swarm Optimization (PSO) at the second-level. The proposed technique was evaluated using the UCI student performance dataset. In comparison with the performance of the single-level feature selection technique the proposed technique achieved an accuracy of 94.94% which was better than the values achieved by the single-level PSO with an accuracy of 93.67% for the binary classification task. These results show that CBL can effectively predict student performance.