Classification of β-Thalassemia Carriers From Red Blood Cell Indices Using Ensemble Classifier

Abstract
Thalassemia is a prevalent inherited blood disorder that has received considerable attention in medical research worldwide. Inherited diseases carry a high risk of being passed from parents to their children. If both parents are β-Thalassemia carriers, each child has a 25% chance of having β-Thalassemia intermedia or β-Thalassemia major, which in most cases leads to death. Prenatal screening after counseling of couples is an effective way to control β-Thalassemia. Carriers are generally identified from quantifiable blood traits measured by a high-performance liquid chromatography (HPLC) test, which is costly, time-consuming, and requires specialized equipment. Cost-effective and rapid screening techniques are therefore needed. This study aims to detect β-Thalassemia carriers by evaluating red blood cell indices from the complete blood count (CBC) test. The study uses the Punjab Thalassemia Prevention Project Lab Reports dataset. The proposed model, SGR-VC, is an ensemble of three machine learning algorithms: Support Vector Machine, Gradient Boosting Machine, and Random Forest. Comparative analysis showed that the proposed ensemble model, using all red blood cell indices, is very effective in β-Thalassemia carrier screening, with 93% accuracy.
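As a rough illustration of how an ensemble of these three algorithms might be assembled, the following is a minimal sketch using scikit-learn's `VotingClassifier`. The abstract does not state the voting scheme, hyperparameters, or preprocessing, so soft voting, the default settings, and the feature scaling for the SVM are all assumptions here. The data are synthetic stand-ins with hypothetical column meanings, not the Punjab Thalassemia Prevention Project dataset, so the printed accuracy says nothing about the reported 93%.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic placeholder data (assumption): 7 columns standing in for
# red blood cell indices such as RBC, HGB, HCT, MCV, MCH, MCHC, RDW.
rng = np.random.default_rng(0)
n_samples = 400
y = rng.integers(0, 2, size=n_samples)  # 1 = carrier (hypothetical label)
X = rng.normal(size=(n_samples, 7)) + 0.8 * y[:, None]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Three base learners combined by voting. Soft voting (averaging predicted
# probabilities) is an assumption; the source does not say which scheme is used.
ensemble = VotingClassifier(
    estimators=[
        # SVMs are scale-sensitive, so the SVM gets its own scaler.
        ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
        ("gbm", GradientBoostingClassifier(random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
    ],
    voting="soft",
)

ensemble.fit(X_train, y_train)
accuracy = ensemble.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

Switching to `voting="hard"` would instead take a majority vote over the three predicted labels, in which case `probability=True` on the SVM is not needed.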
Funding Information
  • Ministry of Science and ICT South Korea (IITP-2020-2016-0-00313)
  • National Research Foundation of Korea (NRF-2019R1A2C1006159)