Machine learning as new promising technique for selection of significant features in obese women with type 2 diabetes

Abstract
Background: The global burden of obesity and diabetes is considerable. Recently, precise and reliable methods such as artificial neural networks and machine learning (ML) have been proposed for the early diagnosis and accurate prediction of type 2 diabetes mellitus (T2DM). Materials and methods: In this study, an experimental data set of relevant features (adipocytokine levels and anthropometric measurements) obtained from obese women (diabetic and non-diabetic) was analyzed. Machine learning, using the separability-correlation measure (SCM) algorithm, was applied to select the significant features that classify the women with the best accuracy, and the results were evaluated using an artificial neural network (ANN). Results: Analysis of the experimental data showed significant differences (p < 0.05) between the two groups in fasting blood sugar (FBS), hemoglobin A1c (HbA1c) and visfatin levels. In non-diabetic women, significant correlations were found between HbA1c and FBS, between homeostatic model assessment (HOMA) and insulin, and between total cholesterol (TC) and body mass index (BMI). In the diabetic group, significant correlations were found between insulin and HOMA, FBS and HbA1c, systolic blood pressure (SBP) and diastolic blood pressure (DBP), BMI and TC, and HbA1c and TC. Furthermore, there were significant positive correlations among the adipocytokines, except for resistin and leptin levels, in both groups. Based on their sensitivity and specificity, FBS and HbA1c were excellent discriminators of diabetic women, HOMA was a good discriminator, and visfatin, adiponectin and insulin were fair discriminators. The features most frequently selected by the ML method were FBS, apelin, visfatin, TC, HbA1c and adiponectin. Conclusions: The subset of features comprising FBS, apelin, visfatin and HbA1c is significant and provides the best discrimination between the groups.
In this study, based on the statistical and ML results, the useful biomarkers for discriminating diabetic women were FBS, HbA1c, HOMA, insulin, visfatin, adiponectin and apelin. Finally, we designed software to identify T2DM patients and healthy individuals, intended for use in clinical diagnosis.
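The grading of biomarkers as excellent, good or fair discriminators from their sensitivity and specificity can be illustrated with a small sketch. The abstract does not state how the grades were assigned, so the following assumes the area under the ROC curve (AUC), which summarizes sensitivity and specificity over all thresholds, and a commonly used convention for the cut-offs (0.9, 0.8, 0.7). The function names `roc_auc` and `grade` and all numeric values are hypothetical illustrations, not the study's data or software.

```python
from typing import Sequence


def roc_auc(diabetic: Sequence[float], non_diabetic: Sequence[float]) -> float:
    """AUC as the probability that a diabetic value exceeds a non-diabetic one.

    Ties count as one half (the Mann-Whitney U formulation of the AUC).
    """
    if not diabetic or not non_diabetic:
        raise ValueError("both groups need at least one value")
    wins = 0.0
    for d in diabetic:
        for n in non_diabetic:
            if d > n:
                wins += 1.0
            elif d == n:
                wins += 0.5
    return wins / (len(diabetic) * len(non_diabetic))


def grade(auc: float) -> str:
    """Map an AUC to a verbal grade using assumed cut-offs (not from the source)."""
    # A marker that is lower in the diabetic group discriminates equally well,
    # so the AUC is folded around 0.5 before grading.
    auc = max(auc, 1.0 - auc)
    if auc >= 0.9:
        return "excellent"
    if auc >= 0.8:
        return "good"
    if auc >= 0.7:
        return "fair"
    return "poor"


# Hypothetical values for illustration only; these are not study measurements.
fbs_diabetic = [150.0, 162.0, 139.0, 171.0]
fbs_control = [88.0, 95.0, 101.0, 92.0]
marker_diabetic = [5.0, 7.0, 6.0, 9.0]
marker_control = [4.0, 6.0, 8.0, 4.0]

fbs_auc = roc_auc(fbs_diabetic, fbs_control)
marker_auc = roc_auc(marker_diabetic, marker_control)
print("FBS-like marker:", fbs_auc, grade(fbs_auc))
print("Weaker marker:", marker_auc, grade(marker_auc))
```

In this toy example the first marker separates the two groups completely, while the second overlaps substantially and receives a lower grade. With real data, a confidence interval for each AUC would be needed before assigning a grade.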