Genome feature optimization and coronary artery disease prediction using cuckoo search

Abstract
Cardiovascular diseases (CVDs) are among the major health problems, causing millions of deaths every year. CVDs arise from a combination of environmental and genetic factors. Advances in diagnostic solutions, such as the use of genomic tools, are helping to predict and diagnose heart disease more accurately. In the recent past, analyzing gene expression data, particularly using machine learning strategies to classify unlabeled gene expression records, has become an active research topic. A substantial requirement here is feature optimization, since the human body has close to 25,000 genes, of which 636 are cardiovascular-related. Training machine learning models on all of these cardiovascular gene features is therefore complex. To address this, this manuscript uses a bidirectional pooled variance strategy based on ANOVA to select optimal features. In addition, to overcome a constraint observed in traditional classifiers, namely unstable accuracy under k-fold cross-validation, this manuscript proposes a classification strategy built on the swarm intelligence technique called cuckoo search. The experimental study indicates that the proposed model selects substantially fewer optimal features than a contemporary model that selects features using Forward Feature Selection and classifies using an SVM classifier (FFSSVM). It also shows that the proposed model, which selects features by bidirectional pooled variance estimation and classifies using the proposed cuckoo-search-based strategy (BPVECS), outperforms FFSSVM.
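
As a rough illustration of ANOVA-style feature screening, the Python sketch below scores each gene by a one-way F-statistic, whose denominator is the pooled within-class variance, and keeps the highest-scoring genes. It is not the manuscript's bidirectional pooled variance procedure, whose details the abstract does not give. The function names, the toy expression values, and the `top_k` cutoff are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F-statistic for a single feature.

    groups: one list of expression values per class (hypothetical layout).
    """
    k = len(groups)                      # number of classes
    n = sum(len(g) for g in groups)      # total number of samples
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-class sum of squares.
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-class sum of squares; dividing by (n - k) gives the pooled variance.
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    pooled_var = ssw / (n - k)
    if pooled_var == 0:
        return float("inf") if ssb > 0 else 0.0
    return (ssb / (k - 1)) / pooled_var


def rank_features(samples, labels, top_k):
    """Return the indices of the top_k features with the largest F-statistic."""
    classes = sorted(set(labels))
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        groups = [[s[j] for s, y in zip(samples, labels) if y == c]
                  for c in classes]
        scores.append(f_statistic(groups))
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order[:top_k]


# Toy data (made up): 6 samples, 3 gene features, 2 classes.
samples = [
    [1.0, 5.0, 2.0],
    [1.2, 5.1, 8.0],
    [0.9, 4.9, 5.0],
    [3.0, 5.0, 3.0],
    [3.1, 5.2, 7.0],
    [2.9, 4.8, 5.0],
]
labels = [0, 0, 0, 1, 1, 1]
selected = rank_features(samples, labels, top_k=1)
```

In this toy data only the first feature has class means that differ, so it is the one retained. A real pipeline would pick the cutoff from a significance level or validation performance rather than a fixed `top_k`.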