An application of machine learning to haematological diagnosis
Open Access
- 11 January 2018
- journal article
- research article
- Published by Springer Science and Business Media LLC in Scientific Reports
- Vol. 8 (1), 1-12
- https://doi.org/10.1038/s41598-017-18564-8
Abstract
Quick and accurate medical diagnoses are crucial for the successful treatment of diseases. Using machine learning algorithms and based on laboratory blood test results, we have built two models to predict a haematologic disease. One predictive model used all the available blood test parameters and the other used only a reduced set that is usually measured upon patient admittance. Both models produced good results, obtaining prediction accuracies of 0.88 and 0.86 when considering the list of five most likely diseases and 0.59 and 0.57 when considering only the most likely disease. The models did not differ significantly, which indicates that a reduced set of parameters can represent a relevant "fingerprint" of a disease. This knowledge expands the model's utility for use by general practitioners and indicates that blood test results contain more information than physicians generally recognize. A clinical test showed that the accuracy of our predictive models was on par with that of haematology specialists. Our study is the first to show that a machine learning predictive model based on blood tests alone can be successfully applied to predict haematologic diseases. This result could open up unprecedented possibilities for medical diagnosis.
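The abstract reports two kinds of accuracy: one that counts a prediction as correct only when the most likely disease matches the true diagnosis, and one that counts it as correct when the true diagnosis appears anywhere in the list of the five most likely diseases. The following is a minimal sketch of that "top-k" accuracy measure. The function name `top_k_accuracy` and the toy scores are hypothetical and invented for illustration; they are not the authors' code or data.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    true_labels: Sequence[int],
    k: int,
) -> float:
    """Fraction of cases whose true class is among the k highest-scoring classes.

    scores[i][c] is the model's score for class c on case i (hypothetical layout).
    """
    if len(scores) != len(true_labels):
        raise ValueError("scores and true_labels must have the same length")
    if not scores:
        raise ValueError("at least one case is required")

    hits = 0
    for row, truth in zip(scores, true_labels):
        # Rank class indices from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(scores)


# Toy data with three classes and four cases (invented, not from the study).
toy_scores = [
    [0.60, 0.30, 0.10],  # true class 0 is ranked first
    [0.20, 0.50, 0.30],  # true class 2 is ranked second
    [0.10, 0.20, 0.70],  # true class 0 is ranked third
    [0.40, 0.35, 0.25],  # true class 0 is ranked first
]
toy_truth = [0, 2, 0, 0]

top1 = top_k_accuracy(toy_scores, toy_truth, k=1)
top2 = top_k_accuracy(toy_scores, toy_truth, k=2)
print(f"top-1 accuracy: {top1:.2f}, top-2 accuracy: {top2:.2f}")
```

With k set to 1 this reduces to ordinary accuracy, which is why the top-five figures in the abstract (0.88 and 0.86) are necessarily at least as high as the single-disease figures (0.59 and 0.57).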