A systematic review of the applications of artificial intelligence and machine learning in autoimmune diseases
Open Access
- 9 March 2020
- journal article
- review article
- Published by Springer Science and Business Media LLC in npj Digital Medicine
- Vol. 3 (1), 1-11
- https://doi.org/10.1038/s41746-020-0229-3
Abstract
Autoimmune diseases are chronic, multifactorial conditions. Through machine learning (ML), a branch of the wider field of artificial intelligence, it is possible to extract patterns within patient data, and exploit these patterns to predict patient outcomes for improved clinical management. Here, we surveyed the use of ML methods to address clinical problems in autoimmune disease. A systematic review was conducted using the MEDLINE, Embase, and Computers and Applied Sciences Complete databases. Relevant papers included "machine learning" or "artificial intelligence" and the autoimmune disease search term(s) in their title, abstract or keywords. Exclusion criteria were: studies not written in English, no real human patient data included, publication prior to 2001, studies that were not peer reviewed, non-autoimmune disease comorbidity research and review papers. Of 702 studies, 169 met the criteria for inclusion. Support vector machines and random forests were the most popular ML methods used. ML models using data on multiple sclerosis, rheumatoid arthritis and inflammatory bowel disease were most common. A small proportion of studies (7.7% or 13/169) combined different data types in the modelling process. Cross-validation, combined with a separate testing set for more robust model evaluation, occurred in 8.3% of papers (14/169). The field may benefit from adopting a best practice of validation, cross-validation and independent testing of ML models. Many models achieved good predictive results in simple scenarios (e.g. classification of cases and controls). Progression to more complex predictive models may be achievable in future through integration of multiple data types.
This publication has 190 references indexed in Scilit.
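The evaluation practice the abstract recommends (cross-validation combined with a separate, independent test set) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the reviewed studies: the data are synthetic rather than patient data, the "model" is a toy one-feature threshold classifier, and the helper names (`make_toy_data`, `fit_threshold`, `k_fold_indices`) are hypothetical.

```python
import random
from statistics import mean


def make_toy_data(n=200, seed=0):
    """Synthetic two-class data (NOT patient data): one noisy feature per sample."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2                           # balanced classes 0 and 1
        feature = rng.gauss(2.0 * label, 1.0)   # class means 0.0 and 2.0
        data.append((feature, label))
    rng.shuffle(data)
    return data


def fit_threshold(train):
    """Toy 'model': the midpoint between the two class means."""
    mean_0 = mean(x for x, y in train if y == 0)
    mean_1 = mean(x for x, y in train if y == 1)
    return (mean_0 + mean_1) / 2.0


def accuracy(threshold, rows):
    """Fraction of rows where 'feature above threshold' matches label 1."""
    return mean(1.0 if (x > threshold) == (y == 1) else 0.0 for x, y in rows)


def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k contiguous, non-overlapping folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, val_idx
        start += size


data = make_toy_data()

# Step 1: set aside an independent test set that is never used during development.
n_test = len(data) // 5
test_set, dev_set = data[:n_test], data[n_test:]

# Step 2: k-fold cross-validation on the development set only.
cv_scores = []
for train_idx, val_idx in k_fold_indices(len(dev_set), k=5):
    train = [dev_set[i] for i in train_idx]
    val = [dev_set[i] for i in val_idx]
    cv_scores.append(accuracy(fit_threshold(train), val))

# Step 3: refit on all development data and evaluate once on the held-out test set.
final_threshold = fit_threshold(dev_set)
test_accuracy = accuracy(final_threshold, test_set)

print(f"CV accuracy: {mean(cv_scores):.3f}; test accuracy: {test_accuracy:.3f}")
```

The point of the split is that the test set plays no part in fitting or model selection, so the final figure is not inflated by choices tuned on the cross-validation folds.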