Ada-WHIPS: explaining AdaBoost classification with applications in the health sciences
Open Access
- 2 October 2020
- journal article
- research article
- Published by Springer Science and Business Media LLC in BMC Medical Informatics and Decision Making
- Vol. 20 (1), 1-25
- https://doi.org/10.1186/s12911-020-01201-2
Abstract
Computer Aided Diagnostics (CAD) can help medical practitioners make critical decisions about their patients’ disease conditions. Practitioners require access to the chain of reasoning behind CAD to build trust in the CAD advice and to supplement their own expertise. Yet, CAD systems might be based on black box machine learning models and high-dimensional data sources such as electronic health records, magnetic resonance imaging scans and cardiotocograms. These foundations make interpretation and explanation of the CAD advice very challenging. This challenge is recognised throughout the machine learning research community. eXplainable Artificial Intelligence (XAI) is emerging as one of the most important research areas of recent years because it addresses the interpretability and trust concerns of critical decision makers, including those in clinical and medical practice. In this work, we focus on AdaBoost, a black box model that has been widely adopted in the CAD literature. We address the challenge of explaining AdaBoost classification with a novel algorithm that extracts simple, logical rules from AdaBoost models. Our algorithm, Adaptive-Weighted High Importance Path Snippets (Ada-WHIPS), makes use of AdaBoost’s adaptive classifier weights. Using a novel formulation, Ada-WHIPS uniquely redistributes the weights among individual decision nodes of the internal decision trees of the AdaBoost model. Then, a simple heuristic search of the weighted nodes finds a single rule that dominated the model’s decision. We compare the explanations generated by our novel approach with the state of the art in an experimental study. We evaluate the derived explanations with simple statistical tests of well-known quality measures, precision and coverage, and a novel measure, stability, that is better suited to the XAI setting.
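As background to the "adaptive classifier weights" mentioned above, the following minimal sketch shows how a boosting round's weight can be computed from its weighted training error. It assumes the SAMME formulation of multi-class AdaBoost; the function name `samme_classifier_weight` is hypothetical. It does not reproduce the Ada-WHIPS redistribution of weights to decision nodes, which the abstract does not specify.

```python
import math


def samme_classifier_weight(weighted_error: float, n_classes: int = 2) -> float:
    """Weight of one base classifier under the SAMME form of AdaBoost.

    Assumption: alpha = ln((1 - err) / err) + ln(K - 1), where err is the
    weighted training error of the base classifier and K is the number of
    classes. For K = 2 the second term vanishes.
    """
    if not 0.0 < weighted_error < 1.0:
        raise ValueError("weighted_error must lie strictly between 0 and 1")
    if n_classes < 2:
        raise ValueError("n_classes must be at least 2")
    return math.log((1.0 - weighted_error) / weighted_error) + math.log(n_classes - 1)


# A more accurate base classifier receives a larger weight in the ensemble vote.
strong = samme_classifier_weight(0.1)
weak = samme_classifier_weight(0.4)
print(strong > weak)
```

Under this assumption, a binary base classifier that is no better than chance (error 0.5) receives a weight of zero, so it contributes nothing to the ensemble's decision.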
Experiments on 9 CAD-related data sets showed that Ada-WHIPS explanations consistently generalise better (mean coverage 15%-68%) than the state of the art while remaining competitive for specificity (mean precision 80%-99%). A very small trade-off in specificity is shown to guard against over-fitting, which is a known problem in state-of-the-art methods. The experimental results demonstrate the benefits of using our novel algorithm for explaining CAD AdaBoost classifiers widely found in the literature. Our tightly coupled, AdaBoost-specific approach outperforms model-agnostic explanation methods and should be considered by practitioners looking for an XAI solution for this class of models.
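The precision and coverage figures quoted above can be illustrated with a small sketch. It assumes the common definitions for a single rule: coverage is the fraction of instances that satisfy the rule, and precision is the fraction of those covered instances that carry the class the rule predicts. The rule encoding, the function names and the toy data are hypothetical, and the paper's exact evaluation protocol may differ.

```python
from typing import List, Sequence, Tuple

# Hypothetical rule encoding: a conjunction of (feature index, operator, threshold).
Condition = Tuple[int, str, float]


def covers(rule: Sequence[Condition], x: Sequence[float]) -> bool:
    """Return True if instance x satisfies every condition in the rule."""
    for feature, op, threshold in rule:
        if op == "<=":
            satisfied = x[feature] <= threshold
        elif op == ">":
            satisfied = x[feature] > threshold
        else:
            raise ValueError(f"unsupported operator: {op!r}")
        if not satisfied:
            return False
    return True


def precision_and_coverage(
    rule: Sequence[Condition],
    X: List[Sequence[float]],
    y: Sequence[int],
    target: int,
) -> Tuple[float, float]:
    """Compute (precision, coverage) of a rule that predicts class `target`."""
    if not X:
        raise ValueError("X must contain at least one instance")
    covered_labels = [label for x, label in zip(X, y) if covers(rule, x)]
    coverage = len(covered_labels) / len(X)
    if not covered_labels:
        return 0.0, coverage  # convention used here: precision 0 when nothing is covered
    precision = sum(1 for label in covered_labels if label == target) / len(covered_labels)
    return precision, coverage


# Toy data, invented for illustration only.
X = [[1.0, 5.0], [2.0, 7.0], [3.0, 2.0], [4.0, 8.0]]
y = [1, 1, 0, 0]
rule = [(1, ">", 4.0)]  # "feature 1 is greater than 4.0"
precision, coverage = precision_and_coverage(rule, X, y, target=1)
print(precision, coverage)
```

In this toy case the rule covers three of the four instances, two of which belong to the target class. A rule with more conditions would typically cover fewer instances and may gain precision, which is the trade-off between generalisation and specificity that the results describe.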