A study in transfer learning: leveraging data from multiple hospitals to enhance hospital-specific predictions
Open Access
- 1 July 2014
- journal article
- Published by Oxford University Press (OUP) in Journal of the American Medical Informatics Association
- Vol. 21 (4), 699-706
- https://doi.org/10.1136/amiajnl-2013-002162
Abstract
Background: Data-driven risk stratification models built using data from a single hospital often have a paucity of training data. However, leveraging data from other hospitals can be challenging owing to institutional differences with patients and with data coding and capture.
Objective: To investigate three approaches to learning hospital-specific predictions about the risk of hospital-associated infection with Clostridium difficile, and perform a comparative analysis of the value of different ways of using external data to enhance hospital-specific predictions.
Materials and methods: We evaluated each approach on 132 853 admissions from three hospitals, varying in size and location. The first approach was a single-task approach, in which only training data from the target hospital (ie, the hospital for which the model was intended) were used. The second used only data from the other two hospitals. The third approach jointly incorporated data from all hospitals while seeking a solution in the target space.
Results: The relative performance of the three different approaches was found to be sensitive to the hospital selected as the target. However, incorporating data from all hospitals consistently had the highest performance.
Discussion: The results characterize the challenges and opportunities that come with (1) using data or models from collections of hospitals without adapting them to the site at which the model will be used, and (2) using only local data to build models for small institutions or rare events.
Conclusions: We show how external data from other hospitals can be successfully and efficiently incorporated into hospital-specific models.
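The three approaches described under Materials and methods differ mainly in which admissions reach the learner. The sketch below is illustrative only and is not the authors' implementation: it assumes every hospital shares one feature space, uses a plain weighted logistic regression, and stands in for "seeking a solution in the target space" by up-weighting target-hospital rows. The function names, the `target_weight` parameter, and the hospital keys are all hypothetical.

```python
import numpy as np


def build_training_sets(data, target, target_weight=3.0):
    """Assemble the three training sets compared in the abstract.

    data: dict mapping hospital name -> (X, y), all with the same columns
          (an assumption; real sites may code features differently).
    target_weight: hypothetical up-weighting of target-hospital rows, used
          here as a simple stand-in for a target-focused joint objective.
    """
    X_t, y_t = data[target]
    sources = [data[h] for h in data if h != target]
    X_s = np.vstack([X for X, _ in sources])
    y_s = np.concatenate([y for _, y in sources])
    return {
        # Approach 1: only the target hospital's own admissions.
        "target_only": (X_t, y_t, np.ones(len(y_t))),
        # Approach 2: only admissions from the other hospitals.
        "source_only": (X_s, y_s, np.ones(len(y_s))),
        # Approach 3: all hospitals, with target rows weighted more heavily.
        "all_hospitals": (
            np.vstack([X_t, X_s]),
            np.concatenate([y_t, y_s]),
            np.concatenate([np.full(len(y_t), target_weight),
                            np.ones(len(y_s))]),
        ),
    }


def fit_logistic(X, y, sample_weight=None, l2=1.0, lr=0.1, n_iter=500):
    """Weighted, L2-regularized logistic regression via gradient descent."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    sw = np.ones(n) if sample_weight is None else np.asarray(sample_weight, dtype=float)
    sw = sw / sw.sum()  # normalize so the step size does not depend on n
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted risk per admission
        err = sw * (p - y)                      # weighted residuals
        w -= lr * (X.T @ err + l2 * w / n)
        b -= lr * err.sum()
    return w, b
```

Each entry returned by `build_training_sets` can be passed directly as `fit_logistic(*sets["all_hospitals"])`. In every case the resulting model would be evaluated on held-out admissions from the target hospital only, since that is the site where it is meant to be used.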