A Bayesian Inference Framework to Reconstruct Transmission Trees Using Epidemiological and Genetic Data

Open Access

15 November 2012

journal article
research article
Published by Public Library of Science (PLoS) in PLoS Computational Biology

Vol. 8 (11), e1002768
https://doi.org/10.1371/journal.pcbi.1002768

Abstract

The accurate identification of the route of transmission taken by an infectious agent through a host population is critical to understanding its epidemiology and informing measures for its control. However, reconstruction of transmission routes during an epidemic is often an underdetermined problem: data about the location and timings of infections can be incomplete, inaccurate, and compatible with a large number of different transmission scenarios. For fast-evolving pathogens like RNA viruses, inference can be strengthened by using genetic data, which are now easy and affordable to generate. However, significant statistical challenges remain to be overcome in the full integration of these different data types if transmission trees are to be reliably estimated. We present here a framework leading to a Bayesian inference scheme that combines genetic and epidemiological data and can reconstruct the most likely transmission patterns and infection dates. After testing our approach with simulated data, we apply the method to two UK epidemics of Foot-and-Mouth Disease Virus (FMDV): the 2007 outbreak, and a subset of the large 2001 epidemic. In the first case, we are able to confirm the role of a specific premise as the link between the two phases of the epidemic, while transmissions more densely clustered in space and time remain harder to resolve. When we consider data collected from the 2001 epidemic during a time of national emergency, our inference scheme robustly infers transmission chains and uncovers the presence of undetected premises, thus providing a useful tool for epidemiological studies in real time. The generation of genetic data is becoming routine in epidemiological investigations, but the development of analytical tools maximizing the value of these data remains a priority. Our method, while applied here in the context of FMDV, is general and with slight modification can be used in any situation where both spatiotemporal and genetic data are available.

Author Summary

To control the spread of an infectious disease most effectively, we need to better understand how pathogens spread within a host population, yet this is something we know remarkably little about. Cases close together in their locations and timing are often thought to be linked, but timings and locations alone are usually consistent with many different scenarios of who infected whom. The genomes of many pathogens evolve so quickly relative to the rate at which they are transmitted that even over single short epidemics we can identify which hosts contain pathogens that are most closely related to each other. This information is valuable because, when combined with the spatial and timing data, it should help us infer more reliably who transmitted to whom over the course of a disease outbreak. However, doing this so that these three different lines of evidence are appropriately weighted and interpreted remains a major statistical challenge. In our paper we present a new statistical method for combining these different types of data and estimating trees that show how infection was most likely transmitted between individuals in a host population. Because sequencing genetic material has become so affordable, we think methods like ours will become very important for future epidemiology.
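
As a rough illustration of how timing, spatial, and genetic evidence can be weighted together, the sketch below scores each candidate infector of a single case by the product of three likelihood terms and normalises the result. It is not the framework described in the paper: the case data, the exponential timing and distance terms, the Poisson substitution model, the parameter values, and the function names are all assumptions chosen for illustration, and infection dates are treated as known rather than inferred.

```python
import math

# Hypothetical toy data: (name, infection_day, x_km, y_km, sequence).
CASES = [
    ("A", 0.0, 0.0, 0.0, "ACGTACGTAC"),
    ("B", 5.0, 1.0, 0.0, "ACGTACGTAT"),
    ("C", 6.0, 8.0, 0.0, "ACGAACGGAT"),
    ("D", 10.0, 1.5, 0.5, "ACGTACGTTT"),
]


def hamming(a, b):
    """Number of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def log_poisson(k, mean):
    """Log probability of observing k events under a Poisson(mean)."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)


def infector_posterior(cases, target, mean_gap=5.0, kernel_km=2.0,
                       subs_per_day=0.2):
    """Normalised weight of each earlier case being the infector of `target`.

    Assumed toy model (not the paper's): exponential density for the time
    gap, exponential decay with distance, Poisson count of substitutions,
    and a uniform prior over candidate infectors.
    """
    _, t_t, x_t, y_t, seq_t = next(c for c in cases if c[0] == target)
    log_w = {}
    for name, t, x, y, seq in cases:
        gap = t_t - t
        if name == target or gap <= 0:
            continue  # an infector must have been infected earlier
        dist = math.hypot(x_t - x, y_t - y)
        log_time = -gap / mean_gap - math.log(mean_gap)
        log_space = -dist / kernel_km
        log_genetic = log_poisson(hamming(seq, seq_t), subs_per_day * gap)
        log_w[name] = log_time + log_space + log_genetic
    if not log_w:
        return {}
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_w.values())
    weights = {n: math.exp(v - top) for n, v in log_w.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


if __name__ == "__main__":
    for name, prob in sorted(infector_posterior(CASES, "D").items()):
        print(f"{name} -> D: {prob:.3f}")
```

In this toy data set, case B is nearby, was infected five days before D, and differs from D at a single site, so it receives most of the weight. Case C is close in time but distant in space and in sequence, which shows how the extra lines of evidence can separate candidates that timing alone would not. A full reconstruction would also have to treat infection dates and the whole tree as unknowns, which this per-case calculation does not attempt.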
