Universal probabilistic programming offers a powerful approach to statistical phylogenetics
Open Access
- 24 February 2021
- journal article
- research article
- Published by Springer Science and Business Media LLC in Communications Biology
- Vol. 4 (1), 1-10
- https://doi.org/10.1038/s42003-021-01753-7
Abstract
Statistical phylogenetic analysis currently relies on complex, dedicated software packages, making it difficult for evolutionary biologists to explore new models and inference strategies. Recent years have seen more generic solutions based on probabilistic graphical models, but this formalism can only partly express phylogenetic problems. Here, we show that universal probabilistic programming languages (PPLs) solve the expressivity problem, while still supporting automated generation of efficient inference algorithms. To prove the latter point, we develop automated generation of sequential Monte Carlo (SMC) algorithms for PPL descriptions of arbitrary biological diversification (birth-death) models. SMC is a new inference strategy for these problems, supporting both parameter inference and efficient estimation of Bayes factors that are used in model testing. We take advantage of this in automatically generating SMC algorithms for several recent diversification models that have been difficult or impossible to tackle previously. Finally, applying these algorithms to 40 bird phylogenies, we show that models with slowing diversification, constant turnover and many small shifts generally explain the data best. Our work opens up several related problem domains to PPL approaches, and shows that few hurdles remain before these techniques can be effectively applied to the full range of phylogenetic models.
Funding Information
- Vetenskapsrådet (2018-04620, 2013-4853)
- Stiftelsen för Strategisk Forskning (RIT15-0012)
- European Union Research and Innovation Program, Marie Skłodowska-Curie Actions (898120)