Using Robinson-Foulds supertrees in divide-and-conquer phylogeny estimation
Open Access
- 28 June 2021
- journal article
- research article
- Published by Springer Science and Business Media LLC in Algorithms for Molecular Biology
- Vol. 16 (1), 1-18
- https://doi.org/10.1186/s13015-021-00189-2
Abstract
One of the Grand Challenges in Science is the construction of the Tree of Life, an evolutionary tree containing several million species, spanning all life on earth. However, the construction of the Tree of Life is enormously computationally challenging, as all the current most accurate methods are either heuristics for NP-hard optimization problems or Bayesian MCMC methods that sample from tree space. One of the most promising approaches for improving scalability and accuracy for phylogeny estimation uses divide-and-conquer: a set of species is divided into overlapping subsets, trees are constructed on the subsets, and then merged together using a “supertree method”. Here, we present Exact-RFS-2, the first polynomial-time algorithm to find an optimal supertree of two trees, using the Robinson-Foulds Supertree (RFS) criterion (a major approach in supertree estimation that is related to maximum likelihood supertrees), and we prove that finding the RFS of three input trees is NP-hard. Exact-RFS-2 is available in open source form on Github at https://github.com/yuxilin51/GreedyRFS.Keywords
Funding Information
- National Science Foundation (1535977, 1513629, 1458652, DGE-1144245)
This publication has 44 references indexed in Scilit:
- DACTAL: divide-and-conquer trees (almost) without alignmentsBioinformatics, 2012
- MRL and SuperFine+MRL: new supertree methodsAlgorithms for Molecular Biology, 2012
- Split-based computation of majority-rule supertreesBMC Evolutionary Biology, 2011
- An experimental study of Quartets MaxCut and other supertree methodsAlgorithms for Molecular Biology, 2011
- Robinson-Foulds SupertreesAlgorithms for Molecular Biology, 2010
- Supertree Methods for Building the Tree of LifePublished by Taylor & Francis Ltd ,2006
- Improved Parameterized Complexity of the Maximum Agreement Subtree and Maximum Compatible Tree ProblemsIEEE/ACM Transactions on Computational Biology and Bioinformatics, 2006
- A few logs suffice to build (almost) all trees (I)Random Structures & Algorithms, 1999
- Gene Trees in Species TreesSystematic Biology, 1997
- Phylogenetic inference based on matrix representation of treesMolecular Phylogenetics and Evolution, 1992