Evolutionary history, potential intermediate animal host, and cross-species analyses of SARS-CoV-2
Open Access
- 1 June 2020
- journal article
- research article
- Published by Wiley in Journal of Medical Virology
- Vol. 92 (6), 602-611
- https://doi.org/10.1002/jmv.25731
Abstract
To investigate the evolutionary history of the recent outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in China, a total of 70 genomes of virus strains from China and elsewhere with sampling dates between 24 December 2019 and 3 February 2020 were analyzed. To explore the potential intermediate animal host of the SARS-CoV-2 virus, we reanalyzed virome data sets from pangolins and representative SARS-related coronavirus isolates from bats, with particular attention paid to the spike glycoprotein gene. We performed phylogenetic, split network, transmission network, likelihood-mapping, and comparative analyses of the genomes. Based on Bayesian time-scaled phylogenetic analysis using the tip-dating method, we estimated the time to the most recent common ancestor and evolutionary rate of SARS-CoV-2, which ranged from 22 to 24 November 2019 and 1.19 to 1.31 × 10⁻³ substitutions per site per year, respectively. Our results also revealed that the BetaCoV/bat/Yunnan/RaTG13/2013 virus was more similar to the SARS-CoV-2 virus than the coronavirus obtained from the two pangolin samples (SRR10168377 and SRR10168378). We also identified a unique peptide (PRRA) insertion in the human SARS-CoV-2 virus, which may be involved in the proteolytic cleavage of the spike protein by cellular proteases, and thus could impact host range and transmissibility. Interestingly, the coronavirus carried by pangolins did not have the RRAR motif. Therefore, we concluded that the human SARS-CoV-2 virus, which is responsible for the recent outbreak of COVID-19, did not come directly from pangolins.
Funding Information
- National Natural Science Foundation of China (31470268)
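As a rough illustration of what the rate and date estimates in the abstract imply, the sketch below converts the per-site rate into an expected number of substitutions per genome. It is a back-of-the-envelope calculation under a constant-rate assumption, not the Bayesian tip-dating analysis the authors describe. The genome length (29,903 nt, the length of the Wuhan-Hu-1 reference sequence) and all function and variable names are assumptions introduced here; they do not come from the abstract.

```python
from datetime import date

# Values quoted in the abstract.
RATE_LOW = 1.19e-3    # substitutions per site per year
RATE_HIGH = 1.31e-3   # substitutions per site per year
TMRCA_EARLY = date(2019, 11, 22)
TMRCA_LATE = date(2019, 11, 24)
LAST_SAMPLE = date(2020, 2, 3)

# Assumption (not from the abstract): approximate genome length in
# nucleotides, taken from the Wuhan-Hu-1 reference sequence.
GENOME_LENGTH = 29_903


def expected_substitutions(rate, sites, days):
    """Expected substitutions along one lineage at a constant rate."""
    return rate * sites * days / 365.25


def summarize():
    """Return the range of elapsed days and expected substitutions."""
    days_min = (LAST_SAMPLE - TMRCA_LATE).days
    days_max = (LAST_SAMPLE - TMRCA_EARLY).days
    return {
        "days_min": days_min,
        "days_max": days_max,
        # Genome-wide rate: per-site rate times the number of sites.
        "per_genome_per_year_low": RATE_LOW * GENOME_LENGTH,
        "per_genome_per_year_high": RATE_HIGH * GENOME_LENGTH,
        # Substitutions accumulated from the common ancestor to the
        # most recently sampled genome.
        "subs_min": expected_substitutions(RATE_LOW, GENOME_LENGTH, days_min),
        "subs_max": expected_substitutions(RATE_HIGH, GENOME_LENGTH, days_max),
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value:.2f}" if isinstance(value, float) else f"{key}: {value}")
```

Under these assumptions the estimates correspond to roughly 36 to 39 substitutions per genome per year, or on the order of 7 to 8 substitutions along a single lineage between the inferred common ancestor and the latest sampled genome.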