A snapshot of SARS-CoV-2 genome availability up to 30th March, 2020 and its implications
Open Access
- 5 April 2020
- preprint content
- Published by Cold Spring Harbor Laboratory
Abstract
The SARS-CoV-2 pandemic has been growing exponentially, affecting nearly 900 thousand people and causing enormous distress to economies and societies worldwide. A plethora of analyses based on viral sequences has already been published, in scientific journals as well as through non-peer-reviewed channels, to investigate SARS-CoV-2 genetic heterogeneity and spatiotemporal dissemination. We examined the full genome sequences currently available to assess whether they contain sufficient information for reliable phylogenetic and phylogeographic studies in the countries with the highest toll of confirmed cases. Although the number of available full genomes is growing daily, and the full dataset contains sufficient phylogenetic information to allow reliable inference of phylogenetic relationships, country-specific SARS-CoV-2 datasets still present severe limitations. Studies assessing within-country spread or transmission clusters should be considered preliminary at best, or hypothesis-generating. Hence, continuing concerted efforts are needed to increase the number and quality of the sequences required for robust tracing of the epidemic.

Significance Statement: Although genome sequences of SARS-CoV-2 are growing daily and contain sufficient phylogenetic information, country-specific data still present severe limitations and should be interpreted with caution.