Analysis of amino acid change dynamics reveals SARS-CoV-2 variant emergence
Preprint
- 13 July 2021
- research article
- Published by Cold Spring Harbor Laboratory
Abstract
Since its emergence in late 2019, the spread of SARS-CoV-2 has been associated with the evolution of its viral genome [1,2]. The co-occurrence of specific amino acid changes, collectively named a ‘virus variant’, requires scrutiny, as variants may hugely impact the agent’s transmission, pathogenesis, or antigenicity; variant evolution is studied using phylogenetics [3–6]. Yet this problem has never been tackled by mining the data with ad hoc analysis techniques. Here we show that the emergence of variants can in fact be traced through data-driven methods, further capitalizing on the value of large collections of SARS-CoV-2 sequences. For all countries with sufficient data, we compute weekly counts of amino acid changes, unveil time-varying clusters of changes with similar, rapidly growing dynamics, and then follow their evolution. Our method succeeds in associating clusters with variants of interest/concern in a timely manner, provided their change composition is well characterized. This allows us to detect variants’ emergence, rise, peak, and eventual decline under the competitive pressure of another variant. Our early warning system, relying exclusively on deposited sequences, shows the power of big data in this context and adds to the calls for widespread public SARS-CoV-2 genome sequencing to improve surveillance and control of the COVID-19 pandemic.
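The pipeline summarized in the abstract (weekly counts of amino acid changes, then grouping of changes whose dynamics rise together) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `weekly_prevalence` and `growing_clusters`, the thresholds `min_growth` and `tolerance`, the greedy grouping rule, and the toy records are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from datetime import date


def weekly_prevalence(records):
    """Return (weeks, {change: [fraction of sequences per week]}).

    `records` is an iterable of (collection_date, list_of_changes) pairs;
    this input layout is an assumption, not the paper's data format.
    """
    totals = Counter()             # sequences deposited per ISO week
    counts = defaultdict(Counter)  # change -> ISO week -> sequences carrying it
    for day, changes in records:
        iso = day.isocalendar()
        week = (iso[0], iso[1])    # (ISO year, ISO week number)
        totals[week] += 1
        for change in set(changes):
            counts[change][week] += 1
    weeks = sorted(totals)
    series = {c: [counts[c][w] / totals[w] for w in weeks] for c in counts}
    return weeks, series


def growing_clusters(series, min_growth=0.2, tolerance=0.1):
    """Greedily group rising changes whose prevalence curves stay close.

    A change is 'rising' if its prevalence grew by at least `min_growth`
    between the first and last week; both thresholds are illustrative.
    """
    rising = {c: s for c, s in series.items() if s[-1] - s[0] >= min_growth}
    clusters = []
    for change in sorted(rising):
        for cluster in clusters:
            reference = rising[cluster[0]]
            gap = max(abs(a - b) for a, b in zip(reference, rising[change]))
            if gap <= tolerance:
                cluster.append(change)
                break
        else:
            clusters.append([change])
    return clusters


def toy_records():
    """Hypothetical sequences: two changes rise together, one stays fixed."""
    variant = ["S:D614G", "S:N501Y", "S:A570D"]
    background = ["S:D614G"]
    return (
        [(date(2021, 1, 4), variant)] + [(date(2021, 1, 5), background)] * 3
        + [(date(2021, 1, 11), variant)] * 2 + [(date(2021, 1, 12), background)] * 2
        + [(date(2021, 1, 18), variant)] * 4
    )


if __name__ == "__main__":
    weeks, series = weekly_prevalence(toy_records())
    print(weeks)
    print(growing_clusters(series))
```

In this toy data the two co-occurring changes form one cluster, while the change present in every sequence is excluded because its prevalence does not grow. A real analysis would need per-country data, a more robust similarity measure, and a mapping from clusters to known variant definitions, as the abstract describes.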
Other Versions
- Published version: Scientific Reports, 11
This publication has 38 references indexed in Scilit:
- COVID-19 in Amazonas, Brazil, was driven by the persistence of endemic lineages and P.1 emergence. Nature Medicine, 2021
- Efficacy of the ChAdOx1 nCoV-19 Covid-19 Vaccine against the B.1.351 Variant. The New England Journal of Medicine, 2021
- Multiple SARS-CoV-2 variants escape neutralization by vaccine-induced humoral immunity. Cell, 2021
- Sensitivity of infectious SARS-CoV-2 B.1.1.7 and B.1.351 variants to neutralizing antibodies. Nature Medicine, 2021
- One year of SARS-CoV-2 evolution. Cell Host & Microbe, 2021
- Genetic Variants of SARS-CoV-2—What Do They Mean? JAMA, 2021
- Mutations on COVID-19 diagnostic targets. Genomics, 2020
- A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology. Nature Microbiology, 2020
- GISAID: Global initiative on sharing all influenza data – from vision to reality. Eurosurveillance, 2017