Analysis of the ARTIC version 3 and version 4 SARS-CoV-2 primers and their impact on the detection of the G142D amino acid substitution in the spike protein
Preprint
- 28 September 2021
- Published by Cold Spring Harbor Laboratory
Abstract
The ARTIC Network provides a common resource of PCR primer sequences and recommendations for amplifying SARS-CoV-2 genomes. The initial tiling strategy was developed with the reference genome Wuhan-01, and subsequent iterations have addressed areas of low amplification and sequence dropout. Recently, a new version (V4) was released, based on new variant genome sequences, in response to the realization that some V3 primers were located in regions with key mutations. Herein, we compare the performance of the ARTIC V3 and V4 primer sets with a matched set of 663 SARS-CoV-2 clinical samples sequenced with an Illumina NovaSeq 6000 instrument. We observe general improvements in sequencing depth and quality, and improved resolution of the SNP causing the D950N variation in the spike protein. Importantly, we also find nearly universal presence of spike protein substitution G142D in Delta-lineage samples. Due to the prior release and widespread use of the ARTIC V3 primers during the initial surge of the Delta variant, it is likely that the G142D amino acid substitution is substantially underrepresented among early Delta variant genomes deposited in public repositories. In addition to the improved performance of the ARTIC V4 primer set, this study also illustrates the importance of the primer scheme in downstream analyses.
Importance: ARTIC Network primers are commonly used by laboratories worldwide to amplify and sequence SARS-CoV-2 present in clinical samples. As new variants have evolved and spread, it was found that the V3 primer set poorly amplified several key mutations. In this report, we compare the results of sequencing a matched set of samples with the V3 and V4 primer sets. We find that adoption of the ARTIC V4 primer set is critical for accurate sequencing of the SARS-CoV-2 spike region. The absence of metadata describing the primer scheme used will negatively impact the downstream use of publicly available SARS-CoV-2 sequencing reads and assembled genomes.
Other Versions
- Published version: Microbiology Spectrum, 9