High Prevalence of SARS-CoV-2 Genetic Variation and D614G Mutation in Pediatric Patients With COVID-19

Abstract
The full spectrum of COVID-19 disease phenotypes and viral genotypes has yet to be thoroughly explored in children. Here, we analyze the relationships between viral genetic variants and clinical characteristics in children. Whole genome sequencing was performed on respiratory specimens collected from all SARS-CoV-2 positive children (n=141) between March 13 and June 16, 2020. Viral genetic variations across the SARS-CoV-2 genome were identified and investigated to evaluate genomic correlates of disease severity. Higher viral load was detected in symptomatic patients (p=0.0007) and in children <5 years old (p=0.0004). Genomic analysis revealed a mean pairwise difference of 10.8 single nucleotide variants (SNVs), and the majority (55.4%) of SNVs led to an amino acid change in the viral proteins. The D614G mutation in the spike protein was present in 99.3% of the isolates. The calculated viral mutational rate of 22.2 substitutions/year contrasts with the 13.5 substitutions/year observed in California isolates without the D614G mutation. Phylogenetic clade 20C was associated with severe cases of COVID-19 (p=0.0467, OR=6.95). Epidemiological investigation revealed major representation of 3 of the 5 major Nextstrain clades (20A, 20B, and 20C), consistent with multiple introductions of SARS-CoV-2 into Southern California. Genomic evaluation demonstrated greater than expected genetic diversity, presence of the D614G mutation, increased mutation rate, and evidence of multiple introductions of SARS-CoV-2 into Southern California. Our findings suggest a possible association of phylogenetic clade 20C with severe disease, but the small sample size precludes a definitive conclusion. Our study warrants larger, multi-institutional genomic evaluation and has implications for infection control practices.
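
As an illustration of the mean pairwise difference statistic reported above, the following minimal Python sketch counts nucleotide differences between every pair of aligned sequences and averages them. It is not the pipeline used in this study: the function names, the choice to ignore gap and `N` positions, and the toy alignment are all assumptions made for illustration only.

```python
from itertools import combinations


def snv_distance(seq_a, seq_b):
    """Count differing positions between two aligned sequences.

    Assumption: positions with a gap ('-') or ambiguous base ('N')
    in either sequence are skipped rather than counted as variants.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    ignored = {"-", "N"}
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignored and b not in ignored
    )


def mean_pairwise_difference(sequences):
    """Average the SNV distance over all unordered pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(snv_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment, NOT data from the study.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTNCGT"]
print(mean_pairwise_difference(toy_alignment))
```

In practice the input would be a multiple sequence alignment of full-length genomes, and the handling of ambiguous bases and low-coverage regions would need to follow the sequencing pipeline's own conventions.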