key: cord-0682292-tw3z6rsx authors: LoTempio, J. E.; Billings, E. A.; Draper, K.; Ralph, C.; Moshgriz, M.; Duong, N.; Dien Bard, J.; Gai, X.; Wessel, D.; DeBiasi, R. L.; Campos, J. M.; Vilain, E.; Delaney, M.; Michael, D. G. title: Novel SARS-CoV-2 spike variant identified through viral genome sequencing of the pediatric Washington D.C. COVID-19 outbreak date: 2021-02-10 journal: nan DOI: 10.1101/2021.02.08.21251344 sha: 58edcef6d9519934eaf1c95c62335532f6ad78d1 doc_id: 682292 cord_uid: tw3z6rsx

The SARS-CoV-2 virus has caused a global pandemic, severely impacting everyday life. Significant resources have been dedicated towards profiling the viral genome in the adult population. We present an analysis of viral genomes acquired from pediatric patients presenting to Children's National Hospital in Washington, D.C., including 24 with primary SARS-CoV-2 infection and 3 with Multisystem Inflammatory Syndrome in Children (MIS-C) undergoing treatment at our facility. Viral genome analysis using next-generation sequencing indicated that approximately 81% of the analyzed strains were of the GH clade, 7% of the cases belonged to the GR clade, and 12% of the cases belonged to the S, V, or G clades. One sample, acquired from a neonatal patient, presented with the highest viral RNA load of all patients evaluated at our center. Viral sequencing of this sample identified a SARS-CoV-2 spike variant, S:N679S. Analysis of data deposited in the GISAID global database of viral sequences shows the S:N679S variant is present in eight other sequenced samples within the US mid-Atlantic region. The similarity of the regional sequences suggests transmission and persistence of the SARS-CoV-2 variant within the Capital region, underscoring the importance of increasing the frequency of SARS-CoV-2 genomic surveillance.

IMPORTANCE A variant in the SARS-CoV-2 spike protein was identified in a febrile neonate who was hospitalized with COVID-19.
This patient exhibited the highest viral RNA load of any COVID-19 patient tested at our center. Viral sequencing identified a spike protein variant, S:N679S, which is proximal to the cleavage site at residue 681. The SARS-CoV-2 surface spike is a protein trimer (three subunits) which serves as the key target for antibody therapies and vaccine development. Study of viral sequences from the GISAID database revealed eight related sequences from the US mid-Atlantic region. The identification of this variant in a very young patient, its critical location in the spike polyprotein, and the evidence that it has been detected in other patients in our region underscore the need for increased viral sequencing to monitor variant prevalence and emergence, which may have a direct impact on recommended public health measures and vaccination strategies.

INTRODUCTION

SARS-CoV-2, a positive-sense single-stranded RNA virus, is the causative agent of the ongoing COVID-19 pandemic (1-3). Reports in early 2020 suggested that children were spared the harshest manifestations of disease, with the majority of patients reported as asymptomatic (4). The first wave of outbreaks across Europe and the Americas demonstrated that this was not the case (5, 6), and in addition to the classical array of COVID-19 symptomatology, children were shown to be susceptible to a novel disease presentation,

The diagnosis of suspected COVID-19 patients at Children's National Hospital (CNH), which includes D.C., Maryland, Virginia, West Virginia, and Delaware in its hospital catchment area, is routinely performed using semi-quantitative RT-PCR commercial platforms with EUA-approved tests designed to assess multiple loci within the SARS-CoV-2 genome. When a test returns as positive, that

CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

19 outbreak within the D.C.
area followed trends observed in the adult population, as established by the clades of viruses deposited in the Global Initiative on Sharing All Influenza Data (GISAID) database and annotated for our geographic locale (Fig. 1B). This finding supports the hypothesis that the viral strains propagating in adults are similar to those in children.

France, with minor relationships to sequences deposited from Portugal, Australia, Israel, and New Zealand (Fig. 2).

The results point toward a European origin for the virus propagating in the US Capital region pediatric population. The observation of the US and UK as major locations of genetic similarity

variants in both the nucleoprotein (N:S193I) and non-structural protein 2 (NSP2:T371I) (Fig. 3).

rather than a sequencing error or artifact of PCR (Fig. 4B). To confirm the variant with an orthogonal sequencing technology, the sample was sent for Sanger sequencing, which confirmed the presence of the S:N679S variant (Fig. 4C).
The Sanger data analysis also confirmed the presence of the S:D614G variant in the viral genome, which is present in the majority of global samples due in part to its hypothesized greater infective potential (12). This is significant, as association of S:N679S with S:D614G may contribute to the persistence of the S:N679S variant.

To assess where the S:N679S variant is present in the community, we queried the GISAID database. At the time of the initial query, the GISAID database contained six high-quality complete genomes containing the S:N679S variant. All six of these genomes were deposited by labs in Maryland and Virginia. A re-query in mid-December 2020 identified an additional four samples from Australia and Japan, and a third query revealed a sequence from Brazil. Finally, a fourth query on January 12, 2021 revealed two more high-quality sequences from Delaware. Phylogenetic analysis (Fig. 5A) showed that this novel spike variant had emerged in two distinct viral clades which are geographically and genetically independent of each other. Variant profiles and lineage assignments suggest the singleton samples from Brazil and Japan,

To probe the similarity between the reported samples of US origin, we constructed a maximum-likelihood (ML) phylogeny with international samples for context (Fig. 5A). The tree topology among US mid-Atlantic regional samples is consistent with time-of-sampling metadata, which is

Canadian-American borders (Fig. 5B).

Variant ORF1a:S3885F, ORF1b:H2583Y, and ORF3a:S177I, as well as silent mutations C4276T, T6160C, C16293T, C16887T, C26222T. This predicted most recent common ancestor likely appeared in summer 2020.
Additional sequencing of archived samples will be required to assess this hypothesis.

cases. We leverage data from Australia (59.6% of confirmed cases, or 17,081/28,650 cases sequenced), Japan (3.3% or 9,885/298,000 cases), and Brazil (0.02% or 2,102/8.2 million cases). This differs from the UK, the world's largest COVID-19 viral genome contributor with 157,626

A key goal of precision medicine is to target care to those who need it most. An ideal COVID-19 diagnostic and triage algorithm would integrate features such as viral genomic profile, host innate errors in immunity, and viral load over the course of infection to inform care. Toward this end, we established systems to profile the SARS-CoV-2 genome and link viral genotype to pediatric disease outcomes. This report contains the initial analysis of twenty-seven patients and highlights the complexity of generating effective viral genotype-human phenotype correlations. Given the complex, multifactorial nature of COVID-19, larger pediatric studies which link to phenotypic outcomes will be required.
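The sequencing coverage figures quoted earlier in this section are simply sequenced genomes divided by confirmed cases. The following is a minimal sketch of that arithmetic, using only the counts given in the text; the function name `coverage_percent` and the dictionary `REPORTED_COUNTS` are illustrative and do not come from the original analysis. The Brazil case count is quoted only as a rounded 8.2 million, so that percentage is approximate and comes out between 0.02% and 0.03%.

```python
# Sketch: sequencing coverage = sequenced genomes / confirmed cases.
# Counts are those quoted in the text. The Brazil denominator is a rounded
# figure (8.2 million), so its percentage is only approximate.


def coverage_percent(sequenced: int, confirmed_cases: int) -> float:
    """Return the share of confirmed cases with a sequenced genome, in percent."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return 100.0 * sequenced / confirmed_cases


# (sequenced genomes, confirmed cases) per country, as quoted in the text.
REPORTED_COUNTS = {
    "Australia": (17_081, 28_650),
    "Japan": (9_885, 298_000),
    "Brazil": (2_102, 8_200_000),
}

# Percentage of confirmed cases sequenced, keyed by country.
coverage = {
    country: coverage_percent(sequenced, cases)
    for country, (sequenced, cases) in REPORTED_COUNTS.items()
}

if __name__ == "__main__":
    for country, pct in coverage.items():
        print(f"{country}: {pct:.2f}% of confirmed cases sequenced")
```

The same helper could be applied to any other jurisdiction once its sequenced-genome and case counts are known, which makes differences in surveillance depth between regions easy to compare.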
Brazil, Japan, and Australia have been assigned the GR clade, they do not belong to the same PANGOLIN lineage due to variation in their genomes, which we can see recapitulated in the tree topology of Fig. 5A. This is strong evidence for four different evolutionary events which are tolerated in GR and G SARS-CoV-2 clades.

The UK presents a strong case for widespread use of high-throughput sequencing technologies and rapid data sharing so that discoveries like these are more rapidly made, confirmed, and acted upon. Under current funding levels, the US CDC plans to fund the sequencing and release of at least 6,000 viruses per week as reported on January 4, 2021 (25, 26). This represents approximately 0.4% of cases, based on current daily caseloads in excess of 200,000 (roughly 1.4 million cases per week). This is still well below the UK's level of sequencing (4.99%), but is a commendable step in the right direction, assuming that the samples sequenced will be representative of the population, allowing for surveillance. Inclusion of data on disease severity, patient age and ethnicity, co-morbidities, and other relevant contextual data will help researchers ascertain the generalizability of sequences in hand, or adjust for confounding factors as needed. Further funds should be allocated to promote a collaborative effort to sequence biobanked samples in order to understand viral evolution and transmission paths.

Our analyses identified the S:N679S variant within a neonatal patient with a high observed viral load at presentation.
This single case observation currently represents insufficient evidence to propose a causal relationship between the S:N679S variant and increased viral loads or presentation at a very young age. Analysis of the GISAID data for pediatric enrichment was not possible due to the lack of patient metadata for records with this variant. The observation that this variant strain of SARS-CoV-2 is currently undergoing community transmission in the US mid-Atlantic area warrants continuous and rigorous monitoring. The SARS-CoV-2 spike protein not only mediates viral infectivity and cellular uptake, but is also a target for vaccine and monoclonal antibody therapeutic development. While vaccines are designed to elicit a polyclonal immune response, the primary target of the vaccine response is currently a

System manufactured by Invitrogen™. A total of 10 PCR reactions containing different forward and reverse primer pairings were prepared to confirm the observed variants. Thermocycler
Spike mutation D614G alters SARS-CoV-2 fitness
Prediction of proprotein convertase cleavage sites
Cleavage and Activation of the Severe Acute Respiratory Syndrome Coronavirus Spike Protein by Human Airway Trypsin-Like Protease
The role of furin cleavage site in SARS-CoV-2 spike protein-mediated membrane fusion in the presence or absence of trypsin
SARS-CoV-2 growth, furin-cleavage-site adaptation and neutralization using serum from acutely infected, hospitalized COVID-19 patients
Spike glycoprotein and host cell determinants of SARS-CoV-2 entry and cytopathic effects
Downloaded from Nextclade.
19. Excess Deaths Associated with COVID-19 CDC
New variant of SARS-CoV-2 in UK causes surge of COVID-19
The antibiotic resistance crisis: causes and threats
Antibiotics: Combatting tolerance to stop resistance. MBio 10.
25. Emerging SARS-CoV-2 Variants
CDC hopes to check more samples for new Covid strain
COVID-19) Update: FDA Authorizes Monoclonal Antibodies for Treatment of COVID-19.
US FDA
High Prevalence of SARS-CoV Genetic Variation and D614G Mutation in Pediatric Patients with COVID-19
Better: Lessons Learned on Data Sharing in COVID-19 Pandemic Can Inform Future
GISAID Data Access Terms of Use
Twelve years of SAMtools and BCFtools
CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice
Building Phylogenetic Trees from Molecular Data with MEGA
MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets
Interactive Tree of Life (iTOL) v4: Recent updates and new developments
CoV-GLUE: A Web Application for Tracking SARS-CoV-2 Genomic Variation
A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology
MAFFT multiple sequence alignment software version 7: Improvements in performance and usability
RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies
IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era

Data access. GISAID is presently the leader in viral sequence data sharing, having rapidly expanded their influenza data sharing capabilities to suit the COVID-19 pandemic (29). All data from samples outside of CNH were accessed from the GISAID database in accordance with their data sharing agreement (30). As of submission, sequences from CNH have not been assigned GISAID accession IDs, but will be available in that repository.
Tang X, Wu C, Li X, Song Y, Yao X, Wu X, Duan Y, Zhang H, Wang Y, Qian Z, Cui J, Lu
Regist. https://www.federalregister.gov/documents/2020/03/16/2020-05578/suspension-
risk-of