key: cord-260048-yis26g81
authors: McNamara, Ryan P.; Caro-Vegas, Carolina; Landis, Justin T.; Moorad, Razia; Pluta, Linda J.; Eason, Anthony B.; Thompson, Cecilia; Bailey, Aubrey; Villamor, Femi Cleola S.; Lange, Philip T.; Wong, Jason P.; Seltzer, Tischan; Seltzer, Jedediah; Zhou, Yijun; Vahrson, Wolfgang; Juarez, Angelica; Meyo, James O.; Calabre, Tiphaine; Broussard, Grant; Rivera-Soto, Ricardo; Chappell, Danielle L.; Baric, Ralph S.; Damania, Blossom; Miller, Melissa B.; Dittmer, Dirk P.
title: High-density amplicon sequencing identifies community spread and ongoing evolution of SARS-CoV-2 in the Southern United States
date: 2020-10-20
journal: Cell Rep
DOI: 10.1016/j.celrep.2020.108352
sha:
doc_id: 260048
cord_uid: yis26g81

SARS-CoV-2 is constantly evolving. Prior studies focused on high case-density locations, such as the U.S. Northern and Western metropolitan areas. This study demonstrates continued SARS-CoV-2 evolution in a suburban Southern U.S. region by high-density amplicon sequencing of symptomatic cases. 57% of strains carried the spike D614G variant, which was associated with higher genome copy numbers and whose prevalence expanded over time. Four strains carried a deletion in a predicted stem loop of the 3’ untranslated region. The data are consistent with community spread within local populations and the larger continental U.S. The data instill confidence in current testing sensitivity and validate “testing by sequencing” as an option to uncover cases, particularly non-standard COVID-19 clinical presentations. This study contributes to the understanding of COVID-19 through an extensive set of genomes from a non-urban setting and informs vaccine design by defining D614G as a dominant and emergent SARS-CoV-2 isolate in the U.S.

CoV and MERS-CoV (Lu, Roujian et al., 2020).
Residues at the receptor-binding site have evolved for better association with ACE2 compared to SARS-CoV (Wan, Yushun et al., 2020; Wrapp et al., 2020) and can be attributed to these molecular features: five of the residues critical for binding to ACE2 are different in SARS-CoV-2 as compared to SARS-CoV (Wan, Yushun et al., 2020; Wrapp et al., 2020) and a functional polybasic cleavage site (RRAR) is present at the S1/S2 boundary of the SARS-CoV-

Initial analyses of human SARS-CoV-2 genomes established three major variant types worldwide (Forster et al., 2020). Clade B was derived from A by a synonymous T8782C mutation in ORF1ab and a nonsynonymous C28144T mutation that changes a leucine to serine in ORF8 (Ceraolo and

To provide finer granularity about biological changes during SARS-CoV-2 transmission, we

southeastern U.S. from the start of the U.S. epidemic on March 3, 2020, until past the peak of the first major wave of infections. The first case in North Carolina (NC) was reported on March 26, 2020. The samples cover the period when community spread in NC was established, and when the state-wide stay-at-home order was issued (March 30 - May 8, 2020).

SARS-CoV-2 testing remains limited in many countries due to a shortage of personal protective equipment, testing kits, and diagnostic capacity. The Centers for Disease Control and Prevention (CDC) guidelines during the time of sampling prioritized patients with specific clinical symptoms (fever, cough, and shortness of breath) and curtailed testing to only a subset of all probable cases. Individuals not fitting the clinical criteria for testing, as well as asymptomatic individuals, were

distribution of 10x coverage for all samples is presented in Figure 1A. As expected, more mapped reads yielded higher coverage. Of the 33 negative controls, none had >10^2 total reads aligned.
Of the positive samples, more than 5 × 10^3 total mapped reads were needed to obtain 1x coverage of the whole genome, and a minimum of 3.1 × 10^4 reads were needed to obtain >90% coverage at 10x. The number of reads aligned varied with the viral load of the samples, as determined by real-time qPCR using CDC primer N1, but not with total RNA, as determined using RNAse P (Figure 1B). In this assay, any CP <35 for SARS-CoV-2 qPCR yielded reliable coverage, which increased linearly with viral load. At a CP ≥35 most positive samples still yielded reads that mapped to the target genome and thus allowed detection of SARS-CoV-2 sequences; however, the results were less consistent, and coverage was more variable. As expected, total RNA (measured by RNAse P) was not associated with sequencing coverage and varied considerably across samples, even though each sample used the same amount of virus transport medium (VTM). The coverage level distribution is shown in Figure 1C

a 50-fold range of input RNA; it was higher than RNA seq, except in the terminal regions that were not covered by PCR amplicons. In some cases, as little as five microliters of VTM from a single swab had sufficient virus to obtain a full-length viral genome sequence at 1000x. These data are consistent with the astonishingly high reported genome copy numbers of SARS-CoV-2 in some cases and demonstrate the suitability in principle of "testing by sequencing" as a diagnostic option for SARS-CoV-2 and other rapidly evolving viruses.

The average quality score per read is set to a minimal average phred score of 20

isolates, supported by multiple, independent junction-spanning reads (Figure 3 A, B). Junctions were mapped to single nucleotide resolution directly from individual reads. To confirm our deep-sequencing results, we performed 3' UTR site-specific amplification and Sanger-based sequencing (Figure 3 E-G).
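The idea of mapping a deletion junction at single-nucleotide resolution from individual reads, and requiring several junction-spanning reads in support, can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the function names, the `(position, CIGAR)` input format, the minimum deletion length, and the minimum read count are all assumptions made for illustration, and the coordinates in the usage example are made up.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def deletion_junctions(pos, cigar, min_len=5):
    """Return (first_deleted, last_deleted) 1-based reference coordinates
    for each deletion of at least min_len bases in one aligned read.

    pos is the 1-based leftmost reference position of the alignment.
    min_len is a hypothetical cutoff to ignore short indels.
    """
    ref = pos
    junctions = []
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "DN":            # gap in the read relative to the reference
            if n >= min_len:
                junctions.append((ref, ref + n - 1))
            ref += n
        elif op in "M=X":         # aligned bases consume the reference
            ref += n
        # I, S, H and P do not consume reference positions
    return junctions


def supported_junctions(alignments, min_reads=2, min_len=5):
    """Count reads per junction and keep junctions seen in >= min_reads reads.

    alignments is an iterable of (pos, cigar) pairs. The min_reads cutoff is
    an assumed stand-in for "multiple, independent" supporting reads.
    """
    counts = Counter()
    for pos, cigar in alignments:
        counts.update(deletion_junctions(pos, cigar, min_len))
    return {junction: n for junction, n in counts.items() if n >= min_reads}


# Usage with made-up alignments: three reads with different start positions
# span the same 10-base deletion, and one read has no deletion.
reads = [(100, "50M10D40M"), (120, "30M10D60M"), (90, "60M10D30M"), (100, "100M")]
print(supported_junctions(reads))
```

Because each read reports the junction from its own alignment, agreement between reads that start at different positions gives some assurance that the breakpoint is not an artifact of a single amplicon end. The sketch does not remove duplicate reads, which a real analysis would need to consider.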
The variant 3' end does not destroy overall folding but introduces a shorter stable hairpin (Figure 3 C, D). How this mutation affects viral fitness remains to be established.

In sum, this study generated exhaustive SNV information representing the introduction and spread of SARS-CoV-2 across a suburban low-density area in the Southern U.S. All samples were from symptomatic cases and the majority of genomes clustered with variants that predominate in the U.S. outbreak, rather than those in Europe or China. This supports the notion that the majority of U.S.

There seems to be partial overlap between the bulged stem-loop and the pseudoknot, suggesting that these two structures are mutually exclusive and may serve as a switch to regulate the ratio of full

GoTaq Promega Master mix (#M712C), 2.5µL primers (0.5µM), and brought to volume with nuclease-free water. The PCR was performed with an initial denaturation step at 95°C for 2 minutes, followed by 40 cycles of 95°C for 30 seconds, 56°C for 1 minute, and 73°C for 1 minute, then a final extension at 73°C for 10 minutes and a 4°C hold. The annealing temperature was derived from the primer pair melting temperatures. PCR products were visualized on 1.5% agarose gels and PCR bands were gel purified using the Qiagen Gel Purification Kit (Qiagen Inc.). Purified DNA was eluted in 30µL nuclease-free water and PCR products were cloned using TOPO-TA for

SNV were called using the CLC bio algorithm (Qiagen Inc.) for human genome SNV calling. The threshold for reporting was set at >90% frequency and a minimum coverage of 10-fold with balanced forward and reverse reads for all SNV.

Targeted regions were determined via the Thermo SARS-CoV-2 designed BED file, and sequences with 1x coverage across more than 99% of the 237 SARS-CoV-2 amplicons were considered complete sequences. Any sequences with 1x coverage between 5% and 99% were considered partial genomes.
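The reporting thresholds and completeness categories described above can be written down as a short filter. The following Python sketch is only an illustration and not the CLC bio workflow used in the study. The record fields, the function names, the numeric definition of "balanced" strands (`min_strand_frac=0.2`), and the "insufficient" label for genomes below 5% are assumptions, since the text does not specify them.

```python
from dataclasses import dataclass

N_AMPLICONS = 237  # number of SARS-CoV-2 amplicons stated in the text


@dataclass
class Snv:
    pos: int      # reference position of the variant
    freq: float   # variant allele frequency, between 0 and 1
    depth: int    # total reads covering the position
    fwd: int      # variant-supporting reads on the forward strand
    rev: int      # variant-supporting reads on the reverse strand


def is_reportable(snv, min_freq=0.90, min_depth=10, min_strand_frac=0.2):
    """Apply the stated thresholds: >90% frequency and at least 10-fold coverage.

    Strand balance is checked as the minority strand's share of supporting
    reads; the 0.2 cutoff is a hypothetical value chosen for this sketch.
    """
    if snv.freq <= min_freq or snv.depth < min_depth:
        return False
    supporting = snv.fwd + snv.rev
    if supporting == 0:
        return False
    return min(snv.fwd, snv.rev) / supporting >= min_strand_frac


def classify_genome(covered_amplicons, n_amplicons=N_AMPLICONS):
    """Classify a sequence by the fraction of amplicons with >= 1x coverage."""
    frac = covered_amplicons / n_amplicons
    if frac > 0.99:
        return "complete"
    if frac >= 0.05:
        return "partial"
    return "insufficient"  # assumed label; the text does not name this case


# Usage with made-up values.
print(is_reportable(Snv(pos=23403, freq=0.98, depth=500, fwd=240, rev=250)))
print(classify_genome(235), classify_genome(234), classify_genome(11))
```

With 237 amplicons, the >99% rule allows at most two amplicons to lack coverage before a genome is treated as partial.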
Partial genomes were included in the variant calling analysis but were not submitted to GenBank or GISAID. All consensus sequences derived from this study were manually curated to revert poly-

All positions with less than 95% site coverage were eliminated, i.e., fewer than 5% alignment gaps, missing data, and ambiguous bases were allowed at any position (partial deletion option). There were

Coronavirus Susceptibility to the Antiviral Remdesivir (GS-5734) Is Mediated by the Viral Polymerase and the Proofreading Exoribonuclease
The proximal origin of SARS-CoV-2
Presymptomatic SARS-CoV-2 Infections and Transmission in a Skilled Nursing Facility
SARS-CoV-2 Phylogenetic Analysis
SARS-CoV-2 viral spike G614 mutation exhibits higher case fatality rate
Covid-19 in Critically Ill Patients in the Seattle Region - Case Series
Amino acid variation analysis of surface spike glycoprotein at 614 in SARS-CoV-2 strains
Genomic variance of the 2019-nCoV coronavirus
Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study
Molecular evolution of the SARS coronavirus during the course of the SARS epidemic in China
Detection of 2019 novel coronavirus (2019-nCoV) by real-time RT-PCR
The species Severe acute respiratory syndrome-related coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2
Could the D614G substitution in the SARS-CoV-2 spike (S) protein be associated with higher COVID-19 mortality?
Coast-to-Coast Spread of SARS-CoV-2 during the Early Epidemic in the United States
Phylogenetic network analysis of SARS-CoV-2 genomes
Genomic epidemiology of hCoV-19
Characterization of the RNA components of a putative molecular switch in the 3' untranslated region of the murine coronavirus genome
A live, impaired-fidelity coronavirus vaccine protects in an aged, immunocompromised mouse model of lethal disease
Evaluation of a recombination-resistant coronavirus as a broadly applicable
Clinical Characteristics of Coronavirus Disease 2019 in China
Nextstrain: real-time tracking of pathogen evolution
Temporal dynamics in viral shedding and transmissibility of COVID-19
SARS-CoV-2 Transmission from Presymptomatic Meeting Attendee
Faster quantitative real-time PCR protocols may lose sensitivity and show increased variability
SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor
Evolutionary and structural analyses of SARS-CoV-2 D614G spike protein mutation now documented worldwide
Accurate sampling and deep sequencing of the HIV-1 protease gene using a Primer ID
Phylogenetic Analysis and Structural Modeling of SARS-CoV-2 Spike Protein Reveals an Evolutionary Distinct and Proteolytically Sensitive Activation Loop
MAFFT multiple sequence alignment software version 7: improvements in performance and usability
Infection and Rapid Transmission of SARS-CoV-2 in Ferrets
Tracking Changes in SARS-CoV-2 Spike: Evidence that D614G Increases Infectivity of the COVID-19 Virus
CoV-2 variants collected in Russia during the COVID-19 outbreak
MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms
Functional assessment of cell entry and receptor usage for SARS-CoV-2 and other lineage B betacoronaviruses
Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia.
The Impact of Mutations in SARS-CoV-2 Spike on Viral Infectivity and Antigenicity
Efficiency clustering for low-density microarrays and its application to QPCR
Antibody responses to SARS-CoV-2 in patients with COVID-19
Genomic Epidemiology of SARS-CoV-2 in Guangdong Province
Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
US CDC Real-Time Reverse Transcription PCR Panel for Detection of Severe Acute Respiratory Syndrome Coronavirus 2
Standardization of Sequencing Coverage Depth in NGS: Recommendation for Detection of Clonal and Subclonal Mutations in Cancer Diagnostics
The neighbor-joining method: a new method for reconstructing phylogenetic trees
Burden of respiratory viral infection in persons with human immunodeficiency virus. Influenza Other Respir Viruses
Structural basis of receptor recognition by SARS-CoV-2
GISAID: Global initiative on sharing all influenza data - from vision to reality
Whole genome and phylogenetic analysis of two SARS- additional clues on multiple introductions and further circulation in Europe
Prospects for inferring very large phylogenies by using the neighbor-joining method
Coronavirus Disease 2019 in Children - United States
Rapid reconstruction of SARS-CoV-2 using a synthetic genomics platform
Aerosol and Surface Stability of SARS-CoV-2 as Compared with SARS-CoV-1
Emergence of genomic diversity and recurrent mutations
An outbreak of severe Kawasaki CoV-2 epidemic: an observational cohort study
Antigenicity of the SARS-CoV-2 Spike Glycoprotein
Receptor Recognition by the Novel Coronavirus from Wuhan: an Analysis Based on Decade-Long Structural Studies of SARS Coronavirus
A phylogenetically conserved hairpin-type 3' untranslated region pseudoknot functions in coronavirus RNA replication
Virological assessment of hospitalized patients with COVID-2019
Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation
A new coronavirus associated with human respiratory disease in China
Factors associated with prolonged viral RNA shedding in patients with COVID-19
Characteristics of pediatric SARS-CoV-2 infection and potential evidence for persistent fecal viral shedding
Genetic cluster analysis of SARS-CoV-2 and the identification of those responsible for the major outbreaks in various countries
Quantitative Detection and Viral Load Analysis of SARS-CoV-2 in Infected Patients
Genomic characterization and phylogenetic analysis of SARS-COV-2 in Italy
Viral and host factors related to the clinical outcome of COVID-19
Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study
A pneumonia outbreak associated with a new coronavirus of probable bat origin
A Novel Coronavirus from Patients with Pneumonia in China
SARS-CoV-2 Viral Load in Upper Respiratory Specimens of Infected Patients
Genetic interactions between an essential 3' cis-acting RNA pseudoknot, replicase gene products, and the extreme 3' end of the mouse coronavirus genome