key: cord-0858983-ir5t6p4k
authors: Tang, C. Y.; Wang, Y.; Gao, C.; Smith, D. R.; McElroy, J. A.; Li, T.; Segovia, K.; Haynes, T.; Hammer, R.; Sampson, C.; Ritter, D.; Schulze, C.; Trotman, R.; Lidl, G. M.; Webby, R.; Hang, J.; Wan, X.-F.
title: Increased SAR-CoV-2 shedding associated with reduced disease severity despite continually emerging genetic variants
date: 2021-02-05
journal: nan
DOI: 10.1101/2021.02.03.21250928
sha: a30eb687431257ce3617935159fd3c999789ca25
doc_id: 858983
cord_uid: ir5t6p4k

Since the first report of SARS-CoV-2 in December 2019, genetic variants have continued to emerge, complicating strategies for mitigating the disease burden of COVID-19. In this study, we investigated the emergence and spread of SARS-CoV-2 genetic variants in Missouri, examined viral shedding over time, and analyzed the associations among emerging genetic variants, viral shedding, and disease severity. The study population included COVID-19 positive patients from CoxHealth (Springfield, Missouri) and University of Missouri Health Care (UMHC; Columbia, Missouri) between March and October 2020. All positive SARS-CoV-2 nasopharyngeal swabs (n=8,735) from March-October 2020 were collected. Available viral genomes (n=184) from March to July were sequenced. Hospitalization status and length of stay were extracted from medical charts of 1,335 patients (UMHC and sequenced patients). The primary outcome was hospitalization status (yes or no) and length of hospital stay (days). For the 1,335 individuals, 44 were hospitalized and four died due to COVID-19. The average age was 34.35 (SD=16.82), with 55.1% females (n=735) and 44.7% males (n=596). Multiple introductions of SARS-CoV-2 into Missouri, primarily from Australia, Europe, and domestic states, were observed. Four local lineages rapidly emerged and spread across urban and rural regions in Missouri.
While most Missouri viruses harbored Spike-D614G mutations, many unreported mutations were identified among Missouri viruses, including seven in the RNA-dependent RNA polymerase complex and Spike protein that were positively selected. A 15.6-fold increase in viral RNA levels in swab samples occurred from March to May, and levels remained elevated through October. Accounting for comorbidities, individuals who tested positive for COVID-19 with high viral loads were less likely to be hospitalized (odds ratio=0.39, 95% confidence interval=0.20, 0.77) and more likely to be discharged from the hospital sooner (hazard ratio=2.9, p=0.03) than those with low viral loads. Overall, the first eight months of the pandemic in Missouri saw multiple locally acquired mutants emerge and dominate in urban and rural locations. Although we were unable to find associations between specific variants and greater disease severity, Missouri COVID-positive individuals that presented with increased viral shedding had less severe disease by several measures.

Phylogenetic and phylogeographic analyses: To identify likely seeding viruses for Missouri outbreaks, all 110,901 SARS-CoV-2 complete genomes available from the Global Initiative on Sharing All Influenza Data (GISAID) consortium 18 (September 20, 2020) were downloaded. The 297 genetically closest sequences to Missouri samples were selected using an alignment-free complete composition vector algorithm [19] [20] [21] [22] [23] [24]. Sequence alignments were performed using MUSCLE 25, phylogenetic analyses using BEAST2 26,27, positive selection analyses through PAML 28, and sequence conservation visualization with SimPlot 29 (eMethods). Lineages were identified by PANGOLIN v2.0.8 (github.com/cov-lineages/pangolin). Unique Missouri sub-lineages were identified when they contained at least five samples with unpublished mutations, posterior probability >0.99, and sequence identity >99%.
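The three sub-lineage thresholds stated above (at least five samples with unpublished mutations, posterior probability >0.99, sequence identity >99%) can be written as a simple filter. The following is a minimal sketch only: the `CladeSummary` record, its field names, and the example values are hypothetical and are not taken from the study's pipeline, which relied on BEAST2 and PANGOLIN output.

```python
from dataclasses import dataclass


@dataclass
class CladeSummary:
    # Hypothetical record; field names are illustrative, not from the paper.
    name: str
    n_samples: int                # samples carrying previously unpublished mutations
    posterior_probability: float  # clade support from the Bayesian tree
    sequence_identity_pct: float  # within-clade sequence identity, in percent


def is_unique_sublineage(clade: CladeSummary) -> bool:
    """Apply the three thresholds stated in the Methods text."""
    return (
        clade.n_samples >= 5
        and clade.posterior_probability > 0.99
        and clade.sequence_identity_pct > 99.0
    )


# Made-up example values, for illustration only.
candidates = [
    CladeSummary("clade_A", 11, 0.995, 99.99),
    CladeSummary("clade_B", 4, 0.999, 99.98),  # too few samples
    CladeSummary("clade_C", 8, 0.97, 99.95),   # posterior support too low
]
accepted = [c.name for c in candidates if is_unique_sublineage(c)]
print(accepted)
```

All three conditions must hold at once, so a well-supported clade with only four samples would not be called a sub-lineage under this reading of the criteria.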
Statistical analyses: Kruskal-Wallis tests were used to analyze continuous variables and Fisher exact tests for categorical variables. Logistic regression models were used to assess the effect of viral load (Ct≤20, high; Ct>20, low) on hospitalization or length of hospital stay, controlling for demographics and comorbidities. The effect of viral load on length of hospitalization was tested with a Cox Proportional Hazards survival analysis accounting for censoring due to death. In all analyses, significance was defined at alpha=0.05. Analyses were performed using SAS Studio v3.8 (Cary, North Carolina).

[...] Figure 1A). Meanwhile, the weekly case fatality rate peaked at 12% in early May, then progressively decreased and remained below 0.02% through October. To study whether changes in viral shedding correlate with increased positivity rates, we analyzed the [...]

Characteristics of Study Population. We conducted chart reviews for 1,335 individuals with COVID-19 from March-October 2020 (Table 1). The largest age group was 18-29 years (n=562 of 1335, 42.13%). Our dataset consisted of 55.10% female (n=735) and 44 [...]

The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license. This version posted February 5, 2021.

Phylogenetic analysis showed that Missouri viruses encompassed eight major PANGOLIN lineages, each of which was associated with an independent introduction (Figure 2, eTable 2). The virus evolved after each initial introduction and formed four unique Missouri sub-lineages, at least one (MO-B.1.1.b) of which circulated from the May through July collection periods. All sub-lineages were supported by posterior probabilities >0.99 and sequence identities >99.99% (Figure 2A). At least five lineages, including the four Missouri sub-lineages, co-circulated in Missouri during the week of July 2.
All five lineages originated from lineage B.1 (containing D614G in the spike [S] protein), which has been predominant in Europe, Australia, and multiple states of the United States (eFigure 2-3).

Molecular characterization identified mutations across multiple regions of the viral genome (Figure 2B). The previously reported S-D614G and nonstructural protein [NSP]12-P314L mutations appeared in most of the Missouri samples. Compared to their precursor viruses, the Missouri viruses had 126 new mutations (eTable 3). NSP3 contained the most mutations (28 of 126 distinct mutations), followed by S (13), Nucleocapsid (N) (11), and NSP2 (11). The most common mutations include NSP12-C22F (n=18), NSP4-M366I (18), open reading frame [ORF]8-S47F (12), NSP12-A2V (11), and N-V270L (10). Six unique mutations (Figure 2B, eFigure 2) were detected in four sub-lineages that appear to have emerged and spread in the Southwestern region of the state. MO-B.1.1.a (n=11 sequences) had mutation NSP12-A2V; 10 of these viruses were from Springfield, Missouri, collected between July 2-9; MO-B.1.1.b (n=18) contained NSP4-M366I and NSP12-C22F from patients living within 60 miles of Springfield between May 14-July 9, 2020; MO-B.1.c (n=6) with NSP3-N1178T and NSP3-A1179T includes six samples with all but one arising from Monett, an urban center in Southwestern Missouri (50 miles southwest from Springfield), between July 2-6, 2020; MO-B.1.2.d (n=6) with NSP15-P262L were from Springfield or Brighton, Missouri (20 miles north of Springfield) between July 5-7, 2020.

To explore how SARS-CoV-2 viruses were adapting in Missouri, we determined variant sites undergoing selective pressure (Figure 2B-C and eTable 4). Multiple sites along the S protein, the protein mediating host receptor binding and viral entry, and the RNA-dependent RNA polymerase (RdRp) complex (NSP7, NSP8, NSP12, and NSP13) had evidence of positive selection (Figure 3B).
Seven of these positively selected sites were unique to Missouri isolates (i.e., D1163Y in Spike, K36Q and T145I in NSP8, and T172I, T431I, K460R, and S468L in NSP13). NSP8 is a cofactor to NSP12 (RdRp) and is necessary for RNA synthesis, NSP12 functions in viral replication and transcription, and NSP13 works with NSP12 in replication and mRNA capping 31.

Of the selected samples, we collected multiple samples at different time points for four patients and found one case of reinfection (eFigure 4). This patient was a female in her 20s with asthma, obesity, anxiety, and depression, who reported chills, sore throat, dizziness, rhinorrhea, and fever during her initial positive COVID-19 test in March 2020. She was discharged and instructed to self-isolate. After two weeks, her symptoms had waned to encompass only cough and fatigue, and she was tested again due to her return-to-work requirements, with another positive result. Interestingly, the two samples were from two distinct SARS-CoV-2 lineages; the first sample belonged to [...]

[...] properties. Eighteen isolates selected as representative sample strains from each viral load category and lineage were recovered from swab samples with Ct-values ranging from 9.09-37.76 (median=16.32) (Figure 3A). We determined growth kinetics for these isolates (Figure 3B) and observed a large diversity in viral growth patterns, especially during the initial 24 hours.
Linear regression of viral proliferation (measured by Ct-values) at 24 hours showed that an increase of 1 cycle correlated with a decrease of 8.83 log10 (plaque forming units/mL) (Pearson correlation coefficient=-0.59, p=0.01) (Figure 3C). Taken together, the growth kinetics analyses revealed that strains with higher viral loads in clinical samples proliferated more efficiently than those with lower viral loads.

[...].03) were more likely to be hospitalized (eTable 5, Figure 4C). Importantly, patients with high viral loads at sampling had fewer hospitalizations (OR=0.39, p=0.01). Of hospitalized patients, those with high viral load were discharged sooner (hazard ratio=2.9, p=0.03) (Figure 4B) compared to patients with low viral loads. There was no difference in time from symptom onset to COVID-19 test between low and high viral load groups (5.38±10.48 days and 4.25±4.43 days, respectively) or in time from symptom onset to hospital admission (10.03±10.48 days and 8.75±4.09 days, respectively).

Studies assessing SARS-CoV-2 viral load as a marker for COVID-19 disease severity have been inconclusive. Early studies found that viral loads were correlated with age, disease stage, severity, progression, and mortality 32-37. Most of these studies observed patients during the early stages of the pandemic and often investigated already hospitalized patients.
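As a reading aid for the hospitalization result, the reported odds ratio (0.39) and the 95% confidence interval given in the abstract (0.20, 0.77) can be checked for internal consistency with the reported p-value. The sketch below assumes the interval is a symmetric Wald interval on the log-odds scale; that is an assumption made here for illustration, since the text does not state how the interval was computed. The function name `wald_from_ci` is hypothetical.

```python
import math


def wald_from_ci(estimate: float, lower: float, upper: float, z_crit: float = 1.959964):
    """Recover the log-scale standard error, z statistic, and two-sided
    p-value from a ratio estimate and its 95% CI.

    Assumes a symmetric Wald interval on the log scale (an assumption,
    not something stated in the paper).
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(estimate) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


se, z, p = wald_from_ci(0.39, 0.20, 0.77)
print(f"SE(log OR)={se:.3f}, z={z:.2f}, p={p:.4f}")
```

Under that assumption the implied two-sided p-value is about 0.006, which is compatible with the reported p=0.01 after rounding.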
Our study expanded these observations to thousands of patients over an eight-month period, March-October 2020, and was powered to detect an increase in the average viral load over the study period. Furthermore, patients with high viral loads were less likely to be hospitalized than patients with low viral loads even after adjusting for month of diagnosis, age, obesity, and diabetes. Thus, although advances in treatment have improved patient outcomes, this was unlikely to be the only cause of reduced hospitalizations. Our results did suggest that heart disease and hypertension were confounding variables that were not associated with hospitalization after accounting for other variables in the model. Additionally, patients with high viral loads were more likely to be discharged sooner than those with low viral loads. Because viral load in nasopharyngeal swabs typically declines after the first week of infection 38,39, one confounding factor may have been delayed testing, especially at the beginning of the pandemic, owing to limited access to testing. We did not, however, find differences in time from symptom onset to initial COVID-19 swab between the high and low viral load groups among hospitalized patients. The clinical findings and demographics from our study are reflective of national data, lending confidence to the generalizability of our patient population 1. Exceptions include the distribution of cases within the 18-29-year age category, where we noted 50% higher proportional incidence than CDC data; correspondingly, our >50 age categories were slightly underrepresented relative to national data. This reflects the catchment area of our study, which includes the University of Missouri with a high proportion of college-aged students. Additionally, our population had slightly fewer Asian, more Black or African American and White, and over 50% fewer Hispanic or Latino individuals than national data, reflecting the overall racial and ethnic distribution in Missouri.
We also found that other risk factors for hospitalization included older age, elevated BMI, and diabetes mellitus, consistent with published studies 40,41.

The N501Y strain (B.1.1.7 lineage) was detected first in the United Kingdom 10 and then appeared in other countries 9 including the United States 11. D614G mutations enhanced viral replication and transmission but not pathogenesis in laboratory settings 43,44. However, disease transmission and severity of these novel variants in humans remain unclear 45. Prior studies suggested that D614G is associated with higher viral load 14,46, but the mutation was already predominant in both high and low viral load Missouri strains. In our findings, viral load increased over time (Figure 1). We explored whether a particular genetic variant was associated with the increased viral loads and were unable to find a clear association. Thus, we speculate that throughout the pandemic, all emerging variants of the virus were adapting to human populations with greater viral replication efficiency. Of interest, multiple sites, especially in the Spike protein and RdRp complex, across multiple Missouri sub-lineages were under positive selection (Figure 3). Selection at the RdRp complex may affect viral replication and transcription 47,48, while selection at the S protein may affect host receptor binding and viral entry 47. Further examination of mutations from this and other studies will elucidate the phenotypic effects of these mutations.
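To put the viral load increase discussed above in terms of RT-PCR cycle thresholds, the sketch below converts between a fold change in template and a shift in Ct. It assumes idealized amplification in which template doubles every cycle (efficiency = 1.0); that assumption, and the function names, are introduced here for illustration, and the paper does not state how its 15.6-fold figure was derived.

```python
import math


def delta_ct_to_fold(delta_ct: float, efficiency: float = 1.0) -> float:
    """Fold change in template implied by a Ct decrease of `delta_ct` cycles.

    efficiency=1.0 means perfect doubling per cycle (idealized assumption).
    """
    return (1.0 + efficiency) ** delta_ct


def fold_to_delta_ct(fold: float, efficiency: float = 1.0) -> float:
    """Ct decrease (in cycles) implied by a given fold increase in template."""
    return math.log(fold) / math.log(1.0 + efficiency)


# The abstract reports a 15.6-fold increase in viral RNA from March to May.
shift = fold_to_delta_ct(15.6)
print(f"15.6-fold corresponds to a Ct shift of about {shift:.2f} cycles")
```

Under perfect doubling, a 15.6-fold increase corresponds to a Ct that is just under four cycles lower; a lower assumed efficiency would imply a larger Ct shift for the same fold change.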
A growing number of case studies of reinfection and studies involving waning neutralizing antibody (NAb) titers raise concerns about herd immunity and the long-lasting efficacy of vaccines [49] [50] [51]. CDC criteria for SARS-CoV-2 reinfection include persons with paired respiratory specimens at least 90 days apart and symptomatic persons 45-89 days after initial illness with respective respiratory specimens showing differing lineages 52. Recent studies show NAb titers to SARS-CoV-2 decline as early as 23 days following initial infection 53. In this study, a young female patient was identified with two genetically distinct SARS-CoV-2 strains within two weeks, suggesting that reinfection can occur within a much shorter period than expected.

There are several limitations to this study. Analyses with viral loads are limited by variability in nasopharyngeal swabbing techniques, which may cause inconsistencies in Ct-values, although the same sampling and processing protocol was used throughout this study. Additionally, during the initial phases of the pandemic, testing was generally limited to patients with more severe symptoms, potentially skewing viral load findings. Despite these limitations, we analyzed a large, representative sample and adjusted for these confounders.

In summary, multiple novel lineages were identified, and locally acquired mutations, present at both the urban and rural levels, remained predominant in the community. Although we were unable to find associations between specific variants and greater disease severity, Missouri COVID-positive individuals that presented with increased viral shedding had less severe disease by several measures. Continued monitoring of the impacts of these novel variants, particularly of those in the regions of vaccine targets, will be essential to the management of this pandemic.
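The reinfection criteria summarized in the Discussion (paired specimens at least 90 days apart, or symptomatic persons 45-89 days after initial illness with specimens showing differing lineages) can be expressed as a small decision function. This is a simplified reading of the one-sentence summary in the text, not the full CDC investigative protocol; the function name, its return labels, and the 14-day value used for the study patient ("after two weeks") are assumptions made for illustration.

```python
def reinfection_window(days_between: int, symptomatic: bool, lineages_differ: bool) -> str:
    """Classify a pair of positive specimens against the two time windows
    described in the text. Simplified, hypothetical reading of the criteria."""
    if days_between >= 90:
        return "window_90_plus"
    if 45 <= days_between <= 89 and symptomatic and lineages_differ:
        return "window_45_89"
    return "outside_windows"


# The study patient: second positive test roughly two weeks after the first
# (14 days is an approximation), still symptomatic, with two distinct lineages.
print(reinfection_window(14, symptomatic=True, lineages_differ=True))
```

Under this reading the study patient falls outside both windows, which is why the authors describe the interval as much shorter than expected.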
Author contributions: XFW conceived this study; XFW, JAM, and CYT designed the analysis; CYT, YW, KS, TL, TH, and CS collected the data; DRS, RH, DR, CS, RT, GML, JH, and XFW contributed data or analysis tools; CYT, YW, CG, DRS, and XFW performed the analysis; CYT and XFW wrote the paper; RW, DRS, JAM, TH, CYT, and XFW revised the paper. XFW and CYT had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

Genome data may be accessed using GenBank Accession Numbers: MW004168, MW521383-MW521516, MW525282.
[...] adjusted probabilities denoted with standard error bars. A significant regression was found (p<0.0001), where age is categorized as above 65 or below 65 years, obesity coded as Yes=1, No=0, month of initial presentation in month number (i.e., March = 3), diabetes coded as Yes=1, No=0, and viral load coded as High (Ct≤20) or Low (Ct>20). Age, body mass index, month of initial presentation, diabetes, and viral load are significant predictors of hospitalization. *, p<0.05. D) Forest plot of univariate logistic regressions for categorical factors tested for COVID-19 hospitalization. The panel on the right-hand side is adjusted for month of initial diagnosis. Log10 of odds ratios (OR) and 95% confidence intervals (CI) are reported. ^, Obesity is out of 923 patients with available body mass index data. Intervals overlapping the value of 0 are considered not significant. *, p<0.05.

COVID-19 as the Leading Cause of Death in the United States
The 2019 novel coronavirus resource
Variant analysis of SARS-CoV-2 genomes
Emergence of genomic diversity and recurrent mutations in SARS-CoV-2.
A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology
Positive Selection of ORF1ab, ORF3a, and ORF8 Genes Drives the Early Evolutionary Trends of SARS-CoV-2 During the 2020 COVID-19 Pandemic
Computational Inference of Selection Underlying the Evolution of the Novel Coronavirus, Severe Acute Respiratory Syndrome Coronavirus 2
Emergence and rapid spread of a new severe acute respiratory syndrome-related coronavirus 2 (SARS-CoV-2) lineage with multiple spike mutations in South Africa
Early empirical assessment of the N501Y mutant strains of SARS-CoV-2 in the United Kingdom
Distinct Patterns of Emergence of SARS-CoV-2 Spike Variants including N501Y in Clinical Samples in Columbus Ohio
Spike mutation D614G alters SARS-CoV-2 fitness
Structural and Functional Analysis of the D614G SARS-CoV-2 Spike Protein Variant
Tracking Changes in SARS-CoV-2 Spike: Evidence that D614G Increases Infectivity of the COVID-19 Virus
ShowMeStrong Recovery Plan: Public Health Dashboard. State of Missouri
US CDC Real-Time Reverse Transcription PCR Panel for Detection of Severe Acute Respiratory Syndrome Coronavirus 2
Rapid High Throughput Whole Genome Sequencing of SARS-CoV-2 by using One-step RT-PCR Amplification with Integrated Microfluidic System and Next-Gen Sequencing
Global initiative on sharing all influenza data - from vision to reality
A quantitative genotype algorithm reflecting H5N1 Avian influenza niches
Ubiquitous reassortments in influenza A viruses
Risk factors for severity and mortality in adult COVID-19 inpatients in Wuhan
SARS-CoV-2 Variants. World Health Organization
SARS-CoV-2 spike-protein D614G mutation increases virion spike density and infectivity
SARS-CoV-2 D614G variant exhibits efficient replication ex vivo and transmission in vivo
Evaluating the Effects of SARS-CoV-2 Spike Mutation D614G on Transmissibility and Pathogenicity
A clade of SARS-CoV-2 viruses associated with lower viral loads in patient upper airways
Coronavirus Disease 2019-COVID-19
RNA-dependent RNA polymerase: Structure, mechanism, and drug discovery for COVID-19
Reinfection with SARS-CoV-2: Implications for Vaccines
Longitudinal analysis of serology and neutralizing antibody levels in COVID19 convalescents
Dynamics of neutralizing antibody titers in the months after SARS-CoV-2 infection
Investigative Criteria for Suspected Cases of SARS-CoV-2 Reinfection (ICR)
Longitudinal observation and decline of neutralizing antibody responses in the three months following SARS-CoV-2 infection in humans

[Figure labels: B.1.1 (99.97%); MO-B.1.1.b* (99.99%)]