key: cord-0805472-5t1d9ckk authors: Chan, Ernest R.; Jones, Lucas D.; Linger, Marlin; Kovach, Jeffrey D.; Torres-Teran, Maria M.; Wertz, Audric; Donskey, Curtis J.; Zimmerman, Peter A. title: COVID-19 Infection and Transmission Includes Complex Sequence Diversity date: 2022-04-19 journal: bioRxiv DOI: 10.1101/2022.04.18.488717 sha: f3c7cf24006794ef87f37824563e9fc27afb8991 doc_id: 805472 cord_uid: 5t1d9ckk SARS-CoV-2 whole genome sequencing has played an important role in documenting the emergence of polymorphisms in the viral genome and its continuing evolution during the COVID-19 pandemic. Here we present data from over 360 patients to characterize the complex sequence diversity of individual infections identified during multiple variant surges (e.g., Alpha and Delta; requiring ≥ 80% genome coverage and ≥100X read depth). Across our survey, we observed significantly increasing SARS-CoV-2 sequence diversity during the pandemic and frequent occurrence of multiple biallelic sequence polymorphisms in all infections. This sequence polymorphism shows that SARS-CoV-2 infections are heterogeneous mixtures. Convention for reporting microbial pathogens guides investigators to report a majority consensus sequence. In our study, we found that this approach would under-report at least 79% of the observed sequence variation. As we find that this sequence heterogeneity is efficiently transmitted from donors to recipients, our findings illustrate that infection complexity must be monitored and reported more completely to understand SARS-CoV-2 infection and transmission dynamics involving both immunocompetent and immunocompromised patients. Many of the nucleotide changes that would not be reported in a majority consensus sequence have now been observed as lineage defining SNPs in Omicron BA.1 and/or BA.2 variants. This suggests that minority alleles in earlier SARS-CoV-2 infections may play an important role in the continuing evolution of new variants of concern. 
AUTHOR SUMMARY
Evolution of the virus causing COVID-19 (SARS-CoV-2) has been associated with significant transmission surges. With evolution of SARS-CoV-2, evidence has accumulated regarding increased transmissibility of lineages, varying severity of illness, and evasion of vaccines and diagnostic tests. Continuous tracking of SARS-CoV-2 lineage evolution distills very large and complex viral sequence data sets down to consensus sequences that report the majority nucleotide at each of over 29,000 positions in the SARS-CoV-2 genome. We observe that this eliminates considerable sequence variation and leads to a significant underestimation of SARS-CoV-2 infection diversity and transmission complexity. Additionally, concentration on the majority consensus sequence diverts attention from genetic variation that may contribute significantly to the continuing evolution of the COVID-19 pandemic.
Early investigation into the outbreak of the novel pneumonia of unknown cause in the city of Wuhan in late December 2019 quickly identified the genomic sequence [1, 2] of a coronavirus closely related to SARS-CoV [3, 4]. The virus, SARS-CoV-2, is now understood to be the causative agent of COVID-19. Sequence variation across the ~30 kb positive-strand RNA SARS-CoV-2 genome was reported in the earliest studies, including single nucleotide polymorphisms (SNPs) and insertions and deletions (indels) [1, 5, 6]. From earlier studies of the SARS-CoV-1 virus, it was suggested that the error-correcting capacity of the RNA-dependent RNA polymerase (RdRP) conferred by nonstructural protein 14 (NSP14) [13, 14] would contribute to replication error rates significantly lower than those of other RNA viruses (e.g., HIV-1 and influenza virus) [15, 16]. Studies comparing SARS-CoV-1 and SARS-CoV-2 estimate mutation rates that range from 1-3 × 10^−3 substitutions/site/year [8, 17-19] (extrapolating to 30-90 mutations/genome/year), which is consistent with this prediction.
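As a quick check of the extrapolation quoted above, multiplying the per-site rate by an assumed genome length of about 30,000 nucleotides (the ~30 kb figure given earlier; the exact reference length is slightly shorter) reproduces the stated range. The rounding to 3 × 10^4 sites is an assumption made here for illustration only.

```latex
\[
\underbrace{(1\text{--}3)\times 10^{-3}}_{\text{substitutions/site/year}}
\;\times\;
\underbrace{3\times 10^{4}}_{\text{sites (approximate genome length)}}
\;\approx\; 30\text{--}90\ \text{substitutions/genome/year}
\]
```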
Data provided through the Global Initiative on Sharing All Influenza Data (GISAID) and NextStrain provide a continuously updating report on SARS-CoV-2 lineage classification [20, 21] and illustrate that an apparently low mutation rate is no guarantee that the virus will exhibit limited capacity for variation, particularly once it has become so widely dispersed through millions of daily infections [6, 22, 23]. Data analyzed for lineage tracking rely on extensive curation and validation of SARS-CoV-2 genomic data. This includes minimization and resolution of ambiguous sequence data. In fact, most sequence data used to monitor the COVID-19 pandemic distill the whole genome sequence generated from an infection sample to a single consensus sequence (the majority nucleotide present at each genomic position) [24-27] for submission to national and international databases [28, 29]. Out of the wide range of SARS-CoV-2 genomics studies, sequence variation within infected individuals has been described in relatively few studies as intra-host single nucleotide variants (iSNVs) [30-38]. Outcomes of these studies appear to be mixed in their assessments of individual infection diversity and transmission outcomes, but they inevitably reveal and acknowledge that these variations within a single infection exist.
Here we evaluate SARS-CoV-2 whole genome data from a sample of patients and administrative and medical staff experiencing COVID-19 in the VA Northeast Ohio Healthcare System. We have applied stringent inclusion criteria based on SARS-CoV-2 genome coverage and read depth to normalize comparisons across sequences derived independently from individual patient isolates as well as patients who have had direct prolonged contact with other infected individuals. We examine multiple perspectives, including details of individual infection diversity and transmission outcomes, which are critical for the successful management of the COVID-19 pandemic.
A total of 364 SARS-CoV-2 full genome sequences were analyzed in comparison to the Wuhan reference genome (GenBank NC_045512). A total of 254 samples from patients (N=179) and healthcare personnel (N=75) with COVID-19 were collected to (a) perform high-resolution contact tracing of COVID-19 infections and (b) gain insight into SARS-CoV-2 transmission dynamics between donor and recipient individuals [36-38]. 1a; Mann-Whitney U = 1,971.5, P < 0.0001). Coverage across the SARS-CoV-2 genome for samples with sequence generated above (black line) and below (gray line) the coverage threshold is summarized in Fig. 1b, and the direct correlation between Ct and average coverage for each individual sample is provided in Fig. 1c. Among the 140 sequences meeting our technical thresholds (two samples did not have RT-PCR data), the average coverage was 1,081X (range 223 to 2,235X) with an average of 279,408 uniquely mapped reads per sample (min=63,296, max=2,680,239). Further examination of our data showed that the ability to detect SNPs was not influenced by Ct or average coverage if a minimum coverage was reached (Figs. 1d and 1e). Samples that passed our inclusion criteria, ≥80% of the genome covered at ≥100X read depth, showed comparable numbers of SNPs defined by alternate allele frequency (AAF) > 5% (filled black circles) or SNPs defined by AAF > 50% (filled diamonds), regardless of Ct value. As expected, fewer SNPs were detected in samples that did not meet our inclusion criteria (open circles and diamonds). With read depth of ≥100X, we observed consistency in detecting expected sequence variations within our sampled population and across genomic sequences reported into the international databases. From these analyses we established confidence in detecting minor alleles at frequencies ≥5% (whether reference or alternate allele).
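The inclusion rule described above (at least 80% of genome positions covered at 100X or more) can be sketched as a small filter over per-position read depths. This is a minimal illustration, not the pipeline used in the study: the function names, the list-of-depths input, and the use of "greater than or equal to" for both thresholds are assumptions made here.

```python
from typing import Sequence


def fraction_covered(depths: Sequence[int], min_depth: int = 100) -> float:
    """Fraction of genome positions whose read depth is at least min_depth."""
    if not depths:
        return 0.0
    covered = sum(1 for depth in depths if depth >= min_depth)
    return covered / len(depths)


def passes_inclusion(depths: Sequence[int],
                     min_depth: int = 100,
                     min_fraction: float = 0.80) -> bool:
    """True if the sample meets the coverage-based inclusion criteria.

    depths holds one read-depth value per reference position
    (hypothetical input; e.g. parsed from a per-base depth report).
    """
    return fraction_covered(depths, min_depth) >= min_fraction


# Toy 10-position "genomes" (illustrative values only).
good_sample = [150] * 8 + [10] * 2   # 80% of positions at >= 100X
poor_sample = [150] * 7 + [10] * 3   # 70% of positions at >= 100X

print(passes_inclusion(good_sample))
print(passes_inclusion(poor_sample))
```

Keeping the two thresholds as parameters makes it easy to see how many samples would be gained or lost under a looser or stricter rule.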
For the 114 samples not meeting our criteria for inclusion, the average coverage was 71X (range 0 to 1,281X) and these samples were removed from subsequent analyses. A comparison of the genome sequence data across the total sampling period of our survey showed a significant increase in the number of SNPs detected over time (Fig. 1f or tripling (10 to 35 AAF ≥50%) from April 2020 to August 2021. The two samples demonstrating substantially higher SNP counts (Figs. 1d-1f, red and green arrows) will be covered in their epidemiological context below. Enumeration of SNPs across the SARS-CoV-2 genes and a full listing of all SNPs observed in our study are presented in Supplemental Table 3 ) that met the coverage thresholds, revealed a total of 1,096 SNV positions. Among these polymorphisms, 406 SNPs were observed in multiple infection samples. The variant positions across the SARS-CoV-2 genomic sequences detected with allele frequencies >5% reveal extensive biallelic polymorphism throughout our results (Fig. 2 and Fig. 3). We first examine these data by assessing alternate vs reference allele frequency proportions (AAF:RAF) for the 75 SNV positions that were shared most commonly (in ≥10 samples) (Fig. 2). The red dashed line in the figure at 50% allele frequency is the commonly accepted cutoff for variant detection, indicating the major allele. Only those alternate allelic SNPs with frequencies above 50% would be reported as part of the consensus sequence [30-38]. To determine how often biallelic variations were included in SARS-CoV-2 sequences submitted to international repositories, we randomly selected and analyzed 110 additional COVID-19 infections (Supplemental Table 4) reported from 18 different laboratories or consortia with raw sequence data available in the sequence read archive (SRA) [39]; the same ≥80% coverage, ≥100X read-depth inclusion threshold was applied. Results summarized in Supplemental Fig.
1 confirmed that all SRA files from the 110 external infections showed evidence of multiple biallelic iSNVs that would reflect the complexity of the SARS-CoV-2 infection (nucleotide positions characterized by biallelic iSNVs in our VA study population and the samples queried in the SRA are designated by a green dot in Following the consensus sequence reporting convention across the 87 positions, 36% of the SARS-CoV-2 variations with AAFs between 5-50% (369 iSNVs) would not have been reported across the infections studied. Given that all SRA data queried showed this complexity, we asked whether attempts to report biallelic sequence variation were being made through the use of standard ambiguity codes [40] used to identify heterozygosity in diploid genetic systems. We first observed that the international data repositories understandably discourage the use of ambiguity codes, given the substantial effort involved in curating sequences and determining lineage relationships as the pandemic continues to emerge. When we queried the database of to August 1, 2020; September 28 to December 9, 2020; December 10, 2020 to April 13, 2021; April 15 to August 18, 2021), corresponding to breaks in sample availability (Table 1). In addition to the accumulation of increased polymorphism across the viral genome over time, closer examination of the genomic variations in the summary zebra plot (so-called due to the gray-scale patterns; Fig. 3a) revealed significant pattern shifts in the sequences that characterize the major variants in our sample population that correspond to epidemiological transition periods. Following the sample collection time series (Fig. 3a In evaluation of their genomic sequence data, variants are observed with AAFs > 5% but not reaching major consensus that are consistent with B.1.617.2-specific SNPs (Fig. 3b).
Following conventional practices, the major variants (with frequencies above 50% allele frequency cutoff) identified for these infections would have been reported as B.1.1.7 and not a mixed strain infection of B.1.1.7/B.1.617.2. Observation of these minor alleles above the 5% cutoff demonstrates early detection of the next major variant of concern (Fig. 1f) . A final summary of the distribution of SNVs observed in our study population is presented in Fig. 4 . The data show that variation is significantly concentrated among the genes encoding structural proteins. This is particularly true for positions characterized by biallelic variation where AAF may not rise above 50% (Black components of stacked histogram bars); much of which is observed in the SARS-CoV-2 spike protein. Examination of the total infection sequence composition has also been considered in our study of infection transmission complexity from multiple exposure events. We performed bottleneck assessment using the exact beta-binomial method of Sobel-Leonard et al. [41] , that recognizes AAF and RAF proportions instead of treating nucleotide sequence variation on a presence/absence basis taken by alternative models [42, 43] . These analyses focused on 11 donor-recipient pairs, some that have been studied previously [36] [37] [38] . While donors were known from transmission events that occurred during commuter van transport events, we performed reciprocal calculations (A to B and B to A) for all transmission pairs. Results from our overall analyses show the maximum likelihood estimate (MLE) of the bottlenecks among our transmission pairs was between 6 to 123 virions (Fig. 5a) . These results include transmission events that occurred toward the end of pandemic Year-1 (December 2020 and January 2021) with a viral population characterized by greater genetic diversity compared to the first quarter of 2020 (Fig. 1f) . Paired SNP proportions in the "transmission plots" (Figs. 
5b-5d) show the AAF for each position in the SARS-CoV-2 genome between the respective infected donor (left) and subsequent recipient (right). In these graphs, most of the nucleotide-specific data are at or below the 5% alternate allele threshold and therefore would be designated the reference allele.
Our study has revealed significant within-host infection diversity of SARS-CoV-2 through full genome sequence analysis. We greatly appreciate the work of international data repositories such as GISAID and NextStrain for curating the millions of reported infections and believe that these repositories are a tremendous and invaluable asset for the monitoring of SARS-CoV-2. The sheer amount of data that needs to be collected and annotated makes it understandable and perhaps necessary to represent these infections by a single consensus sequence. While the practice of reporting a single unambiguous sequence assists with tracking SARS-CoV-2 in regard to evolution and global distribution of lineages, emphasis on reporting the majority nucleotide signature significantly reduces our ability to detect emerging new variants that could ignite a new surge with public health impact. Additionally, the majority consensus approach limits our ability to understand the complexity of infections, dynamic changes in the viral population over time within infected individuals, and between donor and recipient individuals across transmission events. In this study, we addressed the majority consensus problem as it relates to SARS-CoV-2 infection through the use of standard ambiguity codes [40]. All consensus sequences reported to GenBank from this study have included ambiguity codes. Taking this approach, our study accurately identified two individuals who harbored mixed B. [49]. Confirming this potential will require consideration of minor alternate alleles as suggested here in more extensive whole genome analysis of SARS-CoV-2.
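The ambiguity-code reporting discussed above can be illustrated with a short sketch that turns per-position base counts into a single IUPAC symbol. This is not iVar's implementation and is not claimed to reproduce its output; the function name, the dictionary input, and the defaults (5% minimum allele frequency, 100 reads minimum depth) are assumptions chosen here to mirror the thresholds quoted in this paper.

```python
from typing import Mapping

# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases observed.
IUPAC_CODES = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def ambiguity_code(counts: Mapping[str, int],
                   min_freq: float = 0.05,
                   min_depth: int = 100) -> str:
    """Return the IUPAC symbol for one position.

    counts maps each base (A, C, G, T) to its supporting read count.
    Bases with frequency above min_freq are kept; positions with fewer
    than min_depth reads are reported as N (an assumption of this sketch).
    """
    depth = sum(counts.values())
    if depth < min_depth:
        return "N"
    kept = frozenset(base for base, n in counts.items() if n / depth > min_freq)
    return IUPAC_CODES.get(kept, "N")


# Toy read counts (illustrative values only).
print(ambiguity_code({"A": 900, "G": 100}))   # A and G both above 5%
print(ambiguity_code({"A": 980, "G": 20}))    # G below 5%, so only A is kept
```

A majority-consensus caller would report "A" for both toy positions, which is the loss of information the ambiguity codes are meant to avoid.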
In our assessment of transmission from infected donors to recipients we were also interested to see that the overall sequence diversity was largely retained. This appeared to be true regardless of the AAF:RAF mixtures (Figs. 5b to 5d). These transmission plots provide both quantitative and qualitative perspectives on SARS-CoV-2 complexity. Many of the AAF:RAF mixtures were observed to be present in donor and recipient sequences at nearly identical ratios. For the events captured in commuter vans (two independent events) where the direction of transmission was known to be from infected drivers to uninfected passengers, this suggests that the majority of SARS-CoV-2 strain diversity is included during transmission and subsequent infection complexity [38]. This appeared to be true even though transmission occurred within the context of circulating air. Maximum likelihood estimates of the bottlenecks among our transmission pairs (van driver to passengers as well as others; Fig. 5a) were similar to or higher than those observed by Braun [35] and Lythgoe [34], but did not reach the highest values reported by Popa [32].
Finally, given reports that have suggested evolution of complex populations of SARS-CoV-2 in immunocompromised people [44-47], we were interested to compare infections between IM+ and IM- individuals. While normalizing for overall complexity by comparing infections that occurred within time-similar cohorts, we did not observe increased numbers of iSNVs in IMpatients. This appears to differ from the observations of Al Khatib et al., who reported increased numbers of biallelic subconsensus variants (indicating AAF:RAF mixtures) in individuals classified as severe cases compared to individuals with mild cases [30] with comorbidities similar to the IM- patients studied here.
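Returning to the bottleneck estimates discussed above, the idea of scoring donor and recipient allele frequencies against a candidate bottleneck size can be sketched as follows. This is a simplified illustration in the spirit of the approximate beta-binomial formulation, not the exact method of reference [41] used in this study: it ignores sequencing noise, handles only variants detected in both donor and recipient, and all function names and the grid search are assumptions made here.

```python
import math
from typing import List, Tuple


def log_binom_pmf(k: int, n: int, p: float) -> float:
    """Log probability that k of n founding virions carry the variant."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))


def log_beta_pdf(x: float, a: float, b: float) -> float:
    """Log density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    return ((a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x)
            + math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))


def log_likelihood(pairs: List[Tuple[float, float]], nb: int) -> float:
    """Log-likelihood of bottleneck size nb given (donor, recipient) frequencies."""
    total = 0.0
    for donor_freq, recipient_freq in pairs:
        if not (0.0 < donor_freq < 1.0 and 0.0 < recipient_freq < 1.0):
            raise ValueError("frequencies must lie strictly between 0 and 1")
        site = 0.0
        # k = 0 and k = nb give degenerate Beta distributions and are skipped.
        for k in range(1, nb):
            site += math.exp(log_binom_pmf(k, nb, donor_freq)
                             + log_beta_pdf(recipient_freq, k, nb - k))
        if site <= 0.0:
            return float("-inf")
        total += math.log(site)
    return total


def mle_bottleneck(pairs: List[Tuple[float, float]], max_nb: int = 200) -> int:
    """Grid search for the bottleneck size with the highest log-likelihood."""
    return max(range(2, max_nb + 1), key=lambda nb: log_likelihood(pairs, nb))


# Toy frequency pairs (illustrative values only).
similar = [(0.30, 0.30), (0.20, 0.20), (0.40, 0.40)]
divergent = [(0.30, 0.05), (0.20, 0.60), (0.40, 0.10)]
print(mle_bottleneck(similar, max_nb=100), mle_bottleneck(divergent, max_nb=100))
```

In this sketch, frequencies that are nearly identical between donor and recipient push the estimate toward the upper end of the search grid, whereas divergent frequencies favor a small bottleneck, which matches the qualitative reading of the transmission plots.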
It is also important to acknowledge that we were not able to study single patients over extended time series, which other investigators have followed for up to one year [46]. Therefore, we did not have the opportunity to compare potential evolution of SARS-CoV-2 variants within individual IM+ and IM- patients over time.
The continuing dynamics of the COVID-19 pandemic have become increasingly complex and unpredictable. Underlying the challenge of COVID-19 transmission among humans, Sender et al. estimate that every infected person carries 1 billion to 100 billion virions during peak infection [50]. As a result, despite proofreading activity within the SARS-CoV-2 RdRp that would limit mutation, efficient virus replication and transmission increasingly favor dispersal of mutations across abundant globally distributed infections (>460,000,000 to date [22, 23]). This emphasizes the need to study genetic complexity of SARS-CoV-2 infections more intensively and requires increased examination to achieve a greater understanding of the capacity of this virus to evade our best efforts at mitigation. While there are efforts to sequence the virus and document how it is evolving, most of the scientific community recognizes that this effort needs to be increased [51-56]. Coupled with this effort, approaches to report characteristics of multi-strain complex infections must be considered to enable closer monitoring and full understanding of the virus's evolutionary capacity. Lest it be thought that under-reporting of the genetic diversity of infections occurs only with SARS-CoV-2 genome sequence data, consensus sequence reporting is the submission practice for most (if not all) infectious disease pathogens to national and international data repositories.
As part of efforts to identify, provide care for infected individuals and limit the spread of COVID- Nasopharyngeal swab specimens were collected from patients and healthcare personnel.
Clinical information was not available for healthcare personnel. For the patients, medical record review was performed to obtain information on age, medical conditions, medications, and severity of illness. COVID-19 was considered severe if oxygen saturation was <94% on room air or respiratory rate was >30 breaths/min or lung infiltrates >50% were present (https://www.covid19treatmentguidelines.nih.gov/overview/clinical-spectrum/). Patients were considered to be moderately or severely immunocompromised if they had conditions causing immunocompromise as defined by the Centers for Disease Control and Prevention (e.g., receiving active cancer treatment for tumors or cancers of the blood, having received an organ or stem cell transplant and taking medicine to suppress the immune system, advanced or untreated HIV infection, or active treatment with high-dose corticosteroids or other drugs that suppress the immune response) (https://www.cdc.gov/coronavirus/2019-ncov/vaccines/recommendations/immuno.html).
All nasopharyngeal swabs were immersed in 2 mL of universal transport medium (Copan Diagnostics, Murrieta, CA) and stored at -80°C. RNA was extracted from 140 µL of the stored sample to yield approximately 100 µL of purified RNA following Qiagen protocols (either QIAamp Viral RNA Mini Kit or the QIAcube HT, automated 96-well plate format). were performed for each new primer/probe lot used in this assay. This standard contains RNA transcripts for the N gene of SARS-CoV-2. Additional controls for E, ORF1ab, RdRp, and S genes were available, but stability of N gene amplification did not prompt varying RT-PCR assessments. The target gene transcript was quantitated to 200,000 copies/mL using digital droplet PCR. Each RT-qPCR run contained positive and no-template controls, in addition to the negative control from the extraction process.
Whole-genome sequencing of SARS-CoV-2 was performed on 254 clinical samples from COVID- Genomics (dual indexed CleanPlex SARS-CoV-2 Panel 918010 or 918011) [59]. Sequence was generated using an Illumina MiSeq sequencing platform. Raw sequences were aligned to the SARS-CoV-2 reference sequence NC_045512.2. Primer sequences were trimmed from the ends of aligned reads using a masking file provided by Paragon Genomics and the software package fgbio. Samples were included in the final analysis if at least 80% of the genome was covered at > 100X read depth. Variants were called at positions with a minimum coverage of 100 reads with a base quality score >20. Positions were considered variable if any sample showed an alternate allele frequency at that position greater than 5%. These inclusion criteria are consistent with and, in many ways, exceed benchmarks included in recently published quality assessments to evaluate efforts to sequence and describe SARS-CoV-2 genome sequence variation [60, 61].
We expanded our analyses to determine whether our observations of frequent biallelic iSNVs were characteristic of SARS-CoV-2 sequences submitted to international repositories. To perform this analysis, we randomly selected and analyzed 110 additional COVID-19 infections reported from 18 different laboratories or consortia with raw sequence data available in the sequence read archive (SRA) (Supplemental Table 4). Inclusion of SARS-CoV-2 full genome sequence for all samples in the final analysis (generated by the current study or outside labs) required that at least 80% of the genome be covered at > 100X read depth. Variants were called at positions with a minimum coverage of 100 reads with a base quality score >20 using Samtools mpileup v1.8. Positions were considered variable if any sample showed an alternate allele frequency at that position greater than 5%.
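The variant-calling thresholds above (minimum depth of 100 reads, AAF above 5% to call a position variable, AAF above 50% to enter the majority consensus) can be sketched as a simple per-site classifier, together with the share of called variants that a majority-consensus report would omit. This is a minimal illustration under stated assumptions: the `Site` container, the label names, and the toy counts are hypothetical, and base-quality filtering is assumed to have been applied upstream.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Site:
    """Read counts at one genome position in one sample (hypothetical container)."""
    position: int
    ref_reads: int   # reads supporting the reference allele
    alt_reads: int   # reads supporting the alternate allele


def alternate_allele_frequency(site: Site) -> float:
    """AAF = alternate reads / (reference reads + alternate reads)."""
    depth = site.ref_reads + site.alt_reads
    return site.alt_reads / depth if depth else 0.0


def classify(site: Site, min_depth: int = 100,
             min_aaf: float = 0.05, consensus_aaf: float = 0.50) -> str:
    """Label a site as uncalled, reference, subconsensus, or consensus."""
    if site.ref_reads + site.alt_reads < min_depth:
        return "uncalled"
    aaf = alternate_allele_frequency(site)
    if aaf > consensus_aaf:
        return "consensus"
    if aaf > min_aaf:
        return "subconsensus"
    return "reference"


def unreported_fraction(sites: Iterable[Site]) -> float:
    """Share of called variants that a majority-consensus sequence would omit."""
    labels = [classify(site) for site in sites]
    variants = [label for label in labels if label in ("consensus", "subconsensus")]
    if not variants:
        return 0.0
    return variants.count("subconsensus") / len(variants)


# Toy sites (illustrative positions and counts only).
toy_sites = [
    Site(100, 10, 990),    # AAF 0.99: reported in the consensus
    Site(200, 700, 300),   # AAF 0.30: variable but below consensus
    Site(300, 920, 80),    # AAF 0.08: variable but below consensus
    Site(400, 990, 10),    # AAF 0.01: treated as reference
    Site(500, 30, 20),     # depth 50: not called
]
print(unreported_fraction(toy_sites))
```

In the toy data, two of the three called variants fall between 5% and 50% and would be invisible in a majority-consensus sequence.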
Consensus sequences were generated from these data by following conventional approaches, reporting the major nucleotide represented at each nucleotide position. Therefore, variant nucleotides in relation to NC_045512.2 were reported for any genomic position at which an alternate allele was detected at >50% in a sequenced sample. Additionally, sequences with ambiguity codes were generated for allele frequencies between 5-95% using iVar Consensus (v.1.3), setting the minimum threshold t=0.95. Thus, in contrast to reporting sequence variation as limited to the standard nucleotide states (A-adenine; G-guanine; C-cytosine; U-uracil/T-thymine), iVar reports standard ambiguity codes to identify varying positions in the SARS-CoV-2 genome where more than one nucleotide state was observed at a frequency > 5% at a specific genome position (e.g., R = observation of A and G [40]; see Supplement). These consensus sequences inclusive of biallelic ambiguity codes have been reported to GenBank (NCBI GenBank Accession numbers OM988245-OM988384; Supplemental Table 3) and GISAID. Additionally, raw sequencing data have been submitted to the SRA. Evaluation of ambiguity code usage in sequence data available through Nextstrain was performed using awk on metadata available for 1,801,208 SARS-CoV-2 sequences accessible through Nextstrain up to October 3, 2021; data accessed on October 10, 2021.
We thank the patients, administrators, engineering personnel, scientific and medical staff for their assistance with this study. We acknowledge Simone Edelheit for performing Illumina
Legend to Fig. 1 Fig. 1 ) are designated by a green dot.
Table S1. SARS-CoV-2 Gene-Specific SNV Counts (Fig. 2 ) and the samples queried in the SRA are designated by a green dot.
A new coronavirus associated with human respiratory disease in China A novel coronavirus associated with a respiratory disease in Wuhan of Hubei province The Genome sequence of the SARS-associated coronavirus Epub 2003/05/06 Characterization of a novel coronavirus associated with severe acute respiratory syndrome Epub 2003/05/06 A Novel Coronavirus from Patients with Pneumonia in China No evidence for increased transmissibility from recurrent mutations in SARS-CoV-2 Coronaviruses: an RNA proofreading machine regulates replication fidelity and diversity Epub 2011/05/20 Discovery of an RNA virus 3'->5' exoribonuclease that is critically involved in coronavirus RNA synthesis Rates of evolutionary change in viruses: patterns and determinants Low genetic diversity may be an Achilles heel of SARS-CoV-2 The origin and underlying driving forces of the SARS-CoV-2 outbreak Evolutionary history, potential intermediate animal host, and cross-species analyses of SARS-CoV-2 Moderate mutation rate in the SARS coronavirus genome and its implications GISAID's Role in Pandemic Response Nextstrain: realtime tracking of pathogen evolution An interactive web-based dashboard to track COVID-19 in real time COVID-19Predict -Predicting Pandemic Trends Geographic and Genomic Distribution of SARS-CoV-2 Mutations Epub 2020/08/15 Molecular epidemiology of the first wave of severe acute respiratory syndrome coronavirus 2 infection in Thailand in 2020 Analysis of genomic distributions of SARS-CoV-2 reveals a dominant strain type with strong allelic associations Global SNP analysis of 11,183 SARS-CoV-2 strains reveals high genetic diversity Global Initiative on Sharing All Influenza Data (GISAID) Within-Host Diversity of SARS-CoV-2 in COVID-19 Patients With Variable Disease Severities Genomic epidemiology of superspreading events in Austria reveals mutational dynamics and transmission properties of SARS-CoV-2 Intra-host variation and evolutionary dynamics of SARS-CoV-2 populations in COVID-19 
patients CoV-2 within-host diversity and transmission SARS-CoV-2 infections harbor limited within-host diversity and transmit via tight transmission bottlenecks Use of whole-genome sequencing to investigate a cluster of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections in emergency department personnel Investigation of a cluster of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections in a hospital administration building Transmission of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) in a Patient Transport Van The 2022 Nucleic Acids Research database issue and the online molecular biology database collection Nomenclature for incompletely specified bases in nucleic acid sequences: recommendations 1984 Transmission Bottleneck Size Estimation from Pathogen Deep-Sequencing Data, with an Application to Human Influenza A Virus High-resolution Genomic Surveillance of Ebolavirus Using Shared Subclonal Variants Quantifying influenza virus diversity and transmission in humans SARS-CoV-2 variants in immunocompromised COVID-19 patients: The underlying causes and the way forward Emergence of Multiple SARS-CoV-2 Antibody Escape Variants in an Immunocompromised Host Undergoing Convalescent Plasma Treatment. mSphere Year-long COVID-19 infection reveals within-host evolution of SARS-CoV-2 in a patient with B cell depletion Within-host evolution of SARS-CoV-2 in an immunosuppressed COVID-19 patient as a source of immune escape variants An Overview of COVID-19 in Mutational cascade of SARS-CoV-2 leading to evolution and emergence of omicron variant The total number and mass of SARS-CoV-2 virions Novel SARS-CoV-2 variants: the pandemics within the pandemic. 
Clinical microbiology and infection : the official publication of the European Society of Clinical Microbiology and Infectious Diseases Multitude of coronavirus variants found in the US -but the threat is unclear Global variation in sequencing impedes SARS-CoV-2 Want to track pandemic variants faster? Fix the bioinformatics bottleneck Omicron blindspots: why it's hard to track coronavirus variants Why US coronavirus tracking can't keep up with concerning variants Centers for Disease Control and Prevention. Interim U.S. guidance for risk assessment and work restrictions for healthcare personnel 160 Accelerated Emergency Use Authorization (EUA) Summary Covid-19 RT-PCR Test (Laboratory Corporation of America) Highly sensitive and full-genome interrogation of SARS-CoV-2 using multiplexed PCR enrichment followed by next-generation sequencing Assessment of SARS-CoV-2 Genome Sequencing Quality Criteria and Low-Frequency Variants External Quality Assessment of SARS-CoV-2 Sequencing: an ESGMD-SSM Pilot Trial across 15 European Laboratories
[Table residue: per-position sample counts by allele frequency class (row labels include "<50%" and "5-95%"; column labels include "Gene" and "AA"); individual values are not recoverable from the extracted text.]