key: cord-0697224-waxm01v9
authors: Choudhary, Manish C; Crain, Charles R; Qiu, Xueting; Hanage, William; Li, Jonathan Z
title: SARS-CoV-2 Sequence Characteristics of COVID-19 Persistence and Reinfection
date: 2021-04-27
journal: Clin Infect Dis
DOI: 10.1093/cid/ciab380
sha: 3ab79d0029ad5aaa2a2b6aaa758624e8fc56d62f
doc_id: 697224
cord_uid: waxm01v9
BACKGROUND: Both SARS-CoV-2 reinfection and persistent infection have been reported, but sequence characteristics in these scenarios have not been described. We assessed published cases of SARS-CoV-2 reinfection and persistence, characterizing the hallmarks of reinfecting sequences and the rate of viral evolution in persistent infection.
METHODS: A systematic review of PubMed was conducted to identify cases of SARS-CoV-2 reinfection and persistence with available sequences. Nucleotide and amino acid changes in the reinfecting sequence were compared to both the initial and contemporaneous community variants. Time-measured phylogenetic reconstruction was performed to compare intra-host viral evolution in persistent SARS-CoV-2 to community-driven evolution.
RESULTS: Twenty reinfection and nine persistent infection cases were identified. Reports of reinfection cases spanned a broad distribution of ages, baseline health status, and reinfection severity, with reinfection occurring from as early as 1.5 months to more than 8 months after the initial infection. The reinfecting viral sequences had a median of 17.5 nucleotide changes with enrichment in the ORF8 and N genes. The number of changes did not differ by the severity of reinfection, and reinfecting variants were similar to the contemporaneous sequences circulating in the community. Patients with persistent COVID-19 demonstrated more rapid accumulation of sequence changes than seen with community-driven evolution, with continued evolution during convalescent plasma or monoclonal antibody treatment.
CONCLUSIONS: Reinfecting SARS-CoV-2 viral genomes largely mirror contemporaneous circulating sequences in the same geographic region, while persistent COVID-19 has been largely described in immunosuppressed individuals and is associated with accelerated viral evolution.
deal of uncertainty over the viral characteristics of reinfection cases, including the degree of sequence heterogeneity and the location of new mutations between the initial and reinfecting variants, if any. In addition, the diagnosis of COVID-19 reinfection has been complicated by the increasing reports of persistent COVID-19 infection, especially in immunosuppressed individuals. Like reinfection cases, persistent COVID-19 can also span the range of disease severity, from asymptomatic to severe disease, and recurrent symptoms can last for months [8-11]. Differentiating between persistence and reinfection can be challenging, and little is known about differences in the location and quantity of SARS-CoV-2 mutations in these scenarios. We performed an analysis of SARS-CoV-2 sequences from published cases of COVID-19 reinfection and persistence, characterizing the hallmarks of reinfecting sequences and the rate of viral evolution in persistent infection.
We conducted a systematic literature review in PubMed through March 8, 2021 for cases of persistent COVID-19 using the search term "((covid or sars-CoV-2) AND (persistent or persistence or prolonged)) AND (sequence or evolution)". A search for COVID-19 reinfection reports was made using the terms "(covid or sars-CoV-2) AND (reinfection)". Both peer-reviewed and preprint results were evaluated. We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines for reviewing the literature and for reporting search results. Additional preprints that were found through Google search and that met our criteria were also included.
For cases of reinfection, papers were included if the authors described the case as a reinfection diagnosed >30 days after the initial infection and if whole genome SARS-CoV-2 sequences or sites of mutations relative to a reference sequence (e.g., Wuhan-Hu-1) from both infection time-points were available. Of the 291 results from the search, 14 articles met the inclusion criteria and were included in the present report along with 2 additional preprints that were identified (Supplemental Figure 1A). Persistent cases were included if the authors described the case as persistent COVID-19 infection and if longitudinal whole genome SARS-CoV-2 sequences were available. The search returned 129 results, 7 of which met the inclusion criteria and were included in the present report along with one other preprint (Supplemental Figure 1B). Only sequences obtained directly from patient respiratory tract samples were included in our analysis to exclude the possibility of sequence changes during the ex vivo culture process. Three cases were excluded due to uncertainty in their classification as either reinfection or persistent infection cases (Supplemental Methods, Supplemental
The sequencing dataset contained a total of 262 globally representative SARS-CoV-2 genomes selected from GISAID and sequences from the reinfection and persistence cases (Supplemental Methods; Supplemental Data 1). The sampled sequences were chosen to be representative of global sequence diversity throughout the time course of the pandemic. Sequences of variants of concern B.1.1.7 and B.1.351 were also included. Nucleotide sequence alignment was performed using MAFFT (Multiple Alignment using Fast Fourier Transform) [12]. The best-fit nucleotide substitution model was determined by model selection, followed by maximum likelihood (ML) phylogenetic tree construction using IQ-Tree with 1000 bootstrap replicates [12].
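As a minimal sketch of how sites of mutation relative to a reference sequence (e.g., Wuhan-Hu-1) can be listed once two genomes share an alignment, the Python function below compares two aligned rows column by column. The function name `mutation_sites`, the decision to skip ambiguous `N` calls, and the choice to count every deleted column separately are illustrative assumptions, not details taken from the reports analyzed here.

```python
from typing import List, Tuple


def mutation_sites(reference: str, query: str) -> List[Tuple[int, str, str]]:
    """List differences between two rows of the same alignment.

    Returns (reference position, reference base, query base) tuples.
    Positions are 1-based and counted on the ungapped reference, so a
    column where the reference has a gap (an insertion in the query) is
    reported at the position of the preceding reference base.
    Assumption: columns with an ambiguous 'N' in either row are skipped.
    """
    if len(reference) != len(query):
        raise ValueError("aligned sequences must have equal length")

    sites: List[Tuple[int, str, str]] = []
    ref_pos = 0
    for ref_base, query_base in zip(reference.upper(), query.upper()):
        if ref_base != "-":
            ref_pos += 1  # advance only on real reference bases
        if ref_base == "N" or query_base == "N":
            continue  # ambiguous call: no evidence of a change
        if ref_base != query_base:
            sites.append((ref_pos, ref_base, query_base))
    return sites


if __name__ == "__main__":
    # Toy aligned fragments, not real SARS-CoV-2 sequence.
    for site in mutation_sites("ATG-CCGTA", "ATGACTG-N"):
        print(site)
```

Applying the same function to the initial and reinfecting sequences of one patient, instead of a reference and a query, gives the pairwise count of changes; a multi-base deletion would then need to be collapsed into one event if deletions are to be counted once.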
For reinfection cases, mutations were determined in two ways. First, nucleotide and amino acid changes were identified for the reinfection sequences relative to the first infection sequence.
The temporal signal of the ML tree was examined in TempEst [14] by regressing root-to-tip divergence against sampling time, and outliers were inspected in the distribution of residuals. A high degree of clock-like behavior in the whole dataset was observed (R² = 0.721) in root-to-tip regression analysis with the slope rate as 8.
A total of twenty cases from sixteen reports were included in this analysis ( Phylogenetic analysis demonstrated distinct branching for the two sequences in each of the reinfection cases, corroborating results discussed in the original reports (Figure 1). We compared nucleotide and amino acid changes in the reinfecting viral sequence to the initial sequence and found a median of 17.5 nucleotide changes (range 9-37) and 9 amino acid changes (range 6-24) (Figure 2A). The nucleotide changes between the initial and reinfecting sequences were distributed across the SARS-CoV-2 genome, with significantly higher frequencies of changes in ORF8 (P<0.001) and N (P=0.001) (Figure 2B). A similar pattern was observed with amino acid changes (Supplemental Figure 3A). All but two reinfection cases had at least one substitution or deletion in the S gene (Supplemental Table 3). Next, we assessed whether Figure 3C).
A total of nine cases from seven reports describing persistent infection were retrieved from our literature search. Of these nine cases, all but one had B cell immunodeficiency [8-10, 26-29]. Four were treated with B cell-depleting therapy for lymphoma or autoimmune disorders, while four had B cell lymphomas treated with chemotherapy (Table 2). One patient had advanced HIV infection with a CD4+ count of 0 cells/mm³ and diminished CD19+ cell counts.
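The root-to-tip regression used above to check for temporal signal can be illustrated with an ordinary least-squares fit of divergence on sampling date. This is a sketch under stated assumptions rather than TempEst's own procedure: it assumes root-to-tip distances have already been read from a rooted tree, it does not search for the best-fitting root, and the function name `root_to_tip_regression` and the toy values are hypothetical.

```python
from typing import List, Sequence, Tuple


def root_to_tip_regression(
    dates: Sequence[float], divergence: Sequence[float]
) -> Tuple[float, float, float, List[float]]:
    """Least-squares fit of divergence = slope * date + intercept.

    dates      : sampling dates as decimal years (e.g. 2020.5)
    divergence : root-to-tip distance of each tip (substitutions/site)
    Returns (slope, intercept, r_squared, residuals). The slope is a
    crude rate estimate in substitutions/site/year, and large residuals
    flag tips worth inspecting as outliers.
    """
    n = len(dates)
    if n != len(divergence) or n < 2:
        raise ValueError("need two or more paired observations")

    mean_t = sum(dates) / n
    mean_d = sum(divergence) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    syy = sum((d - mean_d) ** 2 for d in divergence)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, divergence))
    if sxx == 0:
        raise ValueError("all sampling dates are identical")

    slope = sxy / sxx
    intercept = mean_d - slope * mean_t
    # R^2 of a simple linear regression equals the squared correlation.
    r_squared = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    residuals = [d - (slope * t + intercept) for t, d in zip(dates, divergence)]
    return slope, intercept, r_squared, residuals


if __name__ == "__main__":
    # Toy values chosen for illustration only, not data from this study.
    toy_dates = [2020.0, 2020.25, 2020.5, 2020.75, 2021.0]
    toy_div = [0.0001, 0.0003, 0.0004, 0.0007, 0.0008]
    slope, intercept, r2, resid = root_to_tip_regression(toy_dates, toy_div)
    print(f"rate ~ {slope:.2e} subs/site/year, R^2 = {r2:.3f}")
```

An R² close to 1 indicates clock-like accumulation of substitutions, which is the property that justifies the time-measured phylogenetic reconstruction that follows.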
The median length of infection was 154 days and 33% of the cases ended in death. One patient had asymptomatic disease throughout [9]. Four patients were treated with convalescent plasma at least once during their illness [9, 10, 26, 28], and one patient was treated with the monoclonal antibodies casirivimab and imdevimab [8]. Figure 5B). Treatment with convalescent plasma or an antibody cocktail was insufficient to halt intra-host viral evolution (Figure 3C; Supplemental Figure 5C). We also performed time-measured phylogenetic reconstruction with the pretreatment persistent sequences to compare the rate of intra-host viral evolution in persistent COVID-19 to the rate of community-driven evolution. This analysis provided further evidence that SARS-CoV-2 evolution appeared faster in these individuals with persistent infection than in the general population, though these estimates carry substantial uncertainty given the limited sequence sampling in each patient (Figure 3D; Supplemental Table 4).
We [30, 31]. While most cases of SARS-CoV-2 reinfection did involve infection with a different clade (including the variants of concern B.1.1.7 and P.1), it is noteworthy that mutations were identified throughout the genomes and the frequency of mutations within the S gene was not elevated relative to the rest of the genome. In addition, individuals with more severe reinfections did not have a significantly greater frequency of S gene mutations. Interestingly, the genes with the highest frequency of mutations were ORF8 and N. ORF8 is a rapidly evolving accessory protein that may antagonize host immune function [32], while the nucleocapsid is a vital structural protein that also serves as a target for both humoral and cell-mediated immune responses [33]. Finally, the presence of rare mutations was uncommon in the reinfecting virus, which largely mirrored the contemporaneously circulating interval can be found in Supplemental Table 4.
Immunological memory to SARS-CoV-2 assessed for up to 8 months after infection
Symptomatic SARS-CoV-2 reinfection by a phylogenetically distinct strain
COVID-19 re-infection by a phylogenetically distinct SARS-coronavirus-2 strain confirmed by whole genome sequencing
Genomic evidence for reinfection with SARS-CoV-2: a case study
Symptomatic SARS-CoV-2 reinfection of a health care worker in a Belgian nosocomial outbreak despite primary neutralizing antibody response
Confirmed Reinfection with SARS-CoV-2 Variant VOC-202012/01
Reinfection with SARS-CoV-2 and Failure of Humoral Immunity: a case report
Persistence and Evolution of SARS-CoV-2 in an Immunocompromised Host
Case Study: Prolonged Infectious SARS-CoV-2 Shedding from an Asymptomatic Immunocompromised Individual with Cancer
Prolonged Severe Acute Respiratory Syndrome Coronavirus 2 Replication in an Immunocompromised Patient
Neutralising antibodies in Spike mediated SARS-CoV-2 adaptation
W-IQ-TREE: a fast online phylogenetic tool for maximum likelihood analysis
Circos: An information aesthetic for comparative genomics
Exploring the temporal structure of heterochronous sequences using TempEst (formerly Path-O-Gen)
Bayesian phylogenetics with BEAUti and the BEAST 1.7
A case of SARS-CoV-2 reinfection in Ecuador
Spike E484K mutation in the first SARS-CoV-2 reinfection case confirmed in Brazil
Evidence of SARS-CoV-2 re-infection with a different genotype
Genomic Evidence of a Sars-Cov-2 Reinfection Case With E484K Spike Mutation in Brazil
Asymptomatic reinfection in two healthcare workers from India with genetically distinct SARS-CoV-2
Assessment of the risk of SARS-CoV-2 reinfection in an intense re-exposure setting
Three SARS-CoV-2 reinfection cases by the new Variant of Concern (VOC) P.1/501Y.V3
Clinical, virologic and immunologic features of a mild case of SARS-CoV-2 reinfection
Evidence of SARS-CoV-2 reinfection without mutations in Spike protein
Recurrent COVID-19 including evidence of reinfection and enhanced severity in thirty Brazilian healthcare workers
SARS-CoV-2 evolution during treatment of chronic infection
Long term SARS-CoV-2 infectiousness among three immunocompromised patients: from prolonged viral shedding to SARS-CoV-2 superinfection
Persistent SARS-CoV-2 infection and increasing viral variants in children and young adults with impaired humoral immunity
Long-term evolution of SARS-CoV-2 in an immunocompromised patient with non-Hodgkin lymphoma
SARS-CoV-2 501Y.V2 escapes neutralization by South African COVID-19 donor plasma
Escape from neutralizing antibodies by SARS-CoV-2 spike protein variants
Structure of SARS-CoV-2 ORF8, a rapidly evolving coronavirus protein implicated in immune evasion
SARS-CoV-2-derived peptides define heterologous and COVID-19-induced T cell recognition
Prospective mapping of viral mutations that escape antibodies used to treat COVID-19
Assessing transmissibility of SARS-CoV-2 lineage B.1.1.7 in England
Detection of a SARS-CoV-2 variant of concern in South Africa
Persistent replication of SARS-CoV-2 in a severely immunocompromised treated with several courses of remdesivir
Intractable COVID-19 and Prolonged SARS-CoV-2 Replication in a CAR-T-cell Therapy Recipient: A Case Study
Remdesivir failure with SARS-CoV-2 RNA-dependent RNA-polymerase mutation in a B-cell immunodeficient patient with protracted Covid-19
Persistent COVID-19 in an Immunocompromised Patient Temporarily Responsive to Two Courses of Remdesivir Therapy
The longest persistence of viable SARS-CoV-2 with recurrence of viremia and relapsing symptomatic COVID-19 in an immunocompromised patient - a case study
Correlates of protection against SARS-CoV-2 in rhesus macaques
Low genetic diversity may be an Achilles heel of SARS-CoV-2
We thank Jeremy Luban and Ronald Bosch for their feedback and discussion. Dr. Li has consulted for Abbvie. All other authors have no potential conflicts to disclose.
Figure 3