key: cord-0787777-un5ynsrq
authors: Wilkinson, S. A.; Richter, A.; Casey, A.; Osman, H.; Mirza, J. D.; Stockton, J.; Quick, J.; Ratcliffe, L.; Sparks, N.; Cumley, N.; Poplawski, R.; Nicholls, S.; Kele, B.; Harris, K.; The COVID-19 Genomics UK consortium; Peacock, T. P.; Loman, N. J.
title: Recurrent SARS-CoV-2 Mutations in Immunodeficient Patients
date: 2022-03-02
journal: nan
DOI: 10.1101/2022.03.02.22271697
sha: fb8013567f0fd5ebfe32e7ab0ee3215726fd7c5a
doc_id: 787777
cord_uid: un5ynsrq
Long-term SARS-CoV-2 infections in immunodeficient patients are an important source of variation for the virus but are understudied. Many case studies have been published which describe one or a small number of long-term infected individuals, but no study has combined these sequences into a cohesive dataset. This work aims to rectify this and to study the genomics of this patient group through a combination of literature searches and the identification of new case series directly from the COG-UK dataset. The spike gene receptor binding domain (RBD) and N-terminal domain (NTD) were identified as mutation hotspots. Numerous mutations associated with variants of concern were observed to emerge recurrently. Additionally, a mutation in the envelope gene, T30I, was determined to be the most frequently recurring mutation arising in persistent infections. A high proportion of recurrent mutations in immunodeficient individuals are associated with ACE2 affinity, immune escape, or viral packaging optimisation. There is an apparent selective pressure for mutations which aid intra-host transmission or persistence, and these are often different from the mutations which aid inter-host transmission. Nevertheless, the fact that multiple recurrent de novo mutations are considered defining for variants of concern strongly indicates that this potential source of novel variants should not be discounted.
Long-term SARS-CoV-2 infections in immunodeficient patients are important but understudied (Moran et al., 2021). Evolution of viruses during long-term infection is an important source of novel variation and is thought to be a key influence on the evolutionary dynamics of SARS-CoV-2 generally, and on the emergence of new variants specifically. Notably, Alpha and Omicron, which were responsible for recent epidemic waves globally, are hypothesised by some to have arisen during long-term infections (Msomi et al., 2021; Rambaut et al., 2020). The Alpha variant (B.1.1.7) emerged abruptly with a constellation of novel mutations and a long branch length from its nearest common ancestor in the B.1.1 clade, during a time of extremely high surveillance in the UK (Rambaut et al., 2020). A likely explanation is that the Alpha variant evolved within a single long-term host over a long period before emerging back into the general population. Evolution during long-term infection has been associated with the rapid accumulation of many mutations within a short period (Avanzato et al., 2020; Choi et al., 2020; Karim et al., 2021; Peacock et al., 2021; Riddell et al., 2022). The Beta (B.1.351), Gamma (P.1), and Omicron (B.1.1.529) variants all emerged in similar circumstances to Alpha, potentially suggesting that they also emerged from long-term infections. To better understand the evolutionary pressures associated with viral evolution during long-term infections, a dataset of 168 SARS-CoV-2 genomes from 28 patients, with a range of conditions that result in immunodeficiency significant enough to prevent rapid viral clearance, was compiled to examine the frequency of recurrent mutations. This builds upon previous work that performed a similar analysis using case studies which included a total of ten patients (Peacock et al., 2021).
This analysis expands on previous studies in three ways: it uses a significantly larger dataset, which increases statistical power; many of the included cases are Alpha variant infections, which have not previously been discussed in the context of long-term SARS-CoV-2 cases and potentially give insight into future variant emergence; and all genome series were analysed using a single analysis pipeline.
Patient-associated genome series were selected for inclusion via a literature search for case studies using the following search terms and filters: After 2019, "SARS-CoV-2", "nCoV-2019", "Immunodeficient", "Immunocompromised", "long-term". All searches took place between 01/08/2021 and 30/11/2021. Other genome series were extracted from the COG-UK dataset, a UK-wide genomic surveillance repository (COVID-19 Genomics UK (COG-UK), 2020; Nicholls et al., 2021). Genome series were only included if they met the following criteria: at least two genomes were available, either on public databases or via a request; there was evidence of long-term viral infection for a period of no less than 28 days (some genome series covered a shorter period but the clinical information met this criterion); and the clinical information available was sufficient to indicate the nature of the patient's immune deficiency. For all genome series included in the dataset a Civet report (O'Toole et al., 2021a) was generated using Civet v3.0. These reports confirm that all genomes were the result of long-term infections rather than a superinfection or independent infection events, by virtue of individual genomes sharing a recent common ancestor with a step-wise accumulation of mutations over time. A single genome from patient 11 was excluded due to a probable superinfection as described by . Figures for each phylogeny produced by Civet were generated using ggtree (Yu et al., 2018) and are included within the supplementary material.
When a genome series was selected for inclusion, all genomes were placed within an individual multi-fasta file with a header identifying the patient via an identifier ("pt-1", "pt-2" etc.) and the number of days passed since the initial genome available within that genome series (the day 0 genome). In several cases this genome was collected after a lengthy period of active infection, but only the time period covered by the genome series was considered in the analysis. Mutation calling was automated with an R script adapted from Mercatelli et al. (2021), which utilises Nucleotide MUMmer (NUCmer) (Marçais et al., 2018) for genome alignment to an annotated SARS-CoV-2 reference sequence (NCBI accession NC_045512.2) and defines SNPs, insertions, deletions, frameshifts, and inversions relative to this reference. One change was made to the reference annotations: for NSP12 within the ORF1ab polyprotein gene, the position was adjusted by a single nucleotide so that all mutation calls would be relative to the reading frame after the ribosomal frameshift, for simplicity. Zero mutations were detected in the pre-frameshift region of NSP12, so no mutations were incorrectly annotated as a result. Processing of the mutation calls was performed with a Python script (https://github.com/BioWilko/recurrent-sars-cov-2-mutations/blob/main/mutation_call_analysis.py) to investigate de novo mutations, which were defined as mutations observed within a genome series which were not present at day 0 of that genome series. A cumulative count of each observed de novo mutation (DNM) was performed for each day between 0 and the maximum genome series length (218 days). When a deletion was observed, all deletions with a reference position within 18 nucleotides of the reference position of the initial deletion, regardless of length or position, were clustered as a single region. Ambiguous nucleotides were not considered in mutation calling.
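The DNM definition, the cumulative count, and the deletion clustering described above can be sketched as follows. This is a minimal illustration and not the authors' `mutation_call_analysis.py`: the function names, the input layout (patient identifier mapped to day mapped to a set of mutation labels), and the choice to anchor each deletion cluster at its lowest reference position are assumptions made here for the example.

```python
from collections import defaultdict

def de_novo_mutations(series):
    """series: {day: set of mutation labels} for one patient; must contain day 0.
    Returns {mutation: first day observed} for mutations absent at day 0."""
    baseline = series[0]
    first_seen = {}
    for day in sorted(series):
        for mut in series[day] - baseline:
            first_seen.setdefault(mut, day)  # keep the earliest day only
    return first_seen

def cumulative_counts(all_series, max_day):
    """For each DNM, the number of patients in whom it has appeared by each day."""
    first_days = defaultdict(list)
    for series in all_series.values():
        for mut, day in de_novo_mutations(series).items():
            first_days[mut].append(day)
    return {
        mut: [sum(d <= day for d in days) for day in range(max_day + 1)]
        for mut, days in first_days.items()
    }

def cluster_deletions(positions, window=18):
    """Group deletion start positions lying within `window` nt of the first
    (lowest, by assumption) position of a cluster into a single region."""
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][0] <= window:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters

# Toy data, not taken from the study dataset.
toy = {
    "pt-1": {0: {"S:N501Y"}, 20: {"S:N501Y", "S:E484K"},
             40: {"S:N501Y", "S:E484K", "E:T30I"}},
    "pt-2": {0: set(), 30: {"S:E484K"}},
}
counts = cumulative_counts(toy, max_day=40)
```

In this toy input S:N501Y is present at day 0 in pt-1, so it is never counted as a DNM, which mirrors the way day 0 Alpha-defining mutations are excluded from the counts in the text.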
The resultant dataframe was finally formatted with an R script and figures were generated using ggplot2 (Wickham, 2016).
Schematic of SARS-CoV-2 genome with relevant ORFs annotated. DNMs with the highest frequency are annotated by amino acid position and substitution; -X indicates that multiple amino acids form DNMs at this position.
medRxiv preprint doi: https://doi.org/10.1101/2022.03.02.22271697; this version posted March 2, 2022. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
number of occurrences is increased to twelve, clearly demonstrating an enrichment of DNMs at this locus. The DNMs at the locus S:484 consist of eight S:E484K, two S:E484G, and one each of S:E484Q and S:E484A. AA loci clustering highlighted the loci S:330, S:440, and S:501 as recurrent for DNMs (≥ two occurrences in the period). The only recurrent deletions observed in the dataset were located within the NTD of the S-gene: the S:Δ67 region (recurrent deletion region 1/RDR1), the S:Δ138 region (RDR2), and the S:Δ243 region (RDR4) (McCarthy et al., 2021). The S:Δ138 region was the most frequent with four occurrences, followed by the S:Δ67 and S:Δ243 regions with two occurrences each. Deletions within the S:Δ67 region consisted of one S:Δ67 and one S:Δ69-70. The unconventional annotation is the result of the algorithm used to cluster deletions; the genome series in which S:Δ67 occurred already possessed S:Δ69 in its day 0 genome. The S-gene constitutes just over one eighth of the overall SARS-CoV-2 genome by length; despite this, ~34% (79/234) of the total DNM occurrences were observed within the S-gene, as well as 59% (13/22) of the recurrent DNMs.
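The comparison between the S-gene's share of the genome and its share of DNM occurrences can be made explicit with a small calculation. This is an illustrative sketch rather than part of the authors' analysis: the S-gene coordinates (21563 to 25384) and the genome length (29,903 nt) are taken from the NC_045512.2 annotation rather than from the text, the function names are hypothetical, and the "expected" share assumes DNMs would otherwise fall uniformly along the genome.

```python
GENOME_LENGTH = 29903      # NC_045512.2 (assumption: not stated in the text)
S_GENE = (21563, 25384)    # inclusive coordinates in NC_045512.2 (assumption)

def length_share(start, end, genome_length=GENOME_LENGTH):
    """Fraction of the genome covered by an inclusive coordinate range."""
    return (end - start + 1) / genome_length

def fold_enrichment(observed, total, share):
    """Observed fraction of DNM occurrences divided by the length-based share."""
    return (observed / total) / share

s_share = length_share(*S_GENE)                 # just over one eighth
s_fold = fold_enrichment(79, 234, s_share)      # counts quoted in the text
print(f"S-gene length share: {s_share:.3f}")
print(f"Share of DNM occurrences: {79 / 234:.3f}")
print(f"Fold enrichment: {s_fold:.2f}")
```

Under these assumptions the S-gene carries between two and three times as many DNM occurrences as its length alone would predict.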
(encodes ORF10 protein), ORF3a (encodes ORF3a protein), ORF6 (encodes ORF6 protein), ORF7a (encodes ORF7a protein), and ORF8 (encodes ORF8 protein); the full details of the gene definitions used are available from . The first genome from each patient was considered to be day 0. The sampling periods and frequencies within the dataset were highly variable; 218 days was the longest time period covered within the dataset but the majority were much shorter. The full details of the dataset are available in supplementary table 1. All mutations with cumulative frequencies ≥ 2 were labelled on-graph.
When DNMs observed in these genes were clustered by AA loci the findings remained almost entirely unchanged, other than in the case of the locus M:2, which was raised to three DNM occurrences by day 218 rather than the two presented in Figure 3.
ORF1ab polyprotein subdivided by gene in 168 genomes obtained from 28 patients. NSP12 mutations were annotated relative to the reading frame after the ribosomal frameshift; no mutations were observed within NSP12 prior to this locus (13,468). The first genome from each patient was considered to be day 0. The sampling periods and frequencies within the dataset were highly variable; 218 days was the longest time period covered within the dataset but the majority were much shorter. The full details of the dataset are available in supplementary table 1. All mutations with cumulative frequencies ≥ 2 were labelled on-graph.
ORF1ab polyprotein genes, constituting the non-structural proteins (NSPs) of SARS-CoV-2, demonstrated a larger number of recurrent mutations but still far fewer than in spike (Figure 4).
Six DNMs were notable for their occurrence frequency: NSP3:T504P, NSP3:T820I, NSP3:P822L, NSP3:K977Q, NSP4:T295I, and NSP12:V792I. ORF1ab contained eighty-six out of the 195 DNMs observed, but only six of the twenty-one recurrent DNMs. ORF1ab constitutes more than two-thirds of the overall SARS-CoV-2 genome by length, making the number of overall DNMs within the polyprotein disproportionately lower than would be expected if the distribution were random.
When DNMs observed within ORF1ab were clustered by AA loci the overall shape of the results remains broadly identical, with two exceptions: NSP3:T504 and NSP3:P822, whose day 218 occurrences are raised to 3 and 4, respectively.
Figure 5: Spike mutational profiles of particular interest described by this study. Select spikes from late sequencing of 3 long-term Alpha infections shown as Spike schematics. Spike variants from WT Alpha, Delta and BA.1 Omicron shown for comparison. Mutations shown in grey are existing lineage-defining Alpha mutations. Mutations marked with an asterisk indicate mixed, but resolvable, bases in the sequence.
Each observed recurrent DNM was compared to the UKHSA VOC/VUI definition files (Table 2). S:E484K was the most frequent DNM to appear in VOC/VUI definitions with eleven appearances, then S:L452R with four, then S:T95I and the S:Δ138/RDR2 region with three each, followed by NSP3:K977Q, NSP3:P822L, S:Q498R, the S:Δ67/RDR1 region, and the S:Δ243/RDR4 region with one each. Of the twenty-one recurrent DNMs observed in the analysis, nine are considered defining mutations for a VOC/VUI.
Not all mutations are discussed in detail; while a literature search was performed for every recurrent DNM, only those with sufficient literature available for discussion to be informative are included below. The frequency of RBD DNMs observed in this analysis is a significant finding; the RBD is a relatively small region of the SARS-CoV-2 genome, making up less than two percent of the genome by length, yet RBD mutations account for 17 percent of all DNMs observed (Figure 1). It is clear that RBD mutations were the most strongly selected for in the immunocompromised patients included within the dataset. The sharp rise of S:E484K occurrences early in the period is biased due to the data from as a result of their sampling strategy and research focus.
specifically discussed the emergence of S:E484K in long-term immunocompromised patients and, to demonstrate this, published short periods of surveillance of these cases even though the patients in question had significantly longer shedding periods. However, even if this study is excluded, S:E484K remains the most frequently occurring DNM within spike. The high frequency of S:E484K occurrences is suggestive of a strong selective pressure; this is further demonstrated by the total of twelve DNMs observed at the S:484 locus. The two occurrences of S:E484G in the dataset also suggest that the glycine substitution is subject to different selection pressures from the lysine substitution in S:E484K, although this may be host dependent. In one of the two occurrences of S:E484G this change was transient and was replaced by S:E484K. There are two possible explanations for this observation: a secondary mutation converted S:E484G to S:E484K, or both mutations occurred within the patient and the S:E484K subpopulation outcompeted the S:E484G subpopulation to become dominant. There is no single nucleotide change by which a G -> K AA change might occur, supporting the second possibility. If the second explanation is correct it would suggest that S:484 mutations are selected for generally. The large difference between the frequency of S:E484K in this dataset compared to the national COG-UK dataset further suggests that the selection pressures which caused S:E484K to be so frequent within this analysis are not true of the majority of hosts (Table 1). S:E484K is also considered a defining mutation for a large number of variants, further indicating a strong selection pressure for the mutation (Table 2).
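The codon argument above (that glycine cannot become lysine through a single nucleotide change) can be checked directly against the standard genetic code. The sketch below is an illustration added for the reader and is not part of the authors' pipeline; the helper name `min_nucleotide_changes` is hypothetical, and the check considers every codon for each amino acid rather than the specific codon present at S:484 in any patient.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def min_nucleotide_changes(aa_from, aa_to):
    """Smallest number of nucleotide substitutions separating any codon for
    aa_from from any codon for aa_to."""
    return min(
        sum(a != b for a, b in zip(c1, c2))
        for c1, a1 in CODON_TABLE.items() if a1 == aa_from
        for c2, a2 in CODON_TABLE.items() if a2 == aa_to
    )

for change in ("EK", "EG", "EQ", "EA", "GK"):
    print(f"{change[0]} -> {change[1]}:", min_nucleotide_changes(*change))
```

Each substitution from glutamate seen at S:484 (K, G, Q, A) is reachable with one nucleotide change, whereas G -> K needs at least two, which is consistent with the authors' preference for the competing-subpopulation explanation.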
Despite its presence within a large number of variants, it is only present within a small proportion of the COG-UK dataset, suggesting that on a population level it may have a deleterious effect on transmission; however, this may be explained by other factors, such as variants with S:E484K not being common in the UK generally. A strong selective pressure for S:E484K was also observed by Zahradník et al. (2021), who discovered, using an in vitro experimental evolution model, that >70% of clones in one library gained S:E484K and S:N501Y, which were associated with a significant increase in ACE2 affinity. Furthermore, they observed the occurrence of the mutation S:Q498R alongside S:N501Y in two repeats; this combination was observed to lead to significantly greater affinity to ACE2 compared to both wild-type and Alpha, which rose further alongside S:E484K. This combination was only observed within a single patient (pt 19), although the combination E484G, Q498R and N501Y did arise in a further patient (pt 17); in both cases the infections were Alpha and therefore already possessed S:N501Y. At the time Zahradník et al. (2021) was published this constellation of mutations had not been observed in wild virus, but with the emergence of Omicron this combination has become significantly more frequent (albeit with E484A rather than E484K). The low occurrence frequency of S:N501Y compared to that observed by Zahradník et al. (2021) is also notable but is partly explained by its high day 0 frequency (nine out of twenty-eight) in the genome series, due to the large number of long-term Alpha infections included in this study. However, when DNMs were clustered by AA loci, S:501 was highlighted as recurrent. Another notable observation is the two de novo occurrences of S:L452R (a defining mutation of the Delta, Kappa and Epsilon variants), which aids both immune evasion and ACE2 affinity (Motozono et al., 2021).
S:Q493K has previously been identified by Huang et al. (2021) as a highly beneficial adaptation to a mouse host, improving spike binding affinity to murine ACE2; its rarity in the overall SARS-CoV-2 population (58 in the COG-UK dataset) suggests that it is not strongly selected for in human hosts generally. The three occurrences in this dataset may suggest that S:Q493K does confer a benefit to the virus within the context of a long-term infection but not in transient infection. A highly similar mutation, S:Q493R, is a defining mutation of the Omicron variant.
S:F486I has been observed to decrease the affinity of some neutralising antibodies to spike protein (Xu et al., 2021) and may decrease the affinity of spike to ACE2 (Clark et al., 2021); S:F486I has furthermore been associated with mink adaptation (Zhou et al., 2021). S:F490L has been observed to reduce the affinity of multiple mAbs as well as decreasing the neutralisation sensitivity of pseudovirus to convalescent sera; however, it does not appear to have an impact on viral infectivity (Li et al., 2020). It is noteworthy that a large number of the mutations described in this study are associated with enhanced human ACE2 affinity, including Q493K, Q498R and N501Y (Starr et al., 2020). When AA loci clustering was performed, recurrent DNMs at S:330 and S:440 were observed. Finally, although most of this study has considered mutations in isolation, several of the late-stage long-term infections showed interesting combinations of mutations, particularly within Spike (Figure 5).
Patient 19, for example, was an Alpha infection that had picked up a large number of mutations, many of which were in common with, or similar to, Omicron, for example A67D, G142V, T95I, Δ210/L212I, E484K, and Q498R. A further case, patient 17, also contained E484G and Q498R alongside the Alpha lineage-defining mutation N501Y, and pt 27 contained T95I, a further deletion in the S:Δ138 region, and G496S, in common with Omicron. T95I has been linked to binding of the human tyrosine-protein kinase receptor UFO (AXL), and it has been suggested by Singh et al. (2021) that AXL facilitates SARS-CoV-2 cell entry to the same extent as ACE2 in AXL-overexpressing cell culture. The NTD also has a substantial role in the antigenicity of spike, with multiple escape mutations identified in this domain (Harvey et al., 2021).
has not been commonly observed, but it is notable that the genome series in which S:Δ67 was observed already possessed S:Δ69 at day 0. S:Δ69-70 is also a defining mutation of the Alpha and Omicron variants and is responsible for the S-gene target failure observed in the PCR testing of Alpha variant samples with TaqPath SARS-CoV-2 PCR kits (Kidd et al., 2021). De novo occurrences of slightly differing deletions within the S:Δ138/RDR2 region were observed four times. This region makes up part of the "NTD antigenic supersite" which the majority of neutralising antibodies against the NTD target (McCallum et al., 2021b). S:Δ140 has consequently been associated with a significant decrease in Ab neutralisation (Andreano et al., 2021; Liu et al., 2021).
Based upon the high number of occurrences, it appears likely that deletions in this region confer some benefit to the virus during long-term infections. As with S:N501Y and S:Δ67, it is worth noting that a substantial proportion of long-term infections already carried deletions in the S:Δ138 region at day 0 due to being the Alpha variant. Two occurrences of S:Δ243, another NTD supersite deletion which has been demonstrated to decrease Ab neutralisation in vitro, were also observed (McCallum et al., 2021b; McCarthy et al., 2021). The single recurrent signal peptide DNM, S:S13I, has previously been shown to mediate a shift of the cleavage site of the signal peptide, which in turn facilitates immune evasion by causing a significant rearrangement of the NTD antigenic supersite and its constituent internal disulphide bonding (McCallum et al., 2021a, 2021b).
The most frequent de novo mutation observed outside of the spike gene is Envelope:T30I (the second most frequent mutation overall after S:E484X). This mutation was observed by Chaudhry et al. (2020) in a cell-culture passage experiment, where it conferred a growth advantage in Calu-3 cells but slowed growth in Vero E6 cells. The high frequency of E:T30I is strongly suggestive of a selective pressure during long-term infections and further suggests that the conditions experienced by the virus in immunocompromised patients may constitute a selective environment similar to cell culture, potentially because the stability needed for transmission is not required. The significant enrichment of E:T30I in this analysis compared to the COG-UK dataset ( (Fillâtre et al., 2021).
This also raises the hypothetical possibility that E:T30I may be considered a marker of long-term SARS-CoV-2 infections. Further study is necessary to determine the phenotypic effect of this mutation and its role in influencing within-host and between-host fitness.
Literature concerning mutations in ORF1ab is generally observational rather than experimental due to the current lack of tractable models to study them in vitro. The concentration of higher-frequency mutations within the NSP3 gene is not surprising, considering that it is the largest gene within the ORF1ab polyprotein and is known to encode a bulky, modular protein which may have some flexible linker regions that are fairly hypermutable. Stanevich et al. identified NSP3:T504P as a mutation associated with cytotoxic T cell epitope immune escape (Stanevich et al., 2021).
This work sought to determine recurrent mutations across the SARS-CoV-2 genome associated with long-term infections in immunodeficient patients. This study has several notable limitations; importantly, a significant publication bias is likely to be present, which may overemphasise the importance of some mutations. S:E484K especially is affected by this: the six genome series obtained from were published to demonstrate the emergence of S:E484K within immunocompromised patients. Further work will attempt to avoid this by using less biased sampling strategies, which requires a prospective study design that aims to regularly sample genomes from long-term infected patients.
The majority of recurrently observed DNMs have been associated with immune escape, increased ACE2 affinity, or improved viral packaging, and are generally not highly prevalent within the wider SARS-CoV-2 population (with the exception of some SARS-CoV-2 variants). These factors suggest that the conditions during long-term infections at least partly select for mutations which aid the virus with intra-host replication and persistence, as opposed to the general SARS-CoV-2 population, where mutations which aid inter-host transmission are more strongly selected for. E:T30I in particular is worthy of further study as a potential marker of long-term SARS-CoV-2 infections. However, the large number of occurrences overlapping with variant-defining mutations does indicate that patients within this category should not be discounted as a potential source of previous, or indeed future, variants. Mutations which aid cell-cell transmission within the host or improve viral packaging may affect virulence, and any mutations within this category which do not impact viral transmissibility could have a significant impact. This is highly relevant, as many of the most abundant mutations described in this dataset are found across many variant lineages. Furthermore, it is possible that sub-neutralising levels of antibodies, which may be present in some cases (either homologous or from heterologous convalescent or monoclonal antibody treatments), could be selecting for the acquisition of the antigenic mutations seen in several of these cases.
At present it is unresolved where SARS-CoV-2 variants emerge from. One prevailing hypothesis is that some variants emerged from long-term chronic infections, generating novel advantageous combinations of mutations without the stringent selection pressure of transmission, eventually resulting in an outbreak and onward transmission.
We have compared common mutations arising during chronic infections and described how many are shared with SARS-CoV-2 variant lineages. Furthermore, we present evidence, based on a rare mutational signature, that the French B.1.616 variant lineage arose from a direct and recent spillover from a chronic infection. Overall, the data presented here are consistent with and supportive of the chronic infection hypothesis of SARS-CoV-2 variant emergence. Therefore we suggest that identifying and curing chronic infections, preferably with combined antiviral therapy as would be used for more traditionally chronic viruses (HIV, HCV), is important both for the infected individual and for global health. Intrahost variation of SARS-CoV-2 is likely to play a significant role within this patient group; however, the lack of raw data availability for the majority of the samples within this dataset makes it challenging to study (Chaudhry et al., 2020). We anticipate that this dataset will be maintained as a public resource for as long as it is deemed relevant, to enable other researchers to study long-term SARS-CoV-2 infections in immunodeficient patients and to contribute to the understanding of this understudied, highly important patient group (https://github.com/BioWilko/recurrent-sars-cov-2-mutations/blob/main/dataset/mutation_calls.csv).
SARS-CoV-2 escape from a highly neutralizing COVID-19 convalescent plasma
Case Study: Prolonged Infectious SARS-CoV-2 Shedding from an Asymptomatic Immunocompromised Individual with
Prolonged Severe Acute Respiratory Syndrome Coronavirus 2 Replication in an Immunocompromised Patient
Long-Term Evolution of SARS-CoV-2 in an Immunocompromised Patient with Non-Hodgkin Lymphoma. mSphere 6, e0024421
SARS-CoV-2 Quasispecies Mediate Rapid Virus Evolution and Adaptation
Persistence and Evolution of SARS-CoV-2 in an Immunocompromised Host
Longitudinal study of a SARS-CoV-2 infection in an immunocompromised patient with X-linked agammaglobulinemia
SARS-CoV-2 evolution in an immunocompromised host reveals shared neutralization escape mechanisms
An integrated national scale SARS-CoV-2 genomic surveillance network
A new SARS-CoV-2 variant with high lethality poorly detected by RT-PCR on nasopharyngeal samples: an observational study
SARS-CoV-2 variants, spike mutations and immune escape
Q493K and Q498H substitutions in Spike promote adaptation of SARS-CoV-2 in mice
Emergence of the E484K mutation in SARS-COV-2-infected immunocompromised patients treated with bamlanivimab in Germany
Persistent SARS-CoV-2 infection and intra-host evolution in association with advanced HIV infection
SARS-CoV-2 evolution during treatment of chronic infection
Emergence of multiple SARS-CoV-2 mutations in an immunocompromised host
S-Variant SARS-CoV-2 Lineage B1.1.7 Is Associated With Significantly Higher Viral Load in Samples Tested by TaqPath Polymerase Chain Reaction
The Impact of Mutations in SARS-CoV-2 Spike on Viral Infectivity and Antigenicity
A combination of cross-neutralizing antibodies synergizes to prevent SARS-CoV-2 and SARS-CoV pseudovirus infection
MUMmer4: A fast and versatile genome alignment system
SARS-CoV-2 immune evasion by variant B.1.427/B.1.429
N-terminal domain antigenic mapping reveals a site of vulnerability for SARS-CoV-2
Recurrent deletions in the SARS-CoV-2 spike glycoprotein drive antibody escape
Recurrent emergence of SARS-CoV-2 spike deletion H69/V70 and its role in the Alpha variant B.1.1.7. Cell Rep
Coronapp: A web application to annotate and monitor SARS-CoV-2 mutations
Persistent SARS-CoV-2 infection: the urgent need for access to treatment and trials
SARS-CoV-2 spike L452R variant evades cellular immunity and increases infectivity
Africa: tackle HIV and COVID-19 together
CLIMB-COVID: continuous integration supporting decentralised sequencing for SARS-CoV-2 genomic surveillance
Genomics-informed outbreak investigations of SARS-CoV-2 using civet
Assignment of epidemiological lineages in an emerging pandemic using the pangolin tool
SARS-CoV-2 one year on: evidence for ongoing viral adaptation
Preliminary genomic characterisation of an emergent SARS-CoV-2 lineage in the UK defined by a novel set of spike mutations
Severe clinical relapse in an immunocompromised host with persistent SARS-CoV-2 infection
Generation of novel SARS-CoV-2 variants on B.1.1.7 lineage in three patients with advanced HIV disease (preprint)
N-terminal domain of SARS CoV-2 spike protein mutation associated reduction in effectivity of neutralizing antibody with vaccinated individuals
SARS-CoV-2 escape from cytotoxic T cells during long-term COVID-19 (preprint)
Deep Mutational Scanning of SARS-CoV-2 Receptor Binding Domain Reveals Constraints on Folding and ACE2 Binding
Long-Term Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) Infectiousness Among Three Immunocompromised Patients: From Prolonged Viral Shedding to SARS-CoV-2 Superinfection
Within-host evolution of SARS-CoV-2 in an immunosuppressed
COVID-19 patient as a source of immune escape variants ggplot2: Elegant Graphics for Data Analysis A new coronavirus associated with human respiratory disease in China Domains and Functions of Spike Protein in SARS-Cov-2 in the Context of Vaccine Design Structure-based analyses of neutralization antibodies interacting with naturally occurring SARS-CoV-2 RBD variants Brigham and Women's Hospital Ragon Institute of Severe clinical relapse in an immunocompromised host with persistent SARS-CoV-2 infection Long-Term Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) Infectiousness Among Three Immunocompromised Patients: From Prolonged Viral Shedding to SARS-CoV-2 Superinfection SARS-CoV-2 evolution during treatment of chronic infection Prolonged Severe Acute Respiratory Syndrome Coronavirus 2 Replication in an Immunocompromised Patient Long-Term Evolution of SARS-CoV-2 in an Immunocompromised Patient with Non-Hodgkin Lymphoma. mSphere 6, e0024421 • Contact: j.paulo.gomes@insa.min-saude Longitudinal study of a SARS-CoV-2 infection in an immunocompromised patient with X-linked agammaglobulinemia Emergence of the E484K mutation in SARS-COV-2-infected immunocompromised patients treated with bamlanivimab in Germany Within-host evolution of SARS-CoV-2 in an immunosuppressed COVID-19 patient as a source of immune escape variants Metadata curation, and Samples and logistics Metadata curation, and Software and analysis tools Project administration, and Samples and logistics Adhyana IK Mahanama 97 , Buddhini Samaraweera 97 , Sophia T Girgis Project administration, and Sequencing and analysis Project administration, and Software and analysis tools: Radoslaw Poplawski 43 Samples and logistics, and Software and analysis tools: Igor Starinskij 53 Sreenu Vattipally 53 Software and analysis tools, and Visualisation Leadership and supervision Michelle L Michelsen 105 , Christine M Sambles 105