key: cord-0951459-fh2qg339
authors: de Silva, Thushan I.; Liu, Guihai; Lindsey, Benjamin B.; Dong, Danning; Moore, Shona C.; Hsu, Nienyun Sharon; Shah, Dhruv; Wellington, Dannielle; Mentzer, Alexander J.; Angyal, Adrienn; Brown, Rebecca; Parker, Matthew D.; Ying, Zixi; Yao, Xuan; Turtle, Lance; Dunachie, Susanna; Maini, Mala K.; Ogg, Graham; Knight, Julian C.; Peng, Yanchun; Rowland-Jones, Sarah L.; Dong, Tao
title: The impact of viral mutations on recognition by SARS-CoV-2 specific T-cells
date: 2021-10-28
journal: iScience
DOI: 10.1016/j.isci.2021.103353
sha: 15476633dae6d0b95312b8594f30485e118fffe8
doc_id: 951459
cord_uid: fh2qg339

We identify amino acid variants within dominant SARS-CoV-2 T-cell epitopes by interrogating global sequence data. Several variants within nucleocapsid and ORF3a epitopes have arisen independently in multiple lineages and result in loss of recognition by epitope-specific T-cells assessed by IFN-γ and cytotoxic killing assays. Complete loss of T-cell responsiveness was seen due to Q213K in the A*01:01-restricted CD8+ ORF3a epitope FTSDYYQLY207-215, due to P13L, P13S and P13T in the B*27:05-restricted CD8+ nucleocapsid epitope QRNAPRITF9-17, and due to T362I and P365S in the A*03:01/A*11:01-restricted CD8+ nucleocapsid epitope KTFPPTEPK361-369. CD8+ T-cell lines unable to recognise variant epitopes have diverse T-cell receptor repertoires. These data demonstrate the potential for T-cell evasion and highlight the need for ongoing surveillance for variants capable of escaping T-cell as well as humoral immunity.

Introduction

Evolution of SARS-CoV-2 can lead to evasion from adaptive immunity generated following infection and vaccination. Much focus has been on humoral immunity and spike protein mutations that impair the effectiveness of neutralizing monoclonal antibodies and polyclonal sera. T-cells specific to conserved proteins play a significant protective role in respiratory viral infections such as influenza, particularly in broad heterosubtypic immunity (Hayward et al., 2015). T-cell responses following SARS-CoV-2 infection are directed against targets across the genome and may play a role in favourable outcomes during acute infection and in immunosuppressed hosts with deficient B-cell immunity (Huang et al., 2021; Peng et al., 2020; Tan et al., 2021). While CD8+ T-cells may not provide sterilising immunity, they can protect against severe disease and limit risk of transmission, with a potentially more important role in the setting of antibody escape.

Little is known about the potential for SARS-CoV-2 mutations to impact T-cell recognition. Escape from antigen-specific CD8+ T-cells has been studied extensively in HIV-1 infection, where rapid intra-host evolution renders T-cell responses ineffective within weeks of acute infection (Goonetilleke et al., 2009).
While these escape variants play an important role in the dynamics of chronic viral infections, the opportunities for T-cell escape in acute respiratory viral infections are fewer and the consequences are different. Nevertheless, several cytotoxic T-lymphocyte (CTL) escape variants have been described in influenza, such as the R384G substitution in the HLA B*08:01-restricted nucleoprotein380-388 and B*27:05-restricted nucleoprotein383-391 epitopes (Voeten et al., 2000). Long-term adaptation of influenza A/H3N2 has been demonstrated, with the loss of one CTL epitope every three years since its emergence in 1968 (Woolthuis et al., 2016).

Amino acid variants within experimentally proven SARS-CoV-2 T-cell epitopes

To explore the potential for viral evasion from SARS-CoV-2-specific T-cell responses, we conducted a proof-of-concept study, focusing initially on identifying common amino acid mutations within experimentally proven T-cell epitopes and testing the functional implications in selected immunodominant epitopes that we and others have described previously. We conducted a literature review in PubMed and Scopus databases (29th of November 2020; Supplementary Information Data S4) that identified 14 publications defining 360 experimentally proven CD4+ and CD8+ T-cell epitopes (Chour et al., 2020; Ferretti et al., 2020; Gangaev et al., 2020; Habel et al., 2020; Kared et al., 2021; Keller et al., 2020; Le Bert et al., 2020; Nelde et al., 2021; Peng et al., 2020; Poran et al., 2020; Schulien et al., 2021; Sekine et al., 2020; Shomuradova et al., 2020; Snyder et al., 2020). Of these, the 53 epitopes described in more than one publication were all CD8+ epitopes (Table S1) and were distributed across the genome (n=14 Open Reading Frame (ORF)1a, n=5 ORF1b, n=18 S, n=2 M, n=8 N, n=5 ORF3a, n=1 ORF7a).
In total 12503 amino acid substitutions or deletions were identified within the 360 T-cell epitopes by searching the mutation datasets downloaded from CoV-GLUE (http://cov-glue.cvr.gla.ac.uk/#/home) on the 30th July 2021 (Figure S1, Table S2). A total of 1370 amino acid variants were present within the 53 CD8+ T-cell epitopes with responses described across multiple cohorts, with at least one variant in all epitopes (Figure S2, Table S3).

We focused on evaluating the functional impact of variants within seven immunodominant epitopes in nucleocapsid, ORF3a and spike (five CD8+, two CD4+) described in our study of UK convalescent donors (Peng et al., 2020), along with a further immunodominant ORF1a CD8+ epitope described in several other studies (Table 1). Of these, all six CD8+ epitopes have been described in at least one other cohort. In particular, responses to the A*03:01/A*11:01-restricted nucleocapsid KTFPPTEPK361-369 (Ferretti et al., 2020; Gangaev et al., 2020; Kared et al., 2021; Peng et al., 2020) epitope, A*01:01-restricted ORF3a FTSDYYQLY207-215 (Ferretti et al., 2020; Kared et al., 2021; Peng et al., 2020; Schulien et al., 2021) epitope and A*01:01-restricted ORF1a TTDPSFLGRY1637-1646 (Ferretti et al., 2020; Gangaev et al., 2020; Nelde et al., 2021) epitope are consistently dominant and of high magnitude. We tested the functional avidity of SARS-CoV-2 specific CD4+ and CD8+ polyclonal T-cell lines by interferon (IFN)-γ ELISpots using wild-type and variant peptide titrations (Figure 1A-F).
We found that several variants resulted in complete loss of responsiveness in the T-cell lines evaluated: the Q213K variant in the A*01:01-restricted CD8+ ORF3a epitope FTSDYYQLY207-215 (Ferretti et al., 2020; Kared et al., 2021; Peng et al., 2020; Schulien et al., 2021), the P13L, P13S and P13T variants in the B*27:05-restricted CD8+ nucleocapsid epitope QRNAPRITF9-17 (Nelde et al., 2021; Peng et al., 2020), and the T362I and P365S variants in the A*03:01/A*11:01-restricted CD8+ nucleocapsid epitope KTFPPTEPK361-369 (Ferretti et al., 2020; Gangaev et al., 2020; Kared et al., 2021; Peng et al., 2020) (Figure 1A-C).

In contrast, Q9H in QRNAPRITF9-17, T366I in KTFPPTEPK361-369, P384L in the A*03:01-restricted CD8+ spike epitope KCYGVSPTK378-386 (Ferretti et al., 2020; Peng et al., 2020) and M177I in the CD4+ spike epitope CTFEYVSQPFLMDLE166-180 (Peng et al., 2020) showed no impact on T-cell recognition (Figures 1B, C, F, S3). In fact, T366I in KTFPPTEPK361-369 appeared to result in higher avidity (Figure 1C). Several other variants showed partial loss of T-cell responsiveness, with lower avidity observed to the variant peptide compared to wild-type peptide. These included T325I in the B*40:01-restricted nucleocapsid epitope MEVTPSGTWL322-331 (Nelde et al., 2021; Peng et al., 2020; Schulien et al., 2021), R765L in the DRB1*15:01-restricted CD4+ spike epitope NLLLQYGSFCTQLNR751-765 (Peng et al., 2020), and L176F in the CD4+ spike epitope CTFEYVSQPFLMDLE166-180 (Peng et al., 2020) (Figure 1D S4) and a killing assay (Figures 1N and S4).

T-cell escape can occur by disrupting any of several mechanisms: antigen processing, binding of MHC to peptide, or T-cell receptor (TCR) recognition of the MHC-peptide complex.
While we did not explicitly establish which of these was responsible in each case, it is likely that any partial impairment of T-cell recognition is due to reduced TCR binding to MHC-peptide. Reasons for complete escape are more difficult to predict. As the anchor residues of peptide-MHC binding in A*03:01/A*11:01-restricted KTFPPTEPK361-369 are at positions 2 and 9, T362I (position 2) may impair peptide-MHC binding, while P365S (position 5) may affect a T-cell binding residue (Rammensee et al., 1999). The proline changes (P13L, P13S, P13T) in the B*27:05-restricted QRNAPRITF9-17 (position 5) again may be at a key T-cell contact residue, as peptide-MHC binding anchor residues are at positions 2 and 9 (Rammensee et al., 1999). The anchor residues for the A*01:01-restricted FTSDYYQLY207-215 are predicted to be at positions 3 and 9, with auxiliary anchors at positions 2 and 7 (Rammensee et al., 1999), which may explain the impact of the Q213K (position 7) variant. In keeping with this, we see no significant impact of these mutations on the predicted binding affinities of epitope to MHC (Table S4). Despite a modest 4-fold decrease in predicted IC50 for Q213K compared to wild-type, FTSDYYKLY207-215 is still a strong binder to A*01:01. Figure S6). It is worth noting that our data are biased by using T-cell lines generated from donors recruited early in the pandemic and therefore likely infected with 'wild-type' viruses (i.e. lineage B or B.1 viruses) (Peng et al., 2020). While variants that impair antigen processing or MHC-peptide binding result in irreversible loss of T-cell recognition, CTLs with new TCR repertoires can overcome TCR-mediated escape variants, as has been described in HIV-1 infection (Ladell et al., 2013).
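The position arithmetic used in the discussion of anchor residues can be illustrated with a minimal Python sketch. It is not the authors' code (their analysis used custom R scripts); the names `Epitope`, `ANCHORS`, `peptide_position` and `classify_position` are hypothetical, and the anchor positions are simply those quoted in the text from Rammensee et al., 1999.

```python
from dataclasses import dataclass

# Anchor positions as quoted in the text (Rammensee et al., 1999).
# The dictionary layout is an assumption made for this illustration.
ANCHORS = {
    "A*03:01/A*11:01": {"primary": {2, 9}, "auxiliary": set()},
    "B*27:05": {"primary": {2, 9}, "auxiliary": set()},
    "A*01:01": {"primary": {3, 9}, "auxiliary": {2, 7}},
}


@dataclass(frozen=True)
class Epitope:
    sequence: str      # wild-type peptide sequence
    start: int         # protein coordinate of the first residue
    restriction: str   # HLA restriction, used as a key into ANCHORS


def peptide_position(epitope: Epitope, protein_position: int) -> int:
    """Convert a protein coordinate to a 1-based position within the peptide."""
    position = protein_position - epitope.start + 1
    if not 1 <= position <= len(epitope.sequence):
        raise ValueError(f"{protein_position} lies outside the epitope")
    return position


def classify_position(epitope: Epitope, protein_position: int) -> str:
    """Label a mutated residue as a primary anchor, auxiliary anchor or non-anchor."""
    position = peptide_position(epitope, protein_position)
    anchors = ANCHORS[epitope.restriction]
    if position in anchors["primary"]:
        return "primary anchor"
    if position in anchors["auxiliary"]:
        return "auxiliary anchor"
    return "non-anchor"


if __name__ == "__main__":
    ktf = Epitope("KTFPPTEPK", 361, "A*03:01/A*11:01")
    for site in (362, 365):
        print(site, peptide_position(ktf, site), classify_position(ktf, site))
```

A non-anchor label only says that the residue is not a listed MHC anchor; whether it is a TCR contact residue would still need to be shown experimentally, as the text notes.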
Many variants examined in our study were at relatively low frequency and stable prevalence at the time of writing, other than P365S in KTFPPTEPK361-369, P1640L in TTDPSFLGRY1637-1646 and variants affecting the proline at position 13 in QRNAPRITF9-17 (Table 1 and Figure 3A). We explored whether variants that result in loss of T-cell recognition appeared as homoplasies in the phylogeny of SARS-CoV-2 suggestive of repeated independent selection, or whether global frequency is due mainly to the expansion of lineages after initial acquisition. While in some cases, pressure for reasons other than T-cell immunity. A recent study has documented intra-host evolution of minority variants within A*02:01 and B*40:01 CD8+ epitopes that impair T-cell recognition, though not all epitopes are dominant and very few of the variants studied were represented amongst the global circulating viruses (Agerer et al., 2021).

Conclusions

There is unlikely to be adequate population immunity at present to see global changes due to T-cell selection akin to what has been seen in adaptation of H3N2 influenza over time (Woolthuis et al., 2016). Furthermore, polymorphism in HLA genes restricts the selective advantage of escape within one particular epitope to a relatively small proportion of the population, given the breadth in T-cell responses we and others have shown. The polyclonal T-cell response in a given individual is therefore unlikely to be diminished significantly by mutations present in any one circulating variant, unlike the potential impact on neutralising antibody responses seen with mutations in the spike protein. Nevertheless, responses to many of the CTL epitopes we have studied are dominant within HLA-matched individuals across many cohorts (Peng et al., 2020). A epitope (nucleoprotein383-391) has also been observed (Voeten et al., 2000).
A significant increase in sites under diversifying positive selective pressure was observed around acquired population immunity increases further, the frequency of variants we have described should be monitored globally, as well as further changes arising within all immunodominant T-cell epitopes. We have recently incorporated the ability to identify spike T-cell epitope variants in real-time sequence data into the COG-UK mutation explorer dashboard (http://sars2.cvr.gla.ac.uk/cog-uk/). Non-spike T-cell immune responses will also become increasingly important to vaccine-induced immunity as inactivated whole virus vaccines are rolled out. Our findings demonstrate the potential for T-cell evasion and highlight the need for ongoing surveillance for variants capable of escaping T-cell as well as humoral immunity.

We have chosen to focus on key SARS-CoV-2 immunodominant epitopes characterised early in the pandemic and further epitopes have been identified since. It would be important to assess mutations of increasing prevalence within all immunodominant epitopes in the future to provide a comprehensive overview of potential SARS-CoV-2 T-cell escape. While our findings suggest that reduced T-cell receptor binding to MHC-epitope complex is likely responsible for the most striking impact of mutations on T-cell responses we describe, this needs to be demonstrated experimentally. Finally, further studies are required to demonstrate the occurrence of T-cell escape within individuals and establish how frequently this occurs. Given the potential for immune escape in prolonged or chronic SARS-CoV-2 infections that could give rise to new variants of concern, a focus on infections in immunocompromised individuals would be important.

Frame, HLA=Human Leukocyte Antigen.
a responses to longer peptide also seen in Snyder et al., 2020; responses to longer peptide also seen in Snyder et al., 2020 and Kared et al., 2021

Code and data used for identifying mutations within T cell epitopes are provided in Supplementary Information Data S1 Mutation identification, related to all figures. The analysis folder contains R code used for data manipulation and two sub-folders: well plate, then incubated 4-5 hours at 37°C in 5% CO2. Following incubation 1 mL of RPMI (GIBCO) with 20% (v/v) FBS, 100 units/mL penicillin, 0.1 mg/mL streptomycin was added to each well and ciclosporin A (CSA) added to a final concentration of 100 ng/mL. Cells were fed every 4-6 days and lines expanded when required.

Variants within the 360 experimentally proven T-cell epitopes were identified using mutation datasets downloaded from CoV-GLUE (http://cov-glue.cvr.gla.ac.uk/#/home) on the 30th July 2021. Both amino acid substitutions and deletions were considered in this study. Sequences were excluded if they did not contain a start and stop codon at the beginning and end of each ORF. COG-UK global metadata downloaded on 4th August 2021 was used to plot the variant over time (Figure 2A). Sequence positions mentioned in this study are relative to Wuhan-Hu-1 (GenBank accession MN908947.3) and were compared using custom R scripts (R version 3.5.3).

Polyclonal CD4+ and CD8+ T-cell lines specific for seven previously described immunodominant epitopes (Peng et al., 2020) were generated after MHC class I Pentamer or MHC class II tetramer sorting from short-term cultures of SARS-CoV-2 recovered donor PBMCs. Antigen-specific T-cells were confirmed by corresponding Pentamer or tetramer staining.
T-cells were stained with Live/Dead dye (Thermo Fisher Scientific, UK), then stained with pentamer or tetramer, followed by CD8-FITC (BD Bioscience, UK) or CD4-FITC (BD Bioscience, UK) staining. The functional avidity of T-cell lines was assessed by IFN-γ ELISpot assays (Peng et al., 2015). T-cell lines were stimulated with wild-type and variant peptide-pulsed autologous B-cells,

The functional avidity of polyclonal CD8+ T-cell lines specific for the ORF1a epitope TTDPSFLGRY1637-1646 (Ferretti et al., 2020; Gangaev et al., 2020; Nelde et al., 2021) was assessed using stimulation with wild-type and variant peptides starting at 1000 nM and serial 1:10 dilutions, followed by intra-cellular cytokine staining (ICS). 1-1.5 x 10^6 cells were plated in R10 in a 96 well U-bottom plate and peptide added. DMSO was used as the negative control at the equivalent concentration to the peptides. Degranulation of T cells (a functional marker of cytotoxicity) was measured by the addition of an anti-CD107a-PE-Cy7 antibody (clone H4A3, BD Biosciences, UK) at 1 in 20 dilution during the culture. The cells were then incubated at 37°C, 5% CO2 for 1 hour before adding Brefeldin A (10 μg/ml). Samples were incubated at 37°C, 5% CO2 for a further 5 hours before proceeding with staining for flow cytometry. Cells were stained with a cell viability dye (near infrared, Thermo Fisher Scientific, UK) at 1:500 then fixed in 2% formaldehyde for 20 minutes, followed by permeabilization with 1x Perm/Wash buffer (BD Biosciences). Staining was performed with the following antibodies: anti-CD3-BV510 (clone UCHT1, BD Biosciences), anti-CD8-BV421 (clone RPA-T8, BD), TNF-PE (clone MAb11, Thermo Fisher Scientific) and anti- Miltenyi Biotec Ltd, UK). Samples were run on a FACSCanto II cytometer and the data were analysed using FlowJo software version 10 (BD Biosciences).
During analysis, exclusion of doublet cells was performed, followed by gating on live peripheral blood mononuclear cells and estimation of the % of CD3+CD8+ T-cells expressing cytokines at each peptide concentration.

Cytotoxic T-lymphocyte (CTL) killing assays

Killing assays were performed in one of two ways. (1) For T-cell lines characterised using IFN-γ ELISpot assays, autologous B-cells were stained with 0.5 μmol/L carboxyfluorescein succinimidyl ester (CFSE, Thermo Fisher Scientific) before wild-type or variant peptide loading at 1 μg/mL for one hour. Peptide-loaded B-cells were co-cultured with CTLs at a range of effector:target (E:T) ratios from 1:4 to 8:1 at 37°C for 6 hours and cells stained with 7-AAD (eBioscience, UK) and CD19-BV421 (clone HIB19, Biolegend, UK). Assessment of cell death in each condition was based on the CFSE/7-AAD population present. (2) For the ORF1a epitope TTDPSFLGRY1637-1646 (Ferretti et al., 2020; Gangaev et al., 2020; Nelde et al., 2021) The tips of sequences with amino acid variants impacting T-cell recognition were colour-coded. Visualisations were produced using R/ape, R/ggplot2, R/ggtree, R/treeio, R/phangorn, R/stringr,

Data S1 - Mutation_identification. R code and input data used for identifying mutations within T-cell epitopes.

Data S2 - Variant_prevalence. R code and input data used for plotting the prevalence of variants within T-cell epitopes over time.
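The kind of lookup that Data S1 is described as performing (finding mutations that fall within T-cell epitopes) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' R code: the names `Epitope`, `parse_substitution`, `apply_substitution` and `variants_in_epitopes` are hypothetical, mutations are assumed to arrive as simple labels such as `Q213K`, and only substitutions are handled, although the study also considered deletions.

```python
import re
from typing import Iterable, List, NamedTuple, Optional, Tuple


class Epitope(NamedTuple):
    protein: str    # e.g. "ORF3a" or "N"
    sequence: str   # wild-type peptide
    start: int      # protein coordinate of the first residue

    @property
    def end(self) -> int:
        return self.start + len(self.sequence) - 1


# Matches labels of the form <reference residue><position><variant residue>.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Tuple[str, int, str]:
    """Split a label such as 'Q213K' into ('Q', 213, 'K')."""
    match = SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    ref, position, alt = match.groups()
    return ref, int(position), alt


def apply_substitution(epitope: Epitope, label: str) -> Optional[str]:
    """Return the variant peptide, or None if the site lies outside the epitope."""
    ref, position, alt = parse_substitution(label)
    if not epitope.start <= position <= epitope.end:
        return None
    index = position - epitope.start
    if epitope.sequence[index] != ref:
        raise ValueError(f"{label}: reference residue does not match the epitope")
    return epitope.sequence[:index] + alt + epitope.sequence[index + 1:]


def variants_in_epitopes(
    epitopes: Iterable[Epitope], mutations: Iterable[Tuple[str, str]]
) -> List[Tuple[str, str, str]]:
    """List (wild-type peptide, label, variant peptide) for every overlapping mutation."""
    epitope_list = list(epitopes)
    hits = []
    for protein, label in mutations:
        for epitope in epitope_list:
            if epitope.protein != protein:
                continue
            variant = apply_substitution(epitope, label)
            if variant is not None:
                hits.append((epitope.sequence, label, variant))
    return hits


if __name__ == "__main__":
    epitopes = [Epitope("ORF3a", "FTSDYYQLY", 207), Epitope("N", "KTFPPTEPK", 361)]
    mutations = [("ORF3a", "Q213K"), ("N", "T362I"), ("N", "D3L")]
    for hit in variants_in_epitopes(epitopes, mutations):
        print(hit)
```

Checking the reference residue against the epitope sequence is a cheap guard against coordinate mismatches, for example when positions are not given relative to Wuhan-Hu-1.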
[Figure residue: TCR β-chain gene usage for Wild-type and Q213K. Gene pairs listed: TRBV2/TRBJ2-2, TRBV6-5/TRBJ2-6, TRBV13/TRBJ1-2, TRBV16/TRBJ2-1, TRBV20-1/TRBJ1-2, TRBV20-1/TRBJ2-1, TRBV5-1/TRBJ1-1, TRBV5-1/TRBJ2-5, TRBV6-4/TRBJ2-3, TRBV6-6/TRBJ2-2, TRBV10-3/TRBJ2-1, TRBV25-1/TRBJ2-5, TRBV7-3/TRBJ2-5]

Krishanthi S Subramaniam

References

SARS-CoV-2 mutations in MHC-I-restricted epitopes evade CD8(+) T cell responses
Shared Antigen-specific CD8+ T cell Responses Against the SARS-COV-2 Spike Protein in HLA-A*02:01 COVID-19 Participants
Unbiased Screens Show CD8(+) T Cells Patients Recognize Shared Epitopes in SARS-CoV-2 that Largely Reside outside the Spike Protein
Profound CD8 T cell responses towards the SARS-CoV-2 ORF1ab in COVID-19 patients
The first T cell response to transmitted/founder virus contributes to the control of acute viremia in HIV-1 infection
Suboptimal SARS-CoV-2-specific CD8(+) T cell response associated with the prominent HLA-A*02:01 phenotype
Natural T Cell-mediated Protection against Seasonal and Pandemic Influenza. Results of the Flu Watch Cohort Study
CD8 T cells compensate for impaired humoral immunity in COVID-19 patients with hematologic cancer
SARS-CoV-2-specific CD8+ T cell responses in convalescent COVID-19 individuals
SARS-CoV-2 specific Are Rapidly Expanded for Therapeutic Use and Target Conserved Regions of Membrane
A molecular basis for the control of preimmune escape variants by HIV-specific CD8+ T cells
SARS-CoV-2-specific T cell immunity in cases of COVID-19 and SARS, and uninfected controls
The emergence and ongoing convergent evolution of the N501Y lineages coincides with a major global shift in the SARS-CoV-2 selective landscape
SARS-CoV-2-derived peptides define heterologous and COVID-19-induced T cell recognition
Broad and strong memory CD4(+) and CD8(+) T cells induced by SARS-CoV-2 in UK convalescent individuals following COVID-19
Boosted Influenza-Specific T Cell Responses after H5N1 Pandemic Live Attenuated Influenza Virus Vaccination
Sequence-based prediction of SARS-CoV-2 vaccine targets using a mass spectrometry-based bioinformatics predictor identifies immunogenic T cell epitopes
SYFPEITHI: database for MHC ligands and peptide motifs
Characterization of pre-existing and induced SARS-CoV-2-specific CD8(+) T cells
Robust T Cell Immunity in Convalescent Individuals with Asymptomatic or Mild COVID-19
Epitopes Are Recognized by a Public and Diverse Repertoire of Human T Cell Receptors. Immunity
Magnitude and Dynamics of the T-Cell Response to SARS-CoV-2 Infection at Both Individual and Population Levels
Early induction of functional SARS-CoV-2-specific T cells associates with rapid viral clearance and mild disease in COVID-19 patients
Antigenic drift in the influenza A virus (H3N2) nucleoprotein and escape from recognition by cytotoxic T lymphocytes
Long-term adaptation of the influenza A virus by escaping cytotoxic T-cell recognition

Predictions of binding strength of peptides to MHC

NetMHCpan 4.1 (http://www.cbs.dtu.dk/services/NetMHCpan/) was used to predict the binding strength of wild type and variant epitopes under standard settings (strong binder % rank < 0.5, weak binder % rank < 2). The predicted affinity (IC50 nM) for variant epitopes was compared with wild type.

T-cell receptor (TCR) sequencing

One million cells from each epitope-specific polyclonal CD8+ T-cell line were harvested and washed three times with Phosphate Buffered Saline. Total RNA was extracted using the RNeasy Plus Mini kit (Qiagen, Germany), and cDNA was then synthesized from 300 ng RNA using the SMARTer RACE cDNA amplification kit (Takara Bio, Japan) following the manufacturer's instructions. Subsequently, cDNA was amplified for variable regions of the TCR-β chain using the PCR Advantage kit (Takara Bio), with the primer 5'-TGCTTCTGATGGCTCAAACACAGCGACCT-3' and run on a 1.2% agarose gel for PCR band confirmation (at 500 bp).
PCR products were purified using the Monarch DNA Gel Extraction kit (New England BioLabs, USA) and then transformed into TOP10 competent cells (ThermoFisher). Plasmid DNA was extracted using the Spin Miniprep kit (Qiagen) followed by Sanger sequencing.

Phylogenetic tree generation

Phylogenies were generated using the grapevine pipeline (https://github.com/COG-UK/grapevine) based on all data available on GISAID and COG-UK up until 8th August 2021. To visualise all sequences with a specific amino acid variant of interest in a global context, a representative sample of global sequences was obtained in two steps. First, one sequence per country per epi week was selected randomly, followed by random sampling of the remaining sequences to generate a sample of 6000 down-sampled sequences. The global tree was then pruned using code adapted from the tree-manip package (https://github.com/josephhughes/tree-manip).
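The two-step down-sampling described above can be sketched in Python as follows. This is an illustration under assumptions, not the authors' pipeline: the function name `downsample` and the record fields `id`, `country` and `epi_week` are hypothetical, and the sketch reads the 6000 figure as the total sample size including the sequences picked in the first step.

```python
import random
from collections import defaultdict
from typing import Dict, List


def downsample(records: List[Dict], target_size: int, seed: int = 0) -> List[Dict]:
    """Two-step down-sampling of sequence metadata records.

    Step 1 picks one random record per (country, epi_week) pair.
    Step 2 adds random records from the remainder up to target_size.
    If step 1 alone already exceeds target_size, its result is returned as is.
    """
    rng = random.Random(seed)  # seeded so that the sample is reproducible

    # Group records by country and epidemiological week.
    groups = defaultdict(list)
    for record in records:
        groups[(record["country"], record["epi_week"])].append(record)

    # Step 1: one randomly chosen sequence per country per epi week.
    selected = [rng.choice(groups[key]) for key in sorted(groups)]
    chosen_ids = {record["id"] for record in selected}

    # Step 2: random sampling of the remaining sequences.
    remaining = [record for record in records if record["id"] not in chosen_ids]
    n_extra = max(0, min(target_size - len(selected), len(remaining)))
    selected.extend(rng.sample(remaining, n_extra))
    return selected


if __name__ == "__main__":
    demo = [
        {"id": f"{country}-{week}-{i}", "country": country, "epi_week": week}
        for country in ("UK", "US", "IN")
        for week in (1, 2)
        for i in range(5)
    ]
    sample = downsample(demo, target_size=10, seed=1)
    print(len(sample), sorted({(r["country"], r["epi_week"]) for r in sample}))
```

Stratifying first by country and week guarantees that sparsely sampled regions and time points remain represented before the remaining slots are filled at random.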