key: cord-0709745-9b03naid authors: Francis, Joshua M.; Leistritz-Edwards, Del; Dunn, Augustine; Tarr, Christina; Lehman, Jesse; Dempsey, Conor; Hamel, Andrew; Rayon, Violeta; Liu, Gang; Wang, Yuntong; Wille, Marcos; Durkin, Melissa; Hadley, Kane; Sheena, Aswathy; Roscoe, Benjamin; Ng, Mark; Rockwell, Graham; Manto, Margaret; Gienger, Elizabeth; Nickerson, Joshua; Moarefi, Amir; Noble, Michael; Malia, Thomas; Bardwell, Philip D.; Gordon, William; Swain, Joanna; Skoberne, Mojca; Sauer, Karsten; Harris, Tim; Goldrath, Ananda W.; Shalek, Alex K.; Coyle, Anthony J.; Benoist, Christophe; Pregibon, Daniel C. title: Allelic variation in Class I HLA determines pre-existing memory responses to SARS-CoV-2 that shape the CD8+ T cell repertoire upon viral exposure date: 2021-04-29 journal: bioRxiv DOI: 10.1101/2021.04.29.441258 sha: 5ff7d7db67ffdee6fcc44b40726604112cf647cc doc_id: 709745 cord_uid: 9b03naid Effective presentation of antigens by HLA class I molecules to CD8+ T cells is required for viral elimination and generation of long-term immunological memory. In this study, we applied a single-cell, multi-omic technology to generate the first unified ex vivo characterization of the CD8+ T cell response to SARS-CoV-2 across 4 major HLA class I alleles. We found that HLA genotype conditions key features of epitope specificity, TCR α/β sequence diversity, and the utilization of pre-existing SARS-CoV-2 reactive memory T cell pools. Single-cell transcriptomics revealed functionally diverse T cell phenotypes of SARS-CoV-2-reactive T cells, associated with both disease stage and epitope specificity. Our results show that HLA variations influence pre-existing immunity to SARS-CoV-2 and shape the immune repertoire upon subsequent viral exposure. One-Sentence Summary We perform a unified, multi-omic characterization of the CD8+ T cell response to SARS-CoV-2, revealing pre-existing immunity conditioned by HLA genotype. 
Elicitation of a robust and durable neutralizing antibody response following immunization of large sections of the population with approved SARS-CoV-2 vaccines is limiting viral transmission and decreasing mortality, providing hope that the global threat from the COVID-19 pandemic is diminishing. However, the appearance of new viral variants warrants continued vigilance. A more complete understanding of the underlying cellular mechanisms that regulate host immunity and guarantee long-term protection is required. Infection with SARS-CoV-2 leads to an upper respiratory tract infection, which can be benign or even asymptomatic. If not controlled by the immune response, it can evolve into a lethal pneumonia with immunopathology due to excessive amplification of the innate inflammatory response, complicated by several extra-respiratory manifestations (1). While humoral responses play an important role in immunological control of infection, the generation of effective cellular immunity and expansion of cytotoxic CD8+ memory T cells is also required to eliminate virally infected cells, as shown from the earlier SARS-CoV-1 epidemic, even in the absence of seroconversion (2-7). Several recent studies have focused on the discovery of relevant SARS-CoV-2 epitopes in both CD4+ and CD8+ T cell responses, leveraging in silico predictions, stimulation/expansion with peptide pools (8-18), and tetramer binding (19, 20). Collectively, these studies identified a number of immunodominant epitopes derived across the viral proteome including structural and non-structural proteins (8-20).
Interestingly, some of these specificities were also detected in uninfected individuals, suggesting potential cross-reactivity from endemic human coronaviruses (HCoV) to which the population is routinely exposed (21), though a direct connection to pre-existing memory cells has not been established. The breadth and nature of the cellular immune response to SARS-CoV-2 infection is driven by diversity in both TCR repertoire and human leukocyte antigen (HLA) genetics. Mammalian cells express up to six different HLA class I alleles that shape antigen presentation in disease, and allelic diversity has been associated with both disease susceptibility and outcome of viral infections (22, 23). There are divergent reports regarding HLA polymorphism and COVID-19 incidence and severity, although the major GWAS studies clearly show no dominant effect of the locus (24-28). Together with genetic influences on HLA-associated antigen presentation, the clonal selection of T cell receptors (TCRs) that compose an individual's repertoire contributes to the nature and dynamics of the antiviral response, including cellular cytotoxicity and memory formation. Interestingly, despite a potential TCR diversity of 10^15 (29), several studies have described "public" T cell responses in COVID-19, where complementarity-determining region (CDR) sequences are conserved within and across individuals (18, 30). The extent to which TCR diversity, especially in the context of epitope specificity restricted to HLA, contributes to response is not well understood. Here, we leverage a unique technology to elucidate, at single-cell resolution, the connection between T cell specificity, HLA variation, conserved features of paired α/β TCR repertoires, and cellular phenotype observed in CD8+ T cell responses to SARS-CoV-2 infection.
We profiled 108,078,030 CD8+ T cells ex vivo across 76 acute, convalescent, or unexposed individuals, and identified T cell specificity to 648 epitopes presented by four HLA alleles across the SARS-CoV-2 proteome, few of which are implicated by the current variants of concern. Epitope-specific TCR repertoires were surprisingly public in nature, though we found a high degree of pre-existing immunity associated with a clonally diverse response to HLA-B*07:02, which can efficiently present homologous epitopes from SARS-CoV-2 and HCoVs. Transcriptomic analysis and functional validation confirmed a central memory phenotype and TCR cross-reactivity in unexposed individuals with HLA-B*07:02. Our data suggest a strong association between HLA genotype and the CD8+ T cell response to SARS-CoV-2, which may have important implications for understanding herd immunity and elements of vaccine design that are likely to confer long-term immunity to protect against SARS-CoV-2 variants and related viral pathogens.

We leveraged single-cell RNA-sequencing with DNA-encoded peptide-HLA tetramers to characterize CD8+ T cell responses to SARS-CoV-2 across multiple Class I alleles in subjects with varying degrees of disease severity. The technology illustrated in Fig. 1a simultaneously determines the specificity of paired α/β TCR sequences for HLA-restricted epitopes and provides transcriptomic phenotype at single-cell resolution. We designed peptide-HLA tetramer libraries to ensure comprehensive coverage of SARS-CoV-2 and related betacoronaviruses across four class I HLA alleles prevalent in North America (A*02:01, B*07:02, A*01:01, and A*24:02, hereafter A*02, B*07, A*01, and A*24).
Library inclusion was determined computationally using predicted HLA binding (NetMHC-4.0 (31)) of candidate peptides from a set of all possible 9-mers from the SARS-CoV-2 proteome (40% from structural, 60% from nonstructural proteins), potentially immunogenic neopeptides from known SARS-CoV-2 variants, and immunogenic epitopes from SARS-CoV-1. A total of 1,355 SARS-CoV-2 related epitopes were included in the libraries in addition to well-characterized epitopes from common endemic viruses (CMV, EBV, and influenza).

The peptide-HLA tetramer libraries were used to interrogate PBMCs from individuals who had been infected with SARS-CoV-2 (N = 28 convalescent, N = 27 with acute disease that required hospitalization), or who were unexposed (N = 23), as summarized in Table S1. For each sample, CD8+ cells were isolated from PBMCs (Methods), incubated with HLA-matched tetramer libraries, and sorted by flow cytometry to enrich viable, tetramer-positive cells. Sorted single cells were encapsulated with DNA-encoded hydrogel beads to provide cell-specific barcodes and unique molecular identifiers (UMIs) that could be used to unify reads across independent sequencing libraries for TCR, peptide-HLA tetramer, and mRNA (Fig. 1a). We determined the specificity of TCRs using a classification method that identified UMI counts for TCR-peptide-HLA interactions that were outliers when Z-score transformed within and across cells for each sample (Methods). The resulting classifier was evaluated against functional assay data for each allele by receiver operating characteristic (ROC) curve analysis to identify thresholds, which were then used for normalization. The normalized classifier evaluated by ROC analysis provided an area under the curve (AUC) of 0.82 (Fig. S1), and at a threshold of 1, which was applied to the entire data set, yielded a true positive rate of 93% and a false positive rate of 32%.
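As an illustration of the outlier-based classification described above, the following is a minimal sketch, not the authors' implementation (which is described in Methods and not reproduced here). It assumes a small cell-by-tetramer matrix of UMI counts, Z-scores each count within its cell and across cells, and calls an interaction when both Z-scores exceed a threshold. The rule for combining the two Z-scores (taking the minimum), the toy counts, and the "truth" sets are all hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Z-score a list of numbers; a constant list maps to all zeros."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def classify(umi_counts, threshold):
    """Return the set of (cell, tetramer) index pairs called as interactions.

    umi_counts[i][j] is the UMI count of tetramer j on cell i. A pair is
    called when its count is an outlier both within its cell (row-wise)
    and across cells (column-wise). Combining by minimum is an assumption.
    """
    n_cells = len(umi_counts)
    n_tet = len(umi_counts[0])
    within = [zscores(row) for row in umi_counts]
    columns = [[umi_counts[i][j] for i in range(n_cells)] for j in range(n_tet)]
    across = [zscores(col) for col in columns]
    calls = set()
    for i in range(n_cells):
        for j in range(n_tet):
            if min(within[i][j], across[j][i]) > threshold:
                calls.add((i, j))
    return calls


def rates(calls, truth_pos, truth_neg):
    """True and false positive rates against labeled reference pairs."""
    tpr = len(calls & truth_pos) / len(truth_pos)
    fpr = len(calls & truth_neg) / len(truth_neg)
    return tpr, fpr


# Hypothetical toy counts: 4 cells x 4 tetramers.
umi = [
    [50, 1, 2, 0],
    [0, 2, 1, 40],
    [1, 1, 1, 1],
    [2, 0, 1, 1],
]
calls = classify(umi, threshold=1.0)
tpr, fpr = rates(calls, truth_pos={(0, 0), (1, 3)}, truth_neg={(2, 2), (3, 0)})
```

Sweeping `threshold` and recording the resulting `(fpr, tpr)` pairs would trace out a ROC curve of the kind summarized in Fig. S1.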
From the 55,956,215 CD8+ cells interrogated from acute and convalescent COVID-19 patients, we identified high-confidence TCR-peptide-HLA interactions across 434 immunogenic SARS-CoV-2-derived epitopes and 1,163 independent α/β TCR clonotypes (Fig. 1b, Table S2). The immunodominant epitopes we discovered ex vivo were consistent with those measured by other means (8-20), but we also identified many epitopes with less dominant representation (yet observed with two or more reactive clonotypes), 188 of which had not been previously reported as minimal epitopes (Table S3). Importantly, CD8+ T cell reactivity to SARS-CoV-2 epitopes was observed across the entire proteome, generally distributed in a manner consistent with protein lengths (Table S4). Of relevance, 85 of these epitopes were derived from the Spike protein currently used in vaccines, but only six of them (a total of 20 CD8+ T cell clonotypes in our study) would be affected by the recent SARS-CoV-2 variants (B.1.1.7, B.1.351, P.1) (Table S5).

Dimensionality-reduced projections of mRNA expression for 224,780 CD8+ T cells revealed the broad phenotypic variance observed within this study spread across 8 clusters (Fig. 1c). We defined the phenotypic features of clusters using gene signatures generally associated with various CD8+ T cell states, including those with naïve, memory, effector, and proliferative status (Fig. 1c). In this space, cells from convalescent patients that recognized different dominant epitopes were commonly associated with divergent phenotypes, as shown for representative epitopes in Fig. 1c. For example, T cells specific for QYIKWPWYI in A*24 (QYI-A24) were clustered in regions with high effector scores while those specific for PTDNYITTY in A*01 (PTD-A01) and LLYDANYFL in A*02 (LLY-A02) resided at opposite ends of memory-rich regions.
Thus, and as will be further detailed below, the different immunoreactive epitopes of SARS-CoV-2 elicit distinct CD8+ T cell phenotypes.

Having established a broad landscape of SARS-CoV-2-reactive CD8+ T cells, we asked how TCR repertoires evolve over the course of infection and recovery. As our approach does not require cell expansion to determine TCR specificity, we were able to directly quantify the frequency of epitope-specific CD8+ T cells in the blood of convalescent, acute, and unexposed individuals. Figure 2a shows the frequency, for each subject, of T cells reactive to the top five epitopes detected across each of the four HLA variants analyzed. Notably, we observed markedly fewer SARS-CoV-2-specific T cells in patients with acute disease compared to those in convalescence (p = 6.0e-7 for A*02, Wilcoxon rank-sum). The dramatic reduction also applied to memory T cells from prior antiviral responses in these patients, including influenza and EBV, but potentially less to the CMV-specific pool in multiple acute subjects (Fig. S2). The paucity of virus-reactive T cells is consistent with the T cell lymphopenia that has been reported to occur in patients with acute COVID-19 (1, 32).

We also observed that the frequencies of SARS-CoV-2-specific T cells in unexposed individuals varied markedly with the HLA allele (Fig. 2a). While several dominant epitopes in HLA-A*02, A*24, and A*01 were associated with high-frequency responses in >40% of convalescent subjects (Fig. 2b), the depth of the overall response was significantly lower in unexposed compared to convalescent individuals (p = 2.3e-5, 2.2e-4, 1.1e-6 by Wilcoxon rank-sum, respectively). In stark contrast, there was no discernible difference in response frequency detected across the most immunodominant epitopes in B*07:02 individuals (p = 0.2).
In fact, CD8+ T cells recognizing nucleocapsid-derived SPRWYFYYL in B*07 (SPR-B07) were found in almost 80% of unexposed subjects with a mean frequency of 4 cells/M cells screened (Fig. 2b), presaging the immunodominance of this epitope in convalescent COVID-19 patients, where reactivity was detected in 100% of the samples.

The broad presence of SARS-CoV-2-specific T cells in unexposed B*07 subjects could originate from fortuitous cross-reactivity of a public specificity, or from priming via previous exposure to a highly related endemic human coronavirus (HCoV). Indeed, SPR-B07 shows marked homology to the corresponding segments of the nucleocapsid proteins from multiple prevalent HCoVs, including HKU1 and OC43, with only a single amino acid residue mismatched at the N-terminus (Fig. 2c). The nature of the homology preserves internal TCR-contact residues as well as the P and L anchors for HLA binding in peptide positions 2 and 9. Accordingly, the HCoV epitope (LPR-B07) is predicted to bind with high affinity to HLA-B*07 and could reasonably be expected to cross-react with SPR-B07-specific TCRs. Broader sequence alignment with HCoVs revealed very little homology to the immunodominant epitopes of A*02 and A*01 but did identify a perfect match to VYIGDPAQL for A*24 (VYI-A24). Surprisingly, T cell specificity to VYI-A24 was not detected in a single unexposed subject. This likely reflects the lower frequency of response elicited by this epitope or an insufficient commitment to memory following exposure to HCoVs. Overall, we found that the response to SARS-CoV-2 is sharply distinguished by HLA genotype, as can be seen clearly in the case of A*02 and B*07, where it appears that highly specific CD8+ responses are either generated de novo or amplified from an abundant pre-existing pool, respectively.
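The homology argument for SPR-B07 can be made concrete with a short position-by-position comparison. The sketch below is illustrative only: the SARS-CoV-2 sequence SPRWYFYYL is given in the text, whereas the HCoV sequence LPRWYFYYL is inferred here from the "LPR-B07" label and the stated single N-terminal mismatch, and should be treated as an assumption. The function names are hypothetical.

```python
def mismatched_positions(a, b):
    """Return the 1-based positions at which two equal-length peptides differ."""
    if len(a) != len(b):
        raise ValueError("peptides must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


def anchor_residues(peptide, positions=(2, 9)):
    """Residues at the HLA anchor positions (1-based); 2 and 9 per the text."""
    return {p: peptide[p - 1] for p in positions}


SPR = "SPRWYFYYL"  # SARS-CoV-2 nucleocapsid epitope, from the text
LPR = "LPRWYFYYL"  # HCoV homolog, inferred from the text (assumption)

diff = mismatched_positions(SPR, LPR)
anchors_match = anchor_residues(SPR) == anchor_residues(LPR)
```

Under these assumptions the two peptides differ only at position 1, and both retain P at position 2 and L at position 9.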
To confirm the specificity and functionality of TCR-peptide-HLA interactions identified in this study, we cloned several of the discovered α/β TCR clonotypes and expressed them in the TCR-null Jurkat J76 cell line (33). Activation of these transductants upon stimulation by SARS-CoV-2 peptides, presented by an HLA-matched lymphoblast cell line, was evaluated by measuring the induction of surface CD69 (Fig. 3a, Methods). Altogether, we validated 28 interactions for epitopes derived from Orf1ab, Spike, Nucleocapsid, Membrane, and Orf3a proteins, spanning high-confidence interactions observed across multiple cells as well as interactions observed exclusively in single cells (Table S6). Dose-response curves for a subset of interactions in A*02 and B*07 are shown in Figure 3b. The EC50s measured for these interactions ranged from 1 to 100 nM, with no particular relationship to epitope immunodominance or clonotype frequency measured ex vivo from the respective subject. These values are consistent with interactions measured for CMV-specific epitopes in A*02 using the same system. We next used these recombinant TCR-expressing cell lines to compare the functional reactivity elicited by homologous epitopes from HCoVs (Fig. 3c). Activation was insignificant for the closest homologs of Orf3a-derived LLY-A02 and Orf1ab-derived ALW-A02, all of which actually originated from HCoV spike proteins. In contrast, HKU1 and OC43 homologs of nucleocapsid-derived SPR-B07 and KPR-B07 epitopes drove substantial T cell activation (Fig. 3c).

We further assessed the sensitivity of B*07 interactions, comparing the reactivity of SPR-B07-specific clonotypes identified from COVID-19 patients or unexposed subjects to SARS-CoV-2-derived SPR-B07 or HCoV-derived LPR-B07 (Fig. 3d). The three TCRs identified from COVID-19 individuals yielded EC50s that were essentially identical for the two epitopes, all falling between 50-100 nM (Fig. 3d, left).
Two of the TCRs from unexposed individuals yielded EC50s in the same range, again comparable for the HCoV and SARS-CoV-2 variants, while a third showed a >10-fold preference for the HCoV epitope (even though it was originally detected as binding to the SARS-CoV-2 peptide). Aside from providing validation that the specificities detected with our barcoded tetramer technology indeed correspond to antigen-reactive T cells, these findings support that the homologies between SARS-CoV-2 and HCoV epitopes are functionally relevant, and that pre-existing cellular reactivity to SARS-CoV-2 in B*07 subjects likely results from previous exposure to HCoVs like HKU1 or OC43.

Given the comprehensive landscape of TCR specificity determined with our approach, we sought to elucidate the extent to which TCR usage is shared within and across subjects. We examined the linkage between paired TCR α/β sequences and their epitope specificity to determine if any features are implicated in the CD8+ T cell response to SARS-CoV-2. We used TCRs from 2,469 SARS-CoV-2-specific T cells to perform network mapping of epitope-specific subsets across several immunodominant epitopes identified (Fig. 4a). Importantly, because it is known that during development, a TCR β-chain can be paired with many different α-chains, the network analysis allowed clonotype linkages by α or β CDR3 sequences (indicated by edges), identifying conserved motifs based on physicochemical similarity (via BLOSUM matrices) within the epitope-specific T cell population (34). T cells from COVID-19 patients that recognize the most dominant A*02-, A*24-, and A*01-restricted epitopes, which have no counterpart in unexposed repertoires, showed a high degree of motif sharing with the exception of KLW-A02 (Fig. 4a).
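The idea of linking clonotypes through either chain can be sketched with a small graph construction. This is a simplified illustration rather than the published analysis: it substitutes a Hamming-distance cutoff on equal-length CDR3 strings for the BLOSUM-based physicochemical similarity used in the study, and the CDR3 sequences below are invented for the example. An edge is drawn when either the α or the β CDR3 of two clonotypes is similar, and connected components stand in for motif-sharing clusters.

```python
from itertools import combinations


def hamming(a, b):
    """Number of mismatched positions, or None if the lengths differ."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


def similar(a, b, max_mismatch=1):
    """Stand-in similarity rule (assumption): equal length, few mismatches."""
    d = hamming(a, b)
    return d is not None and d <= max_mismatch


def build_edges(clonotypes, max_mismatch=1):
    """Link two clonotypes when either their alpha or beta CDR3s are similar."""
    edges = []
    for (i, ci), (j, cj) in combinations(enumerate(clonotypes), 2):
        if (similar(ci["cdr3a"], cj["cdr3a"], max_mismatch)
                or similar(ci["cdr3b"], cj["cdr3b"], max_mismatch)):
            edges.append((i, j))
    return edges


def components(n, edges):
    """Connected components of an undirected graph via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    groups = {}
    for node in range(n):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())


# Hypothetical clonotypes (sequences invented for illustration).
clonotypes = [
    {"cdr3a": "CAVNAGF", "cdr3b": "CASSLGF"},
    {"cdr3a": "CAVNSGF", "cdr3b": "CASRPDTF"},   # alpha close to clonotype 0
    {"cdr3a": "CALDDKF", "cdr3b": "CASRPETF"},   # beta close to clonotype 1
    {"cdr3a": "CIVRGGGF", "cdr3b": "CSAQQYF"},   # unrelated
]
edges = build_edges(clonotypes)
clusters = components(len(clonotypes), edges)
```

In this toy example clonotypes 0 and 2 share no similar chain directly but land in the same cluster through clonotype 1, which is the behavior that allowing linkage by either chain is meant to capture.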
Interestingly, all of these epitopes, including KLW-A02, show dominant usage of a single TCR alpha variable (TRAV) region, and in the cases of QYI-A24 and PTD-A01, dominant usage of both TRAV and TCR beta variable (TRBV) regions (Fig. 4b). In marked contrast, SPR-B07-specific T cells, including those that also recognize homologs from HCoV, were far more diverse in CDR3 across subjects (Fig. 4a), using 8 TRAV and 3 TRBV regions to cover 50% of the clonotypes represented. We observed two instances of CDR3 homology shared across cohorts, as indicated by the presence of nodes with unconnected edges, which are represented in both network maps.

These comparisons show that the reactivities that appear during SARS-CoV-2 infection may stem either from the amplification of highly related TCRs or from the usage of diverse pre-existing T cell populations. This conclusion extended to CDR3 lengths (Fig. 4c), which were tightly distributed for α- and/or β-chains in T cells reactive to the top epitopes in A*02, A*24, and A*01, but significantly less so for SPR-B07. To further elucidate the extent of the public nature of paired α/β TCR usage in COVID-19, we generated consensus sequences from select interconnected network clusters (Fig. 4d). This representation provides insight into α/β linkage in the context of public responses that cannot be afforded by bulk sequencing approaches. Most motifs were represented by multiple sequences and shared by 50% of the subjects studied, with the exception of KLW-A02 that was shared across only 22%, and SPR-B07 that was shared across only 14%, notably with identical α/β sequences (Fig. 4d). Thus, we have observed divergent TCR repertoire utilization, conditioned by HLA and the presence of diverse, pre-existing reactivity resulting from prior viral exposure.
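The statement that a given number of V regions "cover 50% of the clonotypes" corresponds to a simple cumulative count over genes ranked by usage. The sketch below shows one way to compute such a figure; the function name and the gene labels are hypothetical, and the toy repertoires are invented to contrast a focused response with a diverse one rather than taken from Fig. 4b.

```python
from collections import Counter


def genes_to_cover(v_genes, fraction=0.5):
    """Smallest number of top-ranked V genes whose clonotypes reach `fraction`.

    v_genes holds one V gene label per clonotype. Returns 0 for an empty list.
    """
    counts = Counter(v_genes)
    total = len(v_genes)
    covered = 0
    for n, (_, count) in enumerate(counts.most_common(), start=1):
        covered += count
        if covered >= fraction * total:
            return n
    return 0


# Hypothetical repertoires: one label per clonotype.
focused = ["geneA"] * 7 + ["geneB"] * 2 + ["geneC"]
diverse = ["geneA", "geneB", "geneC", "geneD",
           "geneE", "geneF", "geneG", "geneH"]

n_focused = genes_to_cover(focused)
n_diverse = genes_to_cover(diverse)
```

A small value indicates dominant usage of one or a few V regions, as described for the A*02-, A*24-, and A*01-restricted epitopes, while a larger value reflects the broader usage reported for SPR-B07.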
To examine how CD8+ T cell phenotype varied in relation to disease status, HLA/epitope specificity, and TCR diversity, we performed a more detailed analysis of the single-cell transcriptomic data. We leveraged, as an internal reference, the transcriptomic phenotype of T cells reactive to common acute and latent infections, including influenza, EBV, and CMV. To relate these data to existing knowledge on differentially expressed genes that delineate CD8+ T subsets, we used supervised partition clustering based on imputed expression (Methods) of a set of 51 curated transcripts characteristic of naïve, memory, effector, or chronically activated/exhausted populations (Fig. 5a). This resulted in the identification of seven distinct cell clusters. Some were easily assigned (naïve cells in C1, central memory in C2, and fully activated cytotoxic effectors in C7). Other memory/effector intermediates were more tentatively labeled, as they did not easily fit into existing categorizations (35-37). These included a puzzling population (C3, here "CD127+ Memory"), which expresses markers of naïve, memory and effector cells, and 3 other clusters with characteristics of memory or chronically activated cells (C4-6).

SARS-CoV-2-specific T cells were found in all clusters (Fig. 5b, bottom), but at proportions that varied with stage of disease and epitope specificity (Table S7). Cells from acute patients predominantly showed effector phenotypes, but also paradoxically naïve types. In convalescent donors, T cells from several epitope specificities were broadly distributed, consistent with the resolution of an infection. Several epitope-specific T cell pools were predominantly found in central memory (C2), including PTD-A01 (49%) and LLY-A02 (42%), while others predominantly resided in the cytotoxic terminal effector cluster (C7), including TLM-A02 (80%), and LLL-A02 (61%).
In most other reactivities, including SPR-B07, transcriptional profiles in convalescent patients were fairly broadly distributed across all clusters. In contrast, the reactivity in unexposed subjects was dominated by the central memory pool, confirming that the CD8+ cells likely result from long-term exposure to cross-reactive antigens. This was especially clear in the case of B*07, where epitope-specific T cells for SPR-B07, QPG-B07, and SII-B07 were represented in central memory (C2) at proportions of 88%, 75%, and 67%, respectively. Other notable reactivities associated with central memory include TSQ-A24 (70%) and NSS-A01 (68%), though the source of these memory cells, like QPG-B07 and SII-B07, does not appear to be from HCoV exposure based on a lack of homology. Overall, this analysis provides further evidence that SPR-B07 responses to SARS-CoV-2 are likely drawn from a pre-existing memory pool and that commitment to different cell fate is dependent on epitope specificity.

We also observed some interesting dynamics between SARS-CoV-2 infection and existing T cell pools specific for common viral infections, with differentiated outcomes likely shaped by exposure history (Fig. 5b). Influenza-specific CD8+ T cells, which result from vaccination or past infections, mapped primarily to the central memory (C2) and effector memory (C3) compartments in unexposed individuals. Proportions were stable across epitope specificities in COVID-19 patients with the exception of GIL-A02, where the proportion of effector memory cells decreased from 50% to 0% and a naïve population representing 30% of the cells paradoxically emerged. CMV- and EBV-specific T cells, likely subject to more chronic stimulation from low-level re-activation of these integrated herpesviruses, mapped to more activated pools in unexposed subjects, as has been described by others (38).
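The per-epitope cluster proportions quoted above (for example, the share of SPR-B07-specific cells falling in C2) amount to a grouped tally over cells annotated with an epitope and a cluster. A minimal sketch follows, assuming each cell is represented as an `(epitope, cluster)` pair; the counts are invented for illustration and are not the values underlying Table S7.

```python
from collections import Counter, defaultdict


def cluster_proportions(cells):
    """Map each epitope to the fraction of its cells found in each cluster."""
    by_epitope = defaultdict(Counter)
    for epitope, cluster in cells:
        by_epitope[epitope][cluster] += 1
    return {
        epitope: {cl: n / sum(counts.values()) for cl, n in counts.items()}
        for epitope, counts in by_epitope.items()
    }


# Hypothetical annotations: (epitope, cluster) per cell.
cells = (
    [("SPR-B07", "C2")] * 7 + [("SPR-B07", "C7")] * 1
    + [("TLM-A02", "C7")] * 4 + [("TLM-A02", "C2")] * 1
)
props = cluster_proportions(cells)
```

Here `props["SPR-B07"]["C2"]` gives the central memory fraction for the toy SPR-B07 cells, which is the kind of quantity reported per epitope and cohort in the text.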
After SARS-CoV-2 infection, EBV-specific cells shifted markedly from central memory (C2) and chronically stimulated compartments (C5) into the 127+ memory cluster (C3). These changes may reflect either bystander activation, perhaps as a result of the high cytokine release in COVID-19 patients, or changes in homing or recirculation patterns that bring into the blood cells normally sequestered in tissues. These observations suggest that, in addition to inducing lymphopenia, COVID-19 strongly reshuffles third-party antiviral T cell pools, the extent of which may be associated with exposure history and, at least to some degree, epitope specificity.

Here we presented the first unified description of the CD8+ T cell response to SARS-CoV-2, highlighting the importance of HLA genetics, TCR repertoire diversity, and epitope-specific navigation through a complex transcriptomic phenotype at various stages of disease. In building a comprehensive map of immunodominant, HLA-restricted epitopes broadly derived from proteins across the entire SARS-CoV-2 proteome, we highlight how only some HLA haplotypes are associated with the existence of a pre-existing CD8+ T cell memory pool in unexposed individuals. We further show how HLA variation plays an important role in shaping the diversity of CD8+ T cell repertoires upon exposure to SARS-CoV-2, and that cellular phenotype and commitment to memory can be associated with epitope specificity in the context of both SARS-CoV-2 and latent EBV infections.

The presence of SARS-CoV-2-reactive CD8+ T cells has been linked to milder disease (5, 11, 12), although the precise link between cellular immunity and host protection still remains to be further understood (7, 39, 40). We found that individuals carrying HLA-B*07 show a CD8+ T cell response that is dominated by pre-existing memory pools reactive to multiple SARS-CoV-2 epitopes, especially SPR-B07, which is likely induced by previous exposures to benign HCoVs.
In contrast, the immunodominant responses in A*02 individuals (e.g. to YLQ-A02, LLY-A02) are driven largely by the expansion of antigen-inexperienced SARS-CoV-2-specific T cells. It is interesting to note that CD8+ T cell cross-reactivity may be less widespread in unexposed individuals than CD4+ T cell cross-reactivity, for which ~50% of unexposed individuals exhibited CD4+ T cell memory (16). Our data provide a basis for this limited representation of the CD8+ T cell repertoire in that only a subpopulation of individuals carrying a specific HLA allele would have these pre-existing memory CD8+ T cells.

The interplay between HLA-restricted epitope presentation and available TCR repertoire shapes the cellular response to SARS-CoV-2. There are a few limited studies suggesting an influence of HLA genotype on COVID-19 severity (27, 41-43). Large-scale, high-resolution HLA mapping, consistent with what we have done for select HLAs in this work, may help identify relationships between HLA genotype and protection against severe disease, ideally uncovering mechanism. Here, we observed an interesting connection between TCR repertoire diversity and HLA restriction. Responses seen in A*02, A*24, and A*01 were more often associated with "public" CDR3 motifs and consistent V gene segment usage in the α- and/or β-chains. In contrast, the dominant immune response in B*07 leveraged a significantly more diverse TCR repertoire. Several contributors to public TCR responses have been proposed, focusing on the physicochemical features of HLA-restricted peptides (e.g. "featureless" peptide-HLAs may drive a public response) and convergent recombination of TCR dimers (44). The method described in this work provides an ideal system to address this question.
Perhaps counterintuitively, our results show that in the case of COVID-19, the largest pool of potentially protective, pre-existing cellular immunity is derived from one of the least public epitope-specific repertoires, possibly reflecting the influence of repeated acute infections with HCoVs throughout the life of the individuals.

Beyond the comprehensive deciphering of TCR specificity reported here, we also provided a detailed picture of the complex and dynamic transcriptional landscape of the CD8+ T response to SARS-CoV-2. Importantly, we were able to demonstrate that the pre-existing SPR-B07 reactivity, observed in ~80% of unexposed subjects with HLA-B*07, was predominantly associated with a central memory-like transcriptional profile (88%), confirming that it originates from prior exposures. In convalescent patients, we observed a much broader distribution of SPR-B07-reactive T cells spanning every functional state at proportions ranging from 5-29% (Table S7). This is consistent with late contraction/early memory formation described for SARS-CoV-2 in a recent study (12), where cells spanned naïve, central memory, various classifications of effector memory, and terminally differentiated effector memory expressing RA (TEMRA). There was no evidence for a particularly frequent "exhausted" state among SARS-CoV-2-specific CD8+ T cells, as suggested elsewhere (45, 46) (acknowledging that the phenotypic state is a proxy for true reactivity testing, and that blood T cells may not fully reflect what happens in the lung). We also did not find evidence of "antigenic sin" resulting from HCoV pre-exposure (47) that would stifle an effective response to SARS-CoV-2-unexposed B*07 individuals. It will be interesting to determine whether HLA haplotype plays a role in the durability of the CD8+ T cell responses, especially to SARS-CoV-2 vaccines, which may have profound impact for long-term protection across different ethnic groups and geographic regions.
Another interesting observation from this work, as noted by others (48), is that even at the height of infection or shortly after viral clearance, the cumulative anti-SARS-CoV-2 CD8+ T cell response barely reached the frequency of anti-influenza memory responses and was well below the frequencies that could be achieved by CMV-specific cells in the same individuals (Fig. S2). This was particularly evident in the acutely infected individuals, at a time where the contribution of cytotoxic CD8+ T cells would have been most important. We acknowledge the caveat that peripheral frequencies were measured, and some degree of sequestration in viral target tissues, such as the lung, is likely to occur in acute patients. Yet, the response seems much more muted than the "all hands on deck" observed in some other viral infections (49). This meager outcome was seen both for the cross-reactive "secondary responses" by memory T cells pre-primed by endemic HCoVs, as well as for the primary responses of truly SARS-CoV-2 species-specific CD8+ T cells amplified de novo. This suggests that the paucity likely does not result from a blocking of primary activation, but from a dampening of all specific CD8+ T cells. Consistent with this notion, frequencies of influenza/EBV/CMV-reactive cells were also lower in acute COVID-19 patients, compared to SARS-CoV-2 "naïve" individuals. It has been proposed that the lethal cytokine storm in severe COVID-19 stems from innate immune functions overcompensating for adaptive immune system failures (2). In this line of reasoning, one might propose that SARS-CoV-2's noxiousness stems from a broad obstruction of antiviral CD8+ T cell responses.
We have recently described a "super-Treg" phenotype in severe COVID-19, with heightened expression of FoxP3 and Treg effector molecules, akin to tumor-infiltrating Treg cells (50), and one possible interpretation is that overactive Tregs are overly suppressing these CD8+ T cells in severe COVID-19 patients.

Given the widespread lymphopenia observed in acute COVID-19, we pondered the possibility of latent virus reactivation with the loss of protective CMV- and EBV-specific T memory pools. While we have no direct evidence of impact on disease outcome, we do see a significant alteration of cell state within these subsets. While CMV-reactive cells remained within the same, though somewhat shuffled, effector/memory transcriptional phenotypes between unexposed and COVID-19 cohorts (including chronic stimulation, cytotoxic terminal effector, and terminal effector memory), we observed a dramatic shift of EBV-specific cells from chronic stimulation and central memory into the interesting "127+ memory" state in COVID-19-exposed individuals. These cells expressed moderate to high levels of many naïve (IL7R, SELL, CCR7), memory (GZMK), and effector-associated genes (NKG7, CST7, GZMA), along with markers of activation/exhaustion (TIGIT, LAG3), making them particularly interesting and difficult to ascribe to conventional phenotype labels. Recently, two new transcriptionally distinct stem-like CD8+ T cell memory states were described, one of which was functionally committed to a dysfunctional lineage (37). As these cell states were differentiated by many of the same markers observed in our "127+ memory" compartment, it would be interesting to see to what extent these "127+ memory" cells, dominated by EBV-reactive pools, experience similar fates of dysfunction. We speculate that this phenotype may be a consequence of the particular inflammatory milieu of COVID-19 patients.
In conclusion, we leveraged a powerful single-cell technology to better elucidate the roles of HLA variation, TCR diversity, and cellular phenotypes in establishing pre-existing immunity to SARS-CoV-2. We observed a diverse and immunodominant nucleocapsid epitope-specific memory pool in subjects with HLA-B*07 but saw little evidence of similar reactivity in individuals with other HLA alleles. Outside of HLA-B*07, the epitope-specific TCR repertoires observed were largely public in nature. We measured a diverse landscape of T cell phenotypes associated with SARS-CoV-2 infection and also observed an influence on T cell repertoires reactive to persistent and latent infections with other viruses. Overall, this work provides a framework for the unified characterization of the cellular response to novel viral infections. The ability to understand the basis of cellular immunity to SARS-CoV-2 and other
Clinical features of patients infected with 2019 novel coronavirus in Wuhan
Adaptive immunity to SARS-CoV-2 and COVID-19
Memory T cell responses targeting the SARS coronavirus persist up to 11 years post-infection
Longitudinal observation and decline of neutralizing antibody responses in the three months following SARS-CoV-2 infection in humans
Antigen-Specific Adaptive Immunity to SARS-CoV-2 in Acute COVID-19 and Associations with Age and Disease Severity
Immunological memory to SARS-CoV-2 assessed for up to eight months after infection. bioRxiv
An Effective COVID-19 Vaccine Needs to
Broad and strong memory CD4(+) and CD8(+) T cells induced by SARS-CoV-2 in UK convalescent individuals following COVID-19
SARS-CoV-2-reactive T cells in healthy donors and patients with COVID-19
SARS-CoV-2-derived peptides define heterologous and COVID-19-induced T cell recognition
Robust T Cell Immunity in Convalescent Individuals with Asymptomatic or Mild COVID-19
Characterization of pre-existing and induced SARS-CoV-2-specific CD8(+) T cells
Selective and cross-reactive SARS-CoV-2 T cell epitopes in unexposed humans
Unbiased Screens Show CD8(+) T Cells of COVID-19 Patients Recognize Shared Epitopes in SARS-CoV-2 that Largely Reside outside the Spike Protein
Comprehensive analysis of T cell immunodominance and immunoprevalence of SARS-CoV-2 epitopes in COVID-19 cases. bioRxiv
Targets of T Cell Responses to SARS-CoV-2 Coronavirus in Humans with COVID-19 Disease and Unexposed Individuals
Phenotype and kinetics of SARS-CoV-2-specific T cells in COVID-19 patients with acute respiratory distress syndrome
Magnitude and Dynamics of the T-Cell Response to SARS-CoV-2
CD8+ T cell responses in convalescent COVID-19 individuals target epitopes from the entire SARS-CoV-2 proteome and show kinetics of early differentiation. bioRxiv
SARS-CoV-2 genome-wide T cell epitope mapping reveals immunodominance and substantial CD8(+) T cell activation in COVID-19 patients
Prevalence of antibodies to four human coronaviruses is lower in nasal secretions than in serum
Influence of HLA supertypes on susceptibility and resistance to human immunodeficiency virus type 1 infection
HLA-associated protection of lymphocytes during influenza virus infection
Genomewide Association Study of Severe Covid-19 with Respiratory Failure
Elevated exhaustion levels and reduced functional diversity of T cells in peripheral blood may predict severe progression in COVID-19 patients
Reduction and Functional Exhaustion of T Cells in Patients With Coronavirus Disease 2019 (COVID-19)
Original Antigenic Sin: the Downside of Immunological Memory and Implications for COVID-19
Systematic Examination of Antigen-Specific Recall T Cell Responses to SARS-CoV-2 versus Influenza Virus Reveals a Distinct Inflammatory Profile
Augmentation of HIV-specific T cell function by immediate treatment of hyperacute HIV-1 infection
Profound Treg perturbations correlate with COVID-19 severity. bioRxiv
Engineering T cells specific for a dominant severe acute respiratory syndrome coronavirus CD8 T cell epitope
Prediction of SARS-CoV-2 epitopes across 9360 HLA class I alleles. bioRxiv
A Sequence Homology and Bioinformatic Approach Can Predict Candidate Targets for Immune Responses to SARS-CoV-2
MHC-Peptide Tetramers to Visualize Antigen-Specific T Cells
SCANPY: large-scale single-cell gene expression data analysis
Efficient integration of heterogeneous single-cell transcriptomes using Scanorama
Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage
Recovering Gene Interactions from Single-Cell Data Using Data Diffusion
are employees and/or stockholders of Repertoire Immune Medicines. reports compensation for consulting and/or SAB membership from Merck, Honeycomb Biotechnologies, Cellarity, Repertoire Immune Medicines, Hovione, Third Rock Ventures, Ochre Bio and Dahlia Biosciences. C.B. reports compensation for consulting for Repertoire Immune Medicines.
Data and materials availability: All data are available in the manuscript or supplementary materials. All reasonable requests for data and materials will be fulfilled.
MGH COVID-19 Collection and Processing Team participants