key: cord-0880041-zsugyr7v
authors: Wang, Pingping; Jin, Xiyun; Zhou, Wenyang; Luo, Meng; Xu, Zhaochun; Xu, Chang; Li, Yiqun; Ma, Kexin; Cao, Huimin; Huang, Yan; Xue, Guangfu; Jin, Shuilin; Nie, Huan; Jiang, Qinghua
title: Comprehensive analysis of TCR repertoire in COVID-19 using single cell sequencing
date: 2020-12-28
journal: Genomics
DOI: 10.1016/j.ygeno.2020.12.036
sha: b9c1ecf2b4e2b88fd3ac7b8238fbb651ed6680ef
doc_id: 880041
cord_uid: zsugyr7v
The T-cell receptor (TCR) is crucial in T cell-mediated virus clearance. To date, TCR bias has been observed in various diseases. However, studies on the TCR repertoire of COVID-19 patients are lacking. Here, we used single-cell V(D)J sequencing to compare the TCR repertoire of 12 COVID-19 patients with that of 6 healthy controls, as well as with other virus-infected samples. We observed distinct T cell clonal expansion in COVID-19. Further analysis of VJ gene combinations revealed that 6 VJ pairs were significantly increased and 139 pairs were significantly decreased in COVID-19 patients. When the VJ combinations of the α and β chains were considered together, the combination with the highest frequency in COVID-19 was TRAV12-2-J27-TRBV7-9-J2-3. In addition, preferential usage of V and J gene segments was also observed in samples infected by different viruses. Our study provides novel insights into the TCR in COVID-19, which contribute to our understanding of the immune response induced by SARS-CoV-2.
Since December 2019, the outbreak of coronavirus disease 2019 (COVID-19) has posed a serious threat to global health [1]. The number of cases rose quickly. By September 2, 2020, there had been more than 25 million infections and over 850,000 deaths worldwide [2]. Numerous studies have been conducted to examine the patterns of the epidemic [3, 4] and the disease etiology [5, 6], and to explore potential treatment options for COVID-19 [7-9].
As with other viral infections, adaptive immune responses play a central role in clearing severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection and directly influence patients' clinical outcomes [10]. T cells in particular play a crucial role in the immune response to viral infection [11]. The T cell receptor (TCR), expressed on the surface of T cells, together with the MHC molecules, plays an important role in peptide-MHC (pMHC) recognition and cellular activation [12, 13]. At the early stage of T cell development, the TCR is generated by somatic rearrangement of variable (V), diversity (D), and joining (J) gene segments, known as V(D)J recombination [14]. When activated, T cells proliferate rapidly and produce a large number of T cells with the same TCRs [10, 15]. Existing studies have revealed changes in immune cell composition [16, 17], cell-cell communication [18, 19], and gene expression [20, 21] in patients with COVID-19. However, studies on the TCR repertoire of COVID-19 patients are lacking, and further exploration is urgently needed. Here, we conducted a systematic comparative analysis of the TCR repertoire between 12 early-recovery COVID-19 patients and 6 healthy controls, as well as samples infected by several other types of viruses, including CMV, EBV and influenza A. We comprehensively analyzed TCR repertoire diversity, CDR3 length distribution, V and J gene segment preference, and the dominant combinations of αβ VJ gene pairing in different viral infections. Our study provides novel insights into the preferential usage of V and J gene segments in COVID-19, which contribute to our understanding of the immune response induced by SARS-CoV-2 infection.
COVID-19 patients compared with healthy controls
For 12 COVID-19 patients at the time of hospital discharge and 5 healthy controls, we collected fresh blood samples to perform single-cell V(D)J sequencing.
A total of 22,407 and 37,282 T cells with complete TCR alpha and beta chain sequences were obtained from COVID-19 patients and healthy controls, respectively (Table S1). When activated, T cells undergo a process known as "clonal expansion", in which the activated T cells proliferate rapidly and produce a large number of T cells with identical TCRs [10, 15]. To explore TCR bias in COVID-19, we conducted a comparative analysis of the T cell repertoire between COVID-19 patients and healthy controls. Although most cells contained a unique TCR, TCRs were reused to varying degrees in both COVID-19 patients and healthy controls (Fig. 1a). On average, about 25% of the T cells harbored cloned TCRs in COVID-19 patients, compared with about 17% in healthy controls (Supplementary Fig. 1a). The distribution of CDR3β amino acid (aa) length was consistent between COVID-19 patients and healthy controls, ranging from 7 to 21 aa in COVID-19 patients and from 6 to 22 aa in healthy controls. The most frequently used length was 15 aa in both types of samples (Supplementary Fig. 1b). However, for clonally expanded TCRs, the proportion of T cells with the same CDR3 length was significantly higher in COVID-19 patients than in healthy controls (p = 0.016, Mann-Whitney U test) (Fig. 1b). Next, we compared the clonal expansion of TCRs between COVID-19 patients and healthy controls. The TCR clonal size in COVID-19 patients ranged from 3 to 327, and the clonal diversity was significantly lower than that in healthy controls (p = 1.08e-4, Mann-Whitney U test) (Fig. 1c, Supplementary Fig. 1c). Only 63 TCRs were shared between the COVID-19 patients and healthy controls. Among all disease-specific TCRs, 505 were clonally expanded, accounting for about 99.21% of the total clonally expanded TCRs (Fig. 1d).
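The clonal expansion statistics reported above can be illustrated with a minimal Python sketch. It assumes each cell is represented by a single clonotype label (for example, a paired CDR3α/CDR3β string), and that a clonotype counts as clonally expanded once it is seen in at least `min_size` cells. The function name, the input format, the default threshold of 2 and the toy data are all illustrative assumptions, not the authors' pipeline.

```python
from collections import Counter


def clonal_expansion_summary(cell_clonotypes, min_size=2):
    """Summarize clonal expansion for one sample.

    cell_clonotypes: one clonotype label per cell (hypothetical input format).
    min_size: assumed minimum number of cells for a clonotype to count as expanded.
    """
    sizes = Counter(cell_clonotypes)  # clonotype -> number of cells (clone size)
    expanded = {c: n for c, n in sizes.items() if n >= min_size}
    n_cells = sum(sizes.values())
    cells_in_expanded = sum(expanded.values())
    return {
        "n_cells": n_cells,
        "n_clonotypes": len(sizes),
        "n_expanded_clonotypes": len(expanded),
        "fraction_cells_expanded": cells_in_expanded / n_cells if n_cells else 0.0,
        "max_clone_size": max(sizes.values(), default=0),
    }


# Toy data: 8 cells; clonotype "A" seen 4 times, "B" twice, "C" and "D" once each.
toy = ["A", "A", "A", "A", "B", "B", "C", "D"]
summary = clonal_expansion_summary(toy)
print(summary)
```

In this toy sample, 6 of the 8 cells belong to expanded clonotypes, which is the kind of per-sample proportion that the percentages above describe.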
Fig. 1 (panels c and d). c. Box plot shows that the clonal diversity of COVID-19 patients is significantly lower than that of healthy controls. Each dot represents the clonal diversity of one sample. The p-value was calculated using the Mann-Whitney U test. d. Venn diagram shows the numbers of common and specific TCRs in COVID-19 patients and healthy controls. Bar plots show the distribution of COVID-19 specific clonotypes (top right) and the distribution of COVID-19 specific clonally expanded clonotypes (bottom right).
T cell receptors are generated by rearrangement of variable (V), diversity (D), and joining (J) gene segments for the TCR β chain (TRB), and of V and J gene segments for the TCR α chain (TRA) [22]. Here, we explored the usage bias of V and J gene segments in COVID-19 patients compared with healthy controls. First, we examined each V and J gene segment individually and found that some V and J gene segments on the alpha and beta chains were significantly increased or decreased in frequency compared with healthy controls (p < 0.05, Fisher's exact test) (Fig. 2a). Among COVID-19 patients, the most frequently used gene segments were TRAV12-2, TRAJ49, TRBV20-1 and TRBJ2-7. Moreover, compared with healthy controls, the frequencies of TRAV4, TRAJ27, TRBV4-2 and TRBJ2-6 were significantly higher in COVID-19 patients. Consistent with a previous study [23], we also observed an increase in the frequency of some V gene segments in the disease, including TRAV22, TRAV23/DV6, TRAV3, TRAV9-2, TRBV13 and TRBV14. Next, we compared the V-J pairing of the alpha and beta chains separately. In COVID-19, there were 1876 unique TRAV-TRAJ pairs, of which 1827 were significantly shared with healthy controls (p < 0.05, hypergeometric test). More than 99% and 98% of the T cells used these common VJ pairs in COVID-19 patients and healthy controls, respectively (Fig. 2b). For all common TRAV-TRAJ pairs, we performed a differential analysis based on their frequency (|Fold Change| > 1 and p < 0.05, Mann-Whitney U test).
The results showed that 122 gene pairs differed significantly: four pairs (TRAV23/DV6-TRAJ13, TRAV26-1-TRAJ26, TRAV8-1-TRAJ32 and TRAV29/DV5-TRAJ44) had significantly increased frequency in COVID-19 patients, while the other 118 pairs were significantly decreased (Fig. 2c, d). Similar results were obtained for the TRBV-TRBJ pairs: of the 553 unique gene pairs in COVID-19 patients, 546 were shared with healthy controls (Fig. 2b), and more than 99% of the cells harbored these common VJ pairs in both types of samples. Only seven β VJ pairs were disease-specific. Differential analysis showed that TRBV4-2-TRBJ2-2 and TRBV7-7-TRBJ2-3 were significantly increased and 21 pairs were significantly decreased under the disease condition (Fig. 2c, d). These results indicate that the VJ pairs of the alpha or beta chain in COVID-19 patients were consistent with those in healthy controls. Next, we focused on the VJ pairs of the two chains combined. In contrast to the single-chain results, only 1634 αβ VJ pairs were common, and most of the pairs were sample-specific (Fig. 2b). Cells harboring common αβ VJ pairs accounted for only 11.76% and 7.74% of the TCRs in COVID-19 patients and healthy controls, respectively. Of all the common αβ VJ pairs, only 19 pairs showed significant differences in frequency between COVID-19 patients and healthy controls (p < 0.05, Fisher's exact test) (Fig. 2a). For the sample-specific αβ VJ pairs, we removed those that appeared only once, leaving 1217 and 2592 αβ VJ pairs in COVID-19 patients and healthy controls, respectively. The most frequent αβ VJ pair in COVID-19 patients was TRAV12-2-J27-TRBV7-9-J2-3, whereas in healthy controls it was TRAV12-2-J12-TRBV12-4-J2-3 (Fig. 2e, Supplementary Fig. 2b).
T cells recognize antigenic peptides through surface T cell receptors and play a central role in cell-mediated immune responses [24-27].
What, then, are the similarities and differences in the TCR repertoire between COVID-19 patients and other virus-infected samples? To address this question, we downloaded single-cell TCR data of other virus-infected samples from VDJdb (https://vdjdb.cdr3.net/) [28]. In total, 12 types of viral infection had single-cell TCR data. After screening for cells with complete TCR alpha and beta chain information, only samples with more than 100 cells were selected for subsequent analysis (Table S2). As with COVID-19 patients, most of the TCRs in other viral infections were unique, and there were more clonal TCRs in influenza A and EBV infected samples (Fig. 3a). The proportion of clonally expanded TCRs in COVID-19 patients was significantly higher than that in CMV and YFV infected samples, and significantly lower than that in EBV and influenza A infected samples (p < 0.05, Fisher's exact test) (Fig. 3b). We then conducted a clonal diversity analysis to compare TCR clonal expansion under different virus infection conditions. TCR clonal diversity varied across infections (Fig. 3c). EBV infected samples presented high clonal diversity, while the diversity of COVID-19 patients was at a moderate level. Further similarity analysis revealed low TCR clonal similarity between COVID-19 patients and other virus-infected samples (Fig. 3d). Moreover, no clonal TCRs overlapped between COVID-19 patients and other virus-infected samples (Supplementary Fig. 3a). The CDR3β length of all TCRs ranged from 7 to 26 aa and followed a unimodal distribution (Supplementary Fig. 3b). As in COVID-19 patients, the CDR3β length in CMV, EBV and HCV infected samples was mainly 15 aa. The TCR length distribution of influenza A infected samples was the most concentrated, with more than 60% of the TCRs being 13 aa long.
For clonal TCRs, the proportion of cells with the same CDR3β length was significantly higher in COVID-19 patients than in CMV infected samples (p = 0.012, Mann-Whitney U test) (Fig. 3e). The usage of single V and J gene segments also differed in these data: some V and J gene segments that had a significantly higher frequency in COVID-19 patients than in healthy controls were also significantly higher than in other virus-infected samples (Supplementary Fig. 3c). The VJ pairs of the alpha chain in COVID-19 patients were mostly consistent with those in other virus-infected samples, with about 13% being specific (Fig. 3f, Supplementary Fig. 4a). There were fewer specific pairs on the beta chain, only about 3% (Fig. 3g, Supplementary Fig. 4b). Differential analysis of the frequency of overlapped VJ pairs between COVID-19 patients and other samples revealed 137 and 125 significantly differential alpha and beta VJ pairs, respectively (Supplementary Fig. 4c, d, e). Most differential VJ pairs between COVID-19 patients and the different virus-infected samples were inconsistent, but there were also several common VJ pairs. Specifically, 5 α-chain and 18 β-chain VJ pairs were significantly increased, and 11 α-chain and 17 β-chain VJ pairs were significantly decreased in COVID-19 patients compared with several virus-infected samples (Supplementary Fig. 4e). Finally, we compared the αβ VJ pairs of all clonal TCRs (Fig. 4a). Very few αβ VJ pairs were shared between different virus-infected samples, and only 0.46% (3/656) of the pairs in COVID-19 patients were consistent with other samples (Fig. 4b). The combination TRAV12-2-J27-TRBV7-9-J2-3 was also the most frequent among COVID-19 specific αβ VJ pairs and represented the most highly clonally expanded TCR, suggesting an important role in COVID-19 (Fig. 4c).
The diversity and specificity of T cell receptors (TCR) are the core of adaptive immunity [29].
TCR variability is necessary for the effective identification of antigen peptides that are presented by the MHC molecules on the surface of antigen-presenting cells [29]. To date, numerous examples of TCR bias have been observed in various diseases [30-34]. However, studies on the TCR repertoire of COVID-19 patients are lacking. In this study, we systematically analyzed, for the first time, the changes in TCR clones and the preferential usage of V and J gene segments in COVID-19 patients compared with healthy controls and other virus-infected samples. We observed a significant decrease in TCR clonal diversity in COVID-19 patients compared with healthy controls. The frequencies of some V and J gene segments, such as TRAV4, TRAJ27, TRBV7-9 and TRBJ2-3, were significantly higher in COVID-19 patients than in healthy controls. Preferential usage of V and J gene segments was also observed between COVID-19 patients and other virus-infected samples. Because the TCR repertoire is extremely large and highly diverse, the combinations of αβ VJ genes were highly sample-specific between COVID-19 patients and healthy controls. A larger population cohort is urgently needed to provide statistically stronger evidence of T cell receptor preference. Overall, our study provides novel insights into the preferential usage of V and J gene segments in COVID-19, which contribute to our understanding of the immune response induced by SARS-CoV-2 infection.
Patient samples: COVID-19 patients were hospitalized at the Harbin sixth Hospital in China. Patients were eligible for discharge when they met the following criteria: (1) afebrile for >3 days; (2) improved respiratory symptoms; (3) pulmonary imaging showing obvious absorption of inflammation; (4) two consecutive negative nucleic acid tests for the respiratory tract pathogen (sampling interval ≥ 24 h). Fresh blood samples were collected from a total of 12 COVID-19 patients at the time of hospital discharge (Table S1).
The clinical details were obtained from the Harbin sixth Hospital. Informed consent was obtained from the patients and this study was approved by the Ethics Committee of the Harbin sixth Hospital (Approval number: 2020NO.12).
Control samples: Six healthy control samples were included in this study. Of these, N1 was obtained from the official 10× Genomics website (https://support.10xgenomics.com/single-cell-vdj/datasets), and blood samples for the other five were collected from healthy donors before the COVID-19 outbreak.
CD3+ T cells were then isolated from PBMCs by fluorescence-activated cell sorting (FACS). Single-cell 5′ and V(D)J sequencing was performed following the protocol provided by the 10× Genomics Chromium Single Cell Immune Profiling Solution. Briefly, cell suspensions (400-1000 living cells per microliter, as determined by CounterStar) were loaded on a Chromium Single Cell Controller (10× Genomics) to generate single-cell gel beads in emulsion (GEMs) using Chromium Single Cell V(D)J Reagent Kits. Captured cells were lysed and the released RNAs were barcoded through reverse transcription in individual GEMs. The single-cell 5′ and V(D)J libraries were sequenced on an Illumina NovaSeq 6000 with 150-bp paired-end reads.
The analysis pipelines in Cell Ranger (10× Genomics, version 3.1.0) were used for single-cell sequencing data processing. V(D)J sequence assembly and paired clonotype calling were performed for each sample using cellranger vdj with --reference=refdata-cellranger-vdj-GRCh38-alts-ensembl-3.1.0.
The clonal diversity of each sample was calculated using Shannon's entropy, which takes into account both CDR3 abundance and the frequency distribution of each CDR3 sequence [35]. Entropy was calculated as the negative sum, over all clones, of the product of each clone's frequency and the log2 of that frequency [36]. The higher the Shannon's entropy, the more diverse the distribution of CDR3 clones.
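The entropy calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' R code: it assumes one clonotype label per cell as input, and the function name `shannon_entropy` is hypothetical. The sum is written as p · log2(1/p), which equals −Σ p · log2(p).

```python
import math
from collections import Counter


def shannon_entropy(clonotypes):
    """Shannon entropy (base 2) of a sample's clonotype frequency distribution.

    clonotypes: one clonotype label per cell (assumed input format).
    """
    counts = Counter(clonotypes)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log2(1/p) summed over clones is the same as -sum(p * log2(p)).
    return sum((n / total) * math.log2(total / n) for n in counts.values())


print(shannon_entropy(["A", "B", "C", "D"]))  # four equally frequent clones -> 2.0
print(shannon_entropy(["A", "A", "B", "C"]))  # one clone dominating half the cells -> 1.5
print(shannon_entropy(["A"] * 5))             # a single clone -> 0.0
```

A sample dominated by a few expanded clones therefore scores lower than a sample of the same size in which every cell carries a different TCR.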
In particular, for the diversity comparison between COVID-19 patients and VDJdb samples, the cell numbers of the COVID-19 and CMV infected samples were more than 3 times higher than those of the other samples. We therefore down-sampled these two samples 1000 times, each time to the average number of cells in the other 5 samples, and used the mean Shannon's entropy across the down-samplings to represent their clonal diversity.
For the clonal similarity analysis between samples x and y, f_xi and f_yi represent the frequency of overlapped clonotype i in sample x and y, and N represents the total number of overlapped clonotypes [37, 38].
The hypergeometric test was used to assess whether VJ pairs were significantly shared between COVID-19 patients and healthy controls. Differences between two groups were compared using either the Mann-Whitney U test or Fisher's exact test. The significance threshold was p < 0.05. All statistical analyses were performed with R software (version 3.5.1).
Supplementary data to this article can be found online at https://doi.org/10.1016/j.ygeno.2020.12.036.
The authors declare no competing interests.
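The down-sampling step described in the diversity analysis above can be sketched in Python as follows. The sketch rests on several assumptions that the text does not state: subsampling is done without replacement, each cell is represented by one clonotype label, and the names `mean_downsampled_entropy`, `depth`, `n_iter` and `seed` are hypothetical. The toy sample is illustrative only.

```python
import math
import random
from collections import Counter


def shannon_entropy(clonotypes):
    """Shannon entropy (base 2) of the clonotype frequency distribution."""
    counts = Counter(clonotypes)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def mean_downsampled_entropy(clonotypes, depth, n_iter=1000, seed=0):
    """Mean entropy over repeated random subsamples of `depth` cells.

    Subsamples are drawn without replacement (an assumption of this sketch).
    """
    if depth > len(clonotypes):
        raise ValueError("depth exceeds the number of cells in the sample")
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    total = 0.0
    for _ in range(n_iter):
        total += shannon_entropy(rng.sample(clonotypes, depth))
    return total / n_iter


# Toy large sample: two expanded clones plus 20 singletons (100 cells in total).
large_sample = ["A"] * 50 + ["B"] * 30 + [f"u{i}" for i in range(20)]
mean_entropy = mean_downsampled_entropy(large_sample, depth=40, n_iter=200)
print(round(mean_entropy, 3))
```

Averaging over many subsamples reduces the dependence of the diversity estimate on any single random draw, so that samples with very different cell numbers can be compared at a common depth.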
[1] SARS-CoV-2 vaccines: status report
[2] An interactive web-based dashboard to track COVID-19 in real time
[3] The COVID-19 epidemic
[4] The effect of human mobility and control measures on the COVID-19 epidemic in China
[5] Pathological findings of COVID-19 associated with acute respiratory distress syndrome
[6] Liver diseases in COVID-19: etiology, treatment and prognosis
[7] Convalescent plasma as a potential therapy for COVID-19
[8] Effect of convalescent plasma therapy on time to clinical improvement in patients with severe and life-threatening COVID-19: a randomized clinical trial
[9] COVID-19: combining antiviral and anti-inflammatory treatments
[10] Deciphering the TCR repertoire to solve the COVID-19 mystery
[11] Antiviral CD8(+) T cells restricted by human leukocyte antigen class II exist during natural HIV infection and exhibit clonal expansion
[12] TCR-mediated recognition: relevance to tumor-antigen discovery and cancer immunotherapy
[13] The TCR is an allosterically regulated macromolecular machinery changing its conformation while working
[14] The histone methyltransferase Setd2 is indispensable for V(D)J recombination
[15] Select sequencing of clonally expanded CD8(+) T cells reveals limits to clonal expansion
[16] Single-cell landscape of bronchoalveolar immune cells in patients with COVID-19
[17] A single-cell atlas of the peripheral immune response in patients with severe COVID-19
[18] COVID-19 severity correlates with airway epithelium-immune cell interactions identified by single-cell analysis
[19] Immune cell profiling of COVID-19 patients in the recovery stage by single-cell sequencing
[20] A dynamic immune response shapes COVID-19 progression
[21] Down-regulated gene expression spectrum and immune responses changed during the disease progression in COVID-19 patients
[22] Characterization of human alphabetaTCR repertoire and discovery of D-D fusion in TCRbeta chains
[23] Adaptive immune responses to SARS-CoV-2 infection in severe versus mild individuals
[24] Virus infection, antiviral immunity, and autoimmunity
[25] Molecular and cellular insights into T cell exhaustion
[26] Transcriptional control of effector and memory CD8+ T cell differentiation
[27] The integration of T cell migration, differentiation and function
[28] VDJdb in 2019: database extension, new analysis infrastructure and a T-cell receptor motif compendium
[29] alphabeta T-cell receptor bias in disease and therapy (review)
[30] Selection of T cell clones expressing high-affinity public TCRs within human cytomegalovirus-specific CD8 T cell responses
[31] Strong TCR conservation and altered T cell cross-reactivity characterize a B*57-restricted immune response in HIV-1 infection
[32] Large TCR diversity of virus-specific CD8 T cells provides the mechanistic basis for massive TCR renewal after antigen exposure
[33] T-cell receptor bias and immunity
[34] Extensive conservation of alpha and beta chains of the human T-cell antigen receptor recognizing HLA-A2 and influenza A matrix peptide
[35] TCR repertoire as a novel indicator for immune monitoring and prognosis assessment of patients with cervical cancer
[36] CD8(+) T-cell pathogenicity in Rasmussen encephalitis elucidated by large-scale T-cell receptor sequencing
[37] VDJtools: unifying post-analysis of T cell receptor repertoires
[38] Dynamics of individual T cell repertoires: from cord blood to centenarians
We thank all the patients (and their families) and healthy controls for participating in the clinical trial and permitting us to use their specimens for this research. We thank Harbin sixth Hospital in China for providing samples. We also thank Jinliang Li and Dongming Zhao for their guidance in sample collection, cell extraction and single-cell library construction. This work was funded by the National Natural Science Foundation of China (Nos. 61822108, 62041102 and 6203000303), and the Emergency Research Project for COVID-19 of Harbin Institute of Technology (No. 2020-001).