key: cord-0897226-78t533p9
authors: Jin, Xiyun; Zhou, Wenyang; Luo, Meng; Wang, Pingping; Xu, Zhaochun; Ma, Kexin; Cao, Huimin; Xu, Chang; Huang, Yan; Cheng, Rui; Xiao, Lixing; Lin, Xiaoyu; Pang, Fenglan; Li, Yiqun; Nie, Huan; Jiang, Qinghua
title: Global characterization of B cell receptor repertoire in COVID-19 patients by single-cell V(D)J sequencing
date: 2021-05-20
journal: Brief Bioinform
DOI: 10.1093/bib/bbab192
sha: 453a30d9e0327f28e64870538aa4ff4e359ab6cb
doc_id: 897226
cord_uid: 78t533p9

The world is facing a pandemic of Corona Virus Disease 2019 (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Adaptive immune responses are essential for SARS-CoV-2 virus clearance. Although many studies have investigated the immune mechanism in COVID-19 patients, we still lack a comprehensive understanding of the B cell receptor (BCR) repertoire in patients. In this study, we used single-cell V(D)J sequencing to characterize the BCR repertoire across convalescent COVID-19 patients. We observed that BCR diversity was significantly reduced in patients compared with healthy controls. In addition, BCRs tended to skew toward different V gene segments in COVID-19 patients and healthy controls. The CDR3 sequences of the heavy chain in clonal BCRs in patients were more convergent than those in healthy controls. We also discovered increased IgG and IgA isotypes in the disease, including IgG1, IgG3 and IgA1. Among all clonal BCRs, IgG isotypes had the most frequent class switch recombination events and the highest somatic hypermutation rate, especially IgG3. Moreover, we found that an IgG3 cluster from different clonal groups had the same IGHV, IGHJ and CDR3 sequences (IGHV4-4-CARLANTNQFYDSSSYLNAMDVW-IGHJ6). Overall, our study provides a comprehensive characterization of the BCR repertoire in COVID-19 patients, which contributes to the understanding of the mechanism of the immune response to SARS-CoV-2 infection.
Corona Virus Disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has posed a serious threat to global health [1, 2]. Over the past year, SARS-CoV-2 has infected more than 63 000 000 people and caused more than 1 400 000 deaths worldwide [3]. Numerous studies have contributed to the diagnosis, characterization and treatment of COVID-19. Patients' responses to SARS-CoV-2 range from asymptomatic infection to disease requiring intensive care. Similar to severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV), SARS-CoV-2 uses angiotensin-converting enzyme 2 (ACE2) on the cell surface to enter cells and can be detected in multiple organs, including the lungs, pharynx, brain, liver, kidneys and heart [4] [5] [6]. Studies have shown that SARS-CoV-2 suppresses the innate immune response and reduces the levels of type I and III interferons [7, 8]. It was also found that the number of lymphocytes was decreased and the levels of serum inflammatory cytokines were increased in the peripheral blood of patients [9, 10]. Adaptive immune responses play a central role in clearing SARS-CoV-2 infection and directly influence patients' clinical outcomes. After entering cells through ACE2, SARS-CoV-2 is mainly sensed by toll-like receptor 7 (TLR7), which resides in the endosome [11]. Activation of TLR7 results in the production of alpha interferon and TNF-alpha and the secretion of IL-12 and IL-6. This leads to the formation of CD8+ cytotoxic T cells and, through CD4+ helper T cells, to the formation of antigen-specific B cells and antibody production [12]. The diversity of B cell receptors (BCRs) allows the recognition of, and defense against, multiple invading pathogens, and antibodies produced by B cells can provide long-term protection [13].
The initial diversity of the BCR repertoire is the result of a somatic recombination process called V(D)J recombination [14]. After encountering antigen, BCRs undergo a process of affinity maturation, in which rapid somatic hypermutation (SHM) and class switch recombination (CSR) lead to improved antigen binding [15, 16]. Multiple studies have isolated SARS-CoV-2-specific neutralizing antibodies from COVID-19 patients [17] [18] [19] [20] [21] [22]. However, further exploration of the BCR repertoire in COVID-19 patients is urgently needed. Recent studies suggest that early-stage recovery patients still maintain various immune responses in the circulation [23]. Here, we conducted a comprehensive analysis of the peripheral blood BCR repertoire in 12 early-recovery COVID-19 patients. BCR diversity, gene segment usage, CDR3 length distribution, and the SHM and CSR of isotypes were compared between COVID-19 patients and healthy controls (Figure 1). Our study provides detailed insights into the BCR repertoire in COVID-19, contributing to a better understanding of the humoral immune response after SARS-CoV-2 infection.

Twelve COVID-19 patients were enrolled in this study at the Harbin Sixth Hospital. Fresh blood samples were collected from patients at the time of hospital discharge (Figure 1, Table S1, see Supplementary Data available online at http://bib.oxfordjournals.org/). Discharged patients had to meet the following four criteria: (i) afebrile for more than 3 days; (ii) improved respiratory symptoms; (iii) pulmonary imaging showing obvious absorption of inflammation; (iv) two consecutive negative nucleic acid tests for the respiratory tract pathogen (sampling interval ≥ 24 h). Six healthy donors, whose blood samples were collected before the COVID-19 outbreak, were enrolled as the control group (Table S1, see Supplementary Data available online at http://bib.oxfordjournals.org/).
Single-cell V(D)J sequencing was performed following the protocol provided by the 10X Genomics Chromium Single Cell Immune Profiling Solution. The analysis pipelines in Cell Ranger (10X Genomics, version 3.1.0) were used for single-cell sequencing data processing. V(D)J sequence assembly and paired clonotype calling were performed using cellranger vdj with --reference=refdata-cellranger-vdj-GRCh38-alts-ensembl-3.1.0 for each sample.

The clonal diversity of each sample was calculated using Shannon's entropy [24, 25]. Entropy was calculated as follows:

$$H = -\sum_{i=1}^{N} P(i)\,\log P(i)$$

where N is the number of all clonotypes in each sample and P(i) represents the frequency of the ith clonotype. The higher the Shannon's entropy, the more diverse the distribution of CDR3 clones.

We conducted the BCR group analysis for each patient separately using SCOPer, which can identify B cell relationships from adaptive immune receptor repertoire sequencing data [26]. Then, we combined the SCOPer results from all patients and obtained 9268 BCR groups. Groups containing at least three BCRs were defined as clonally expanded.

We first grouped BCRs according to CDR3 length and then used the R package 'ggseqlogo' to identify the motif of CDR3s with the same length [27].

SHM analysis was conducted with SHazaM, which is part of the Immcantation analysis framework [28]. The class switch (CS) analysis took all BCR sequences as input. For each sample, we calculated the number of groups shared between each pair of isotypes. Then, we normalized the number of CSR events by dividing by the total number of groups of the two isotypes in the pair. Considering the real order of CS events in the immunological process, we adjusted the CS direction of each pair of isotypes to the correct order, resulting in 13 CS events in the end (IGHD/M -> IGHG1/IGHA1/IGHG2, IGHG3 -> IGHG1/IGHA1/IGHG2/IGHG4, IGHG1 -> IGHA1/IGHG2/IGHG4, IGHA1 -> IGHG4/IGHA2, IGHG2 -> IGHA2).
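The diversity and clonal expansion definitions in this section can be illustrated with a minimal Python sketch. It is not the authors' pipeline (which used Cell Ranger, SCOPer and R). The function names, the toy identifiers and the choice of a base-2 logarithm are assumptions made for illustration, since the text does not state the logarithm base.

```python
from collections import Counter
from math import log2


def shannon_entropy(clonotype_ids):
    """Shannon entropy of a sample, given one clonotype ID per cell.

    P(i) is the fraction of cells carrying clonotype i. The base-2
    logarithm is an assumption; the source does not specify the base.
    """
    counts = Counter(clonotype_ids)
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())


def expanded_groups(group_ids, min_size=3):
    """Return the BCR groups with at least `min_size` members.

    The threshold of three BCRs follows the definition in the text.
    """
    counts = Counter(group_ids)
    return {group for group, n in counts.items() if n >= min_size}


# Hypothetical toy input: one clonotype or group label per cell.
toy_cells = ["g1", "g1", "g1", "g2", "g2", "g3"]
diversity = shannon_entropy(toy_cells)
clonal = expanded_groups(toy_cells)
```

A sample in which every cell carries a distinct clonotype gives the maximum entropy for its size, while a sample dominated by a few expanded clones gives a lower value, which is the pattern the comparison between patients and controls relies on.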
Fisher's exact test was used to assess differences in the usage of V and J gene segments and in the proportion of Ig isotypes between COVID-19 patients and healthy controls. Wilcoxon's signed-rank test was used to analyze differences in clonal diversity, CDR3 length and SHM rate among different groups. The significance threshold was P < 0.05. All statistical analyses were implemented with R software (Version 3.5.1).

(Table S1, see Supplementary Data available online at http://bib.oxfordjournals.org/, Method). The BCR comprises a variable region and a constant region, and its immensely diverse repertoire is attributed to V(D)J recombination, which assembles V, D and J gene segments into different combinations [29, 30]. We first investigated gene segment usage in the whole BCR repertoires of COVID-19 patients and healthy controls. For V gene segments in the heavy chain, the IGHV3, IGHV4, IGHV1 and IGHV2 gene families were frequently used in both COVID-19 patients and healthy controls (Figure 2A), especially the IGHV3 and IGHV4 families, which together accounted for more than 75% of all BCRs. However, the frequencies of gene segments in each IGHV family were significantly different between COVID-19 patients and healthy controls (Figure 2B-E). Specifically, IGHV3-30, IGHV3-23, IGHV3-15 and IGHV3-33 of the IGHV3 family, IGHV4-4, IGHV4-31 and IGHV4-30-2 of the IGHV4 family, IGHV1-3 of the IGHV1 family and IGHV2-70 of the IGHV2 family were significantly increased in COVID-19 (Fisher's exact test, P-value < 0.05, odds ratio > 1). Increased use of IGHV3-30 and IGHV3-15 has been observed in human antibodies against influenza virus, cytomegalovirus (CMV) and Ebola virus [31] [32] [33]. For V gene segments in the light chain, IGKV1 and IGKV3 were the most frequently used families, followed by IGLV1, IGLV2, IGLV3, IGKV2 and IGKV4 (Figure S1A, see Supplementary Data available online at http://bib.oxfordjournals.org/).
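The gene segment comparisons reported here rest on Fisher's exact test applied to 2 × 2 count tables (BCRs using a given segment versus all other BCRs, in patients versus controls). The following is a minimal pure-Python sketch of a two-sided test with the sample odds ratio. It is not the authors' R code, the function name is hypothetical, and the usage counts at the bottom are invented for illustration.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Returns (sample odds ratio, p-value). The p-value sums the
    hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given the margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_value = sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7)
    )
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(p_value, 1.0)


# Hypothetical counts: BCRs using one V gene segment vs. all others,
# in patients (first row) and healthy controls (second row).
odds, p = fisher_exact_2x2(120, 880, 60, 940)
increased_in_patients = p < 0.05 and odds > 1
```

An odds ratio above 1 together with a P-value below the 0.05 threshold corresponds to the "significantly increased in COVID-19" criterion used in the text.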
Similarly, the frequencies of gene segments in each IGKV or IGLV family were significantly different between COVID-19 patients and healthy controls (Figure S1B-I, see Supplementary Data available online at http://bib.oxfordjournals.org/). Analysis of IGHJ gene segments showed that the strongly preferred segment was IGHJ4, which accounted for more than 40% of all BCRs in both patients and healthy controls (Figure 2F). The proportions of IGHJ3, IGHJ4 and IGHJ5 were significantly different between the two types of samples (Figure 2G). There was no bias

Analysis of BCR clonotypes showed that the clonal diversity of COVID-19 patients was significantly lower than that of healthy controls (Wilcoxon's signed-rank test, P-value = 2.45 × 10^−2, Figure 3A and Table S2, see Supplementary Data available online at http://bib.oxfordjournals.org/). Then, we clustered BCRs to identify expanded B cell clonotypes in each patient separately. We assembled a total of 9268 BCR groups from all 12 COVID-19 patients, 310 of which contained at least three BCRs and were defined as clonal B cells. The proportion of clonal cells was much higher in COVID-19 patients than in healthy controls: on average, ∼25% of cells were clonal in COVID-19 patients, compared with ∼7% in healthy controls (Wilcoxon's signed-rank test, P-value = 4.74 × 10^−3, Figure 3B). No BCRs were shared between COVID-19 patients and healthy controls. We further examined whether clonal BCRs were shared across patients and found that only a few identical BCRs were shared, and only between two patients, which suggests a potentially wide range of BCRs with antiviral function. Then, we examined the usage of V and J gene segments in clonal BCRs.
The most frequently used V and J gene segments were those mentioned above that were significantly higher in patients than in healthy controls, such as IGHV4-39, IGHV3-23, IGHV1-3, IGHV2-70 and IGHJ3 of the heavy chain, and IGKV1D-13, IGLV1-51, IGLJ3 and IGKJ2 of the light chain (Figures 3C and S4A).

The complementarity-determining region 3 (CDR3), a highly variable region of the BCR, plays the most important role in specific antigen recognition by B cells [30, 34]. We next explored the characteristics of the CDR3 sequences of clonally expanded BCRs in COVID-19. In the heavy chain, CDR3 length was concentrated at three lengths, 15, 18 and 22 amino acids, among which 18 amino acids accounted for the largest proportion, and the CDR3 was significantly longer than that in non-clonal BCRs and healthy controls (Wilcoxon's signed-rank test, P-value < 2.22 × 10^−16, Figure 3D and E). Further motif enrichment analysis showed that the amino acid sequences of CDR3s in the heavy chain were more convergent than those of healthy controls, especially the CDR3s with 15, 18 and 23 amino acids, suggesting high specificity of BCRs after the antiviral immune response (Figures 3F and S3, see Supplementary Data available online at http://bib.oxfordjournals.org/). Interestingly, CDR3s of different lengths corresponded to different V gene segments (Figure 3F). Although the average length of CDR3s in the light chain of clonally expanded BCRs was significantly increased, the proportion of BCRs with each CDR3 length was highly consistent between patients and healthy controls. The CDR3 length of all light chains was mainly 11, 12 and 13 amino acids, and the amino acid sequence was more conserved (Figure S4B-D, see Supplementary Data available online at http://bib.oxfordjournals.org/). Altogether, our results suggested that there was significant clonal expansion of peripheral blood B cells and a change in the preferential usage of the BCR repertoire in COVID-19 patients, especially in the heavy chain of the BCR.
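The CDR3 analysis described above first groups sequences by length and then looks for a motif within each length class. The sketch below shows the two steps in Python under stated assumptions: the function names and the toy sequences are hypothetical, and the per-position frequency table is only the quantity that a sequence logo visualizes, not the output of the ggseqlogo package used in the study.

```python
from collections import Counter, defaultdict


def group_by_length(cdr3_seqs):
    """Map each CDR3 length to the list of sequences of that length."""
    groups = defaultdict(list)
    for seq in cdr3_seqs:
        groups[len(seq)].append(seq)
    return dict(groups)


def position_frequencies(seqs):
    """Per-position amino acid frequencies for equal-length sequences."""
    if not seqs:
        return []
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("all sequences must have the same length")
    n = len(seqs)
    # zip(*seqs) yields one column of residues per CDR3 position.
    return [
        {aa: count / n for aa, count in Counter(column).items()}
        for column in zip(*seqs)
    ]


# Hypothetical toy sequences, not taken from the study data.
toy_cdr3s = ["CARDYW", "CARDFW", "CAKDYW", "CARGGYW"]
by_length = group_by_length(toy_cdr3s)
profile = position_frequencies(by_length[6])
```

Positions where one residue has a frequency close to 1 across a length class are the conserved positions of a motif, so a more convergent repertoire shows more such positions.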
The constant region of the BCR heavy chain determines the type of immunoglobulin (Ig) and the effector function it can induce. There are five types of immunoglobulin in humans: IgM, IgD, IgG, IgA and IgE. Different Ig types usually differ in antibody affinity [35, 36]. We first compared the frequency of different Ig isotypes between COVID-19 patients and healthy controls. IgM accounted for the largest proportion of all BCR repertoires (>50%), but its proportion was significantly decreased in COVID-19 patients, which is consistent with the emergence of other, higher-affinity isotypes during the response to infection (Fisher's exact test, P-value < 0.05, odds ratio > 1, Figures 4A and S5A and B, see Supplementary Data available online at http://bib.oxfordjournals.org/). The proportions of IgG1, IgG3 and IgA1 were significantly increased in COVID-19 patients, and the ratios of IgG to IgM/D and IgA to IgM/D in COVID-19 patients were much higher than those in healthy controls (Figure 4B). These changes indicated a strong humoral immune response to clear viral particles in the blood and respiratory mucosa [37, 38]. For clonal BCRs of COVID-19 patients, the proportion of IgM was significantly decreased (Figures 4C and S5A, see Supplementary Data available online at http://bib.oxfordjournals.org/), and IgG1 and IgA1 were the major Ig classes, accounting for 40.15 and 18.81% of all clonal BCRs, respectively (Figure 4C). IgG-expressing B cells have a higher capacity for plasma cell differentiation, thus increasing antibody secretion in response to viral infection [39, 40]. Statistics of the gene segments in each isotype showed that IgM and IgD utilized multiple V gene segments, while IgG and IgA were skewed toward different V gene segments (Figures 4D and E and S5C-H, see Supplementary Data available online at http://bib.oxfordjournals.org/).
Specifically, IgG preferentially used IGHV4: IgG1 mainly used IGHV4-39, IgG2 mainly used IGHV4-30-2, and IgG3 and IgG4 mainly used IGHV4-4. IgA1 and IgA2 preferentially used IGHV2-70 and IGHV3-33, respectively, indicating respective potential local induction in serum and mucosa (Figures 4D and E and S5E-H, see Supplementary Data available online at http://bib.oxfordjournals.org/). Further analysis revealed that the length of CDR3 also differed between isotypes. The distribution of CDR3 length in IgM and IgD was consistent with that of healthy controls and non-clonal BCRs of COVID-19 patients (Figures 3D and 4F). However, the CDR3s of the IgG and IgA isotypes tended to be concentrated at a particular length and to be longer than those of IgM and IgD, except for IgG2 (Figure 4F). Studies have revealed the enrichment and dominance of certain CDR3 lengths of B cells in antigen-specific populations [41, 42]. Another analysis of HIV-1+ samples showed that the CDR3 lengths of IgG and IgA were longer than those in healthy donors [43]. Therefore, the enrichment of longer CDR3s with certain amino acid lengths in clonal IgG and IgA BCRs may play an important role in virus clearance.

After B cells respond to antigen, many BCRs undergo CSR to progressively refine the antibody. This process entails DNA deletion and recombination of IgM to generate 'downstream' isotypes, together with mutation of nucleotides in the antigen-binding region (SHM), which is followed by selection and results in affinity maturation. We explored SHM and CSR in COVID-19 patients compared with healthy controls. The number of class-switch types detectable in COVID-19 patients was increased, and most CSR events took place only in clonally expanded BCRs (Figure 5A-D). Compared with other isotypes, IgG isotypes frequently underwent class switch, especially IgG3.
Further analysis found that the SHM rate of clonal BCRs was significantly higher than that of non-clonal BCRs or healthy controls (Wilcoxon's signed-rank test, Figure 5E). Among all isotypes of clonal BCRs in COVID-19 patients, the SHM rate of IgG was higher than that of other isotypes, and IgG3 showed the highest SHM rate among the IgG isotypes (Wilcoxon's signed-rank test, Figures 5F and S6, see Supplementary Data available online at http://bib.oxfordjournals.org/). High titers of IgG antibodies have been detected in the blood samples of discharged patients [44, 45]. Next, we focused on the IgG3 isotype because of its frequent CSR events and high SHM rate. There were 12 IgG3-associated clonal BCR groups, 7 of which had CSR events. We found that most IgG3 BCRs shared the same V gene segment, J gene segment and CDR3 sequence (IGHV4-4-CARLANTNQFYDSSSYLNAMDVW-IGHJ6), and this CDR3 was part of the motif described above (Table 1, Figure 3F). Besides, none of the CDR3 sequences of clonal IgG3 BCRs was found in healthy controls. Our results suggested that the IgG3 isotype may be a key factor in virus clearance.

Since December 2019, COVID-19 has led to a worldwide pandemic [46]. In addition to severe respiratory pathology, COVID-19 can also cause thrombotic complications, myocardial dysfunction and arrhythmia, acute coronary syndromes, acute kidney injury, gastrointestinal symptoms and even neurologic illnesses [47]. Given the serious threat COVID-19 poses to global health, a comprehensive understanding of immune responses in patients is essential for both pathological study and therapy [48]. The humoral immune responses play a central role in clearing SARS-CoV-2 infection and directly influence patients' clinical outcomes. SARS-CoV-2 elicits a robust B cell response, and virus-specific antibodies can be detected in the days following infection [46]. Analysis of the BCR repertoire is critical for understanding the immune response of COVID-19 patients.
In this study, we delineated the BCR repertoire in the peripheral blood of convalescent COVID-19 patients. Compared with healthy controls, our analysis revealed more frequent changes in the heavy chain of BCRs in patients. The BCRs of patients converged on different IGHV3 and IGHV4 rearrangements, and the heavy-chain CDR3 (CDR3H) was significantly longer than that of healthy controls and tended to converge on a few different lengths, while there was no significant difference in the light chain. These results imply that the BCR heavy chain plays a major role in clearing virus infection. Subsequent analysis of the isotypes found a remarkably increased proportion of IgG and IgA in COVID-19 patients. IgA plays an important role in mucosal immunity [49]. The increase in IgA suggests that it could be synthesized and migrate widely to the respiratory tract, gastrointestinal tract or other mucosal sites to perform immune functions [50]. IgG can provide long-term protection, and high titers of virus-specific IgG antibodies have been detected in the plasma of convalescent patients [51]. Besides, among all IgG isotypes, IgG3 had the most frequent SHM and CSR events, and we found an IgG3 cluster from different clonal groups that had the same IGHV, IGHJ and CDR3 sequence. These results imply that IgG3 may play an important role in fighting SARS-CoV-2 infection.

Our study systematically characterized the BCR repertoire of COVID-19 patients. However, the data did not cover the state of the BCR repertoire before and during the onset of the disease, and the samples were obtained from 5 ml of peripheral blood, so some clonotypes may not have been captured. In addition, the comparison between COVID-19 and normal infected samples was not performed because of the lack of data. Therefore, large population cohort studies and normal infected samples are urgently needed to obtain a more detailed view of the specific B cell immune response in COVID-19 patients.
Taken together, our study systematically characterizes the BCR repertoire of COVID-19 patients, which provides substantial insight into the immune response induced by SARS-CoV-2.

The differential analysis of SHM rate among different Ig isotypes in clonal BCRs. Statistical significance was evaluated using the Wilcoxon rank-sum test.

Data used in this study are available from the corresponding author by request.

• We performed scBCR-seq for 12 COVID-19 patients and six healthy controls.
• The clonal diversity of BCR is significantly reduced in COVID-19 patients.
• BCRs skew toward different V gene segments in COVID-19 and healthy controls.
• IgG3 had the most frequent class switch recombination events and the highest somatic hypermutation rate in COVID-19.

1. A novel coronavirus from patients with pneumonia in China
2. Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia
3. An interactive web-based dashboard to track COVID-19 in real time
4. Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein
5. SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor
6. Immunological considerations for COVID-19 vaccine strategies
7. Imbalanced host response to SARS-CoV-2 drives development of COVID-19
8. Severe immunosuppression and not a cytokine storm characterizes COVID-19 infections
9. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China
10. Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in Wuhan, China
11. Activation of TLR7 and innate immunity as an efficient method against COVID-19 pandemic: imiquimod as a potential therapy
12. Why the immune system fails to mount an adaptive immune response to a COVID-19 infection
13. Human adaptive immune receptor repertoire analysis-past, present, and future
14. High frequency of shared clonotypes in human B cell receptor repertoires
15. The diversity and molecular evolution of B-cell receptors during infection
16. Combined influence of B-cell receptor rearrangement and somatic hypermutation on B-cell class-switch fate in health and in chronic lymphocytic leukemia
17. Potent neutralizing antibodies from COVID-19 patients define multiple targets of vulnerability
18. Convergent antibody responses to SARS-CoV-2 in convalescent individuals
19. Potent neutralizing antibodies against SARS-CoV-2 identified by high-throughput single-cell sequencing of convalescent patients' B cells
20. Human neutralizing antibodies elicited by SARS-CoV-2 infection
21. A noncompeting pair of human neutralizing antibodies block COVID-19 virus binding to its receptor ACE2
22. Single cell RNA and immune repertoire profiling of COVID-19 patients reveal novel neutralizing antibody
23. Breadth of concomitant immune responses prior to patient recovery: a case report of non-severe COVID-19
24. TCR repertoire as a novel indicator for immune monitoring and prognosis assessment of patients with cervical cancer
25. CD8(+) T-cell pathogenicity in Rasmussen encephalitis elucidated by large-scale T-cell receptor sequencing
26. A spectral clustering-based method for identifying clones from high-throughput B cell repertoire sequencing data
27. Ggseqlogo: a versatile R package for drawing sequence logos
28. Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data
29. Mechanisms of programmed DNA lesions and genomic instability in the immune system
30. BCR selection and affinity maturation in Peyer's patch germinal centres
31. A broadly neutralizing anti-influenza antibody reveals ongoing capacity of haemagglutinin-specific memory B cells to evolve
32. Polyclonal and convergent antibody response to Ebola virus vaccine rVSV-ZEBOV
33. Structures of preferred human IgV genes-based protective antibodies identify how conserved residues contact diverse antigens and assign source of specificity to CDR3 loop variation
34. Diversity in the CDR3 region of V(H) is sufficient for most antibody specificities
35. More than one antibody of individual B cells revealed by single-cell immune profiling
36. Regulation of IgM and IgD expression in human B-lineage cells
37. IgA function-variations on a theme
38. Virus-neutralizing antibodies of immunoglobulin G (IgG) but not of IgM or IgA isotypes can cure influenza virus pneumonia in SCID mice
39. The role of BCR isotype in B-cell development and activation
40. The generation of antibody-secreting plasma cells
41. Statistical analysis of CDR3 length distributions for the assessment of T and B cell repertoire biases
42. Novel E2 glycoprotein tetramer detects hepatitis C virus-specific memory B cells
43. 5' Rapid amplification of cDNA ends and Illumina MiSeq reveals B cell receptor features in healthy adults, adults with chronic HIV-1 infection, cord blood, and humanized mice
44. Detection of SARS-CoV-2-specific humoral and cellular immunity in COVID-19 convalescent individuals
45. Targets of T cell responses to SARS-CoV-2 coronavirus in humans with COVID-19 disease and unexposed individuals
46. Immunology of COVID-19: current state of the science
47. Extrapulmonary manifestations of COVID-19
48. Multi-omics resolves a sharp disease-state shift between mild and moderate COVID-19
49. Rethinking mucosal antibody responses: IgM, IgG and IgD join IgA
50. The immune geography of IgA induction and function
51. Treatment of 5 critically ill patients with COVID-19 with convalescent plasma

National Natural Science Foundation of China (Nos. 61822108 and 62032007); Emergency Research Project for COVID-19 of Harbin Institute of Technology (No. 2020-001).

We declare that there is no conflict of interest regarding the publication of this article.

Supplementary data are available online at https://academic.oup.com/bib.

Q.J. conceived the project. Q.J., X.J., W.Z., M.L., P.W., Z.X., K.M., H.C., C.X., Y.H., R.C., L.X., X.L., F.P., Y.L. and H.N. contributed to data analysis. Q.J. and X.J. wrote the manuscript.