key: cord-0980451-87gn2ohz
authors: Yu, Xueping; Ho, Kuoting; Shen, Zhongliang; Fu, Xiaoying; Huang, Hongbo; Wu, Delun; Lin, Yancheng; Lin, Yijian; Chen, Wenhuang; Su, Milong; Qiu, Chao; Zhuang, Xibin; Su, Zhijun
title: The Association of Human Leukocyte Antigen and COVID-19 in Southern China
date: 2021-08-06
journal: Open Forum Infect Dis
DOI: 10.1093/ofid/ofab410
sha: e412e4c1cc570e70aff5f09c3b507050b3647546
doc_id: 980451
cord_uid: 87gn2ohz

HLA polymorphism is hypothesized to be associated with diverse immune responses towards infectious diseases. Herein, by comparing against multiple subpopulation groups as control, we confirmed that HLA-B*15:27 and HLA-DRB1*04:06 were associated with COVID-19 susceptibility in China. Both alleles were predicted to have weak binding affinities towards viral proteins.

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has contributed to nearly 4 million deaths worldwide (1). Delayed or inadequate treatment contributes to a large portion of COVID-19 mortality. Improving risk stratification and clinical management is crucial to identify the potentially vulnerable population. Several clinical features and genetic factors have been suggested to escalate risks of SARS-CoV-2 infection, the occurrence of severe conditions, and mortality (2). Human leukocyte antigen (HLA), the most polymorphic genetic locus, plays a pivotal role in viral antigen recognition and presentation (3). After decades of study, several HLA alleles have been identified as responsible for individual differences in host response, such as viral clearance and disease prognosis (4-8). Thus far, a few HLA alleles have been reported to predispose to COVID-19 infection (9-12). However, these findings are difficult to replicate (13).
Beyond reasons such as small sample size and the population dependence of HLA, the leading cause is that these case-control studies are especially sensitive to population stratification. A previous study pointed out that, even within the same ethnic group, the choice of controls can drastically affect the perceived associations (9). On top of that, further complications are specific to COVID-19 studies. When investigating patients recruited during an early stage of the pandemic in southern China, it could be inappropriate to use the local population as control, since most cases were people who fled from Wuhan. To better address these situations, the use of multiple control groups has been suggested (14). Herein, we performed an empirical study in 399 unrelated patients to explore the effect of HLA on SARS-CoV-2 infection. We highlighted the influence of subpopulation by comparing cases against multiple controls. Our findings provide deeper insights into the pathogenetic mechanism of COVID-19 and lay valuable groundwork for future HLA research.

Patients with confirmed COVID-19 diagnosis in Quanzhou, Fujian (n=42) during early 2020 were enrolled. Their demographic information, clinical characteristics, and peripheral blood samples were collected with informed consent. The HLA alleles were identified via NGS-based typing, using NanoWES Human Exome V1.0 (Berrygenomics, Beijing, China), according to the manufacturer's instructions. Class I and class II HLA alleles were estimated with OptiType v1.3.3 (15) and ATHLATES v1.0 (16), respectively. HLA allele frequencies and clinical information for patients in the other two centers (Shenzhen n=332 and Zhejiang n=82) were retrieved from prior studies (9, 17). To make our findings more robust, we performed analyses against multiple sets of controls.
Firstly, we chose three controls representing the general population in China, we assessed the binding affinities from peptides derived from the receptor binding domain (RBD) region. We later labeled these bindings as strong binders (IC50 < 50 nM) or weak binders (IC50 < 500 nM), following the recommendations of previous studies (24, 25). The numbers of strong binders and weak binders were calculated per HLA allele.

Prior to analyses, several quality control procedures were applied. Closely related individuals were excluded through kinship estimation in PLINK v1.9 (26). Hardy-Weinberg equilibrium (HWE) was assessed in Arlequin v3.5 (27). Cochran's rule was applied to eliminate HLA alleles with a frequency less than five. The following statistical analyses were performed in R v4.0.2. HLA heterogeneity among the three centers was measured by pairwise comparisons using Fisher's exact test. Homogeneity among the controls within each group was examined using the chi-square test. In addition, a t-test was performed to compare allele frequencies between controls representing the general and the southern populations. Subsequently, to estimate the effect of HLA on disease susceptibility, we used Fisher's exact test to compare the HLA allele frequencies between COVID-19 patients and the selected controls. The corrected P-value (Pc) was then calculated with the Benjamini-Hochberg method to adjust for multiple comparisons.

Several alleles appeared to have significant heterogeneity among intra-group controls (full list presented in Supplementary Table 1 and Supplementary Table 2). Besides, ten alleles showed significant allele frequency differences between the general and the southern populations, as detailed in

We estimated the relationship between HLA and COVID-19 susceptibility by comparing cases against three controls representing the general population in China.
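The case-control comparison described above (Fisher's exact test per allele, followed by Benjamini-Hochberg correction) can be illustrated with a minimal Python sketch. This is not the authors' R code: the function names, the layout of the 2x2 table (allele carried versus not carried, in cases versus controls), and all counts are assumptions made for illustration only.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p = sum(q for q in probs if q <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical allele counts (NOT data from this study):
    # name -> (count in cases, total case alleles,
    #          count in controls, total control alleles)
    counts = {
        "allele_1": (12, 800, 40, 10000),
        "allele_2": (30, 800, 350, 10000),
        "allele_3": (5, 800, 60, 10000),
    }
    names = list(counts)
    raw = []
    for name in names:
        k, n, ck, cn = counts[name]
        raw.append(fisher_exact_two_sided(k, n - k, ck, cn - ck))
    for name, p, pc in zip(names, raw, benjamini_hochberg(raw)):
        print(f"{name}: P = {p:.4g}, Pc = {pc:.4g}")
```

In this sketch each individual is assumed to contribute two alleles, so the totals are chromosome counts rather than numbers of individuals. Repeating the comparison against each control set yields one Pc per allele per control.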
We considered an HLA allele to have strong evidence of association if the association was significant against two or more controls. As illustrated in Table 2

Our study confirmed that HLA-B*15:27 and HLA-DRB1*04:06, which according to AFND appear uniquely in Asia, were risk alleles for COVID-19 susceptibility. Although HLA-DRB1*04:06 was not significant after multiple comparison correction, both alleles had been mentioned as risk alleles in a prior study (17). Interestingly, while we suspect that the associations of HLA-A*11:01, HLA-A*11:02, and HLA-B*40:01 may be driven by population stratification, both HLA-A*11 and HLA-B*40 had previously been linked with COVID-19 infection (11, 12, 28). Although HLA-B*40 did show weak peptide binding capability, HLA-A*11 exhibited moderate binding affinities in bioinformatic prediction. Taken together, a comprehensive investigation of the impact of subpopulation is much needed.

Additionally, DRB1*04:06 had been identified as a risk factor for autoimmune diseases (29, 30) and drug-induced maculopapular exanthema (31). This finding is consistent with previous observations that autoimmune diseases correlate with COVID-19 severity and mortality (12, 32). One popular explanation is that the two conditions share etiologies such as extensive inflammation. However, recent findings also suggested that the locus between DRB1 and DQA1 is related to circulating IL-6 levels (33), which often accompany cytokine storms. Taken together, these findings indicate that individual vulnerability to infection may go beyond HLA affinity alone.

The major limitations of this study include the partial availability of haplotype data and the

# cell counts smaller than 5; * significant

Figure 1

WHO Coronavirus Disease (COVID-19) Dashboard. [Internet].
2020
Clinical Characteristics of 138 Hospitalized Patients With 2019 Novel Coronavirus-Infected Pneumonia in Wuhan
The roles of HLA-DQB1 gene polymorphisms in hepatitis B virus infection
Influence of MHC class II genotype on outcome of infection with hepatitis C virus. The HENCORE group. Hepatitis C European Network for Cooperative Research
The Associations of HLA-A*02:01 and DRB1*11:01 with Hepatitis C Virus Spontaneous Clearance Are Independent of IL28B in the Chinese Population
HLA Class II-DRB1 Alleles with Hepatitis C Virus Infection Outcome in Egypt: A Multicentre Family-based Study
Association of HLA class I with severe acute respiratory syndrome coronavirus infection
Initial whole-genome sequencing and analysis of the host genetic contribution to COVID-19 severity and susceptibility
HLA and AB0 Polymorphisms May Influence SARS-CoV-2 Infection and COVID-19 Severity
HLA genetic polymorphisms and prognosis of patients with COVID-19
Human Leukocyte Antigen Complex and Other Immunogenetic and Clinical Factors Influence Susceptibility or Protection to SARS-CoV-2 Infection and Severity of the Disease Course
A role for human leucocyte antigens in the susceptibility to SARS-Cov-2 infection observed in transplant patients
OptiType: precision HLA typing from next-generation sequencing data
ATHLATES: accurate typing of human leukocyte antigen through exome sequencing
Distribution of HLA allele frequencies in 82 Chinese individuals with coronavirus disease-2019 (COVID-19)
High-Resolution Analyses of Human Leukocyte Antigens Allele and Haplotype Frequencies Based on 169,995 Volunteers from the China Bone Marrow Donor Registry Program
HLA common and well-documented alleles in China. HLA
Deep sequencing of the MHC region in the Chinese population contributes to studies of complex disease
Analysis on polymorphism of high-resolution HLA-A, -B, -C, -DRB1 and -DQB1 in hematopoietic stem cells donors of Chinese Han population from Southern China
Human Leukocyte Antigen Susceptibility Map for Severe Acute Respiratory Syndrome Coronavirus 2
NetMHCpan-4.1 and NetMHCIIpan-4.0: improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data
Possible role of HLA class-I genotype in SARS-CoV-2 infection and progression: A pilot study in a cohort of Covid-19 Spanish patients
PLINK: a tool set for whole-genome association and population-based linkage analyses
Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows
Retrospective in silico HLA predictions from COVID-19 patients reveal alleles associated with disease prognosis. medRxiv
Investigation of the predisposing factor of pemphigus and its clinical subtype through a genome-wide association and next generation sequence analysis
HLA-II genes are associated with outcomes of specific immunotherapy for allergic rhinitis
Factors associated with COVID-19-related death using OpenSAFELY
Genome-Wide Association Study of Circulating Interleukin 6 Levels Identifies Novel Loci

We would like to acknowledge all the medical staff in the Department of Infectious Diseases and COVID-19 special ward, First Hospital of Quanzhou, affiliated with Fujian Medical University, for their work of clinical specimen collection during these trying times.