key: cord-1004886-r76bc73b
authors: Zhu, H.; Zheng, F.; Li, L.; Jin, Y.; Luo, Y.; Li, Z.; Zeng, J.; Tang, L.; Xia, N.; Liu, P.; Han, D.; Shan, Y.; Zhu, X.; Liu, S.; Xie, R.; Chen, Y.; Liu, W.; Liu, L.; Xu, X.; Wang, J.; Yang, H.; Shen, X.; Jin, X.; Cheng, F.
title: A Chinese host genetic study discovered type I interferons and causality of cholesterol levels and WBC counts on COVID-19 severity
date: 2021-06-09
journal: nan
DOI: 10.1101/2021.06.04.21258335
sha: ecb7d0202516c90712c2d7ad8f35333438fa2075
doc_id: 1004886
cord_uid: r76bc73b

As of early May 2021, the ongoing COVID-19 pandemic has caused over 160 million infections and over 3 million deaths worldwide. Many risk factors, such as age, gender, and comorbidities, have been studied to explain the variable symptoms of infected patients. However, these effects may not fully account for the diversity in disease severity. Here, we present a comprehensive analysis of a broad range of patients' laboratory and clinical assessments to investigate the genetic contributions to COVID-19 severity. By performing GWAS analysis, we discovered several concrete associations for laboratory features. Based on these findings, we performed Mendelian randomization (MR) analysis to investigate the causality of laboratory traits on disease severity. From the MR study, we identified two causal traits, cholesterol levels and WBC counts. The functional gene related to cholesterol levels is ApoE, and people with a particular ApoE genotype are more likely to have higher cholesterol levels, which facilitates the binding of SARS-CoV-2 to its receptor ACE2 and aggravates COVID-19. The functional genes related to WBC counts belong to the MHC system, which plays a central role in the immune system. The host immune response to SARS-CoV-2 infection greatly affects the patient's severity status and clinical outcome.
Additionally, our gene-based and GSEA analyses revealed interferon pathways, including type I interferon receptor binding, regulation of IFNA signaling, and SARS coronavirus and innate immunity. We hope that our work will contribute to the study of the genetic mechanisms of the disease and serve as a useful reference for the clinical diagnosis and treatment of COVID-19.

We then tested whether the presence or [...] selected for further analysis, and 99.6% of these variants had an imputation score over 0.8. The study workflow is shown in Figure 2. When we applied multiple-testing correction to [...] the Asian, and Chinese populations.

ApoE encodes an apolipoprotein that participates in lipid metabolism, and particular ApoE genotypes confer a higher risk of elevated LDL-C levels. The rs9268517-WBC association is a novel genetic association identified by our GWAS analysis. However, its closest gene, BTNL2, was previously found to be associated with WBC in a GWAS of 408,112 European individuals [30]. BTNL2 encodes an MHC class II-associated type I transmembrane protein, and its binding to its receptor can inhibit T cell activation and cytokine production.

[...] data, laboratory measurements, and clinical severity to perform one-sample MR analysis, we chose not to do so because of the small sample size, the low power, and the weaker control of confounders in that design. Instead, we used two-sample MR analysis to examine the causal relationships between the laboratory measurements with concrete genome-wide associations and COVID-19 susceptibility and severity, as captured by the various phenotypes in the Host Genetics Initiative (HGI) database. Typically, a two-sample MR study requires two independent studies from the same population to ensure consistent SNP sites. We [...] (Table S1).
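When only one instrument SNP is available for an exposure, a two-sample MR estimate is commonly computed as a Wald ratio: the SNP-outcome effect divided by the SNP-exposure effect, with each effect taken from a separate GWAS. The following is a minimal Python sketch of that calculation, not the authors' pipeline. It assumes a first-order delta-method standard error and a normal approximation for the p-value; the function name `wald_ratio` and the input numbers are hypothetical and are not taken from the study.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-SNP Wald ratio estimate of the exposure -> outcome effect.

    beta_exposure: SNP effect on the exposure (e.g. a laboratory trait GWAS)
    beta_outcome:  SNP effect on the outcome (e.g. a COVID-19 phenotype GWAS)
    se_outcome:    standard error of beta_outcome

    Assumption: first-order delta-method standard error, which ignores
    the uncertainty in beta_exposure.
    """
    if beta_exposure == 0:
        raise ValueError("the instrument must have a non-zero effect on the exposure")
    estimate = beta_outcome / beta_exposure
    std_error = se_outcome / abs(beta_exposure)
    z = estimate / std_error
    # Two-sided p-value under a normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return estimate, std_error, p_value


# Hypothetical summary statistics, for illustration only.
estimate, std_error, p_value = wald_ratio(
    beta_exposure=0.5, beta_outcome=0.1, se_outcome=0.04
)
print(f"estimate={estimate:.3f}, se={std_error:.3f}, p={p_value:.4f}")
```

Because this standard error ignores the uncertainty in the SNP-exposure estimate, it tends to be somewhat optimistic when the instrument is weak.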
In brief, we identified three leniently significant causal associations (p- [...]

medRxiv preprint doi: https://doi.org/10.1101/2021.06.04.21258335; this version posted June 9, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

[...] (BTNL2), respectively. After SNP clumping and pruning, only one SNP remained for each trait in the MR analysis, so the Wald ratio method [31] was used to estimate the causal effects. Typically, the minimum number of independent SNPs is three [26] [...] Table S2 and showed four significant associations corresponding to four HGI phenotypes. The q-values (FDR) of these associations are also below 0.1, with two below 0.05.

[...] findings. Specifically, people with the ApoE ε4/ε4 genotype have an increased risk of high cholesterol levels. When they are exposed to SARS-CoV-2, the accumulation of cholesterol in alveolar epithelial cells increases the density of lipid rafts, where the virus binds to its target receptor ACE2. A higher density of lipid rafts therefore facilitates binding at the cell membrane and ultimately raises susceptibility to SARS-CoV-2 infection and the severity of COVID-19 [14, 40, 41]. The genetic mechanism is illustrated in Figure 4D. The second point of view is that, as the disease worsens, lipid levels including apoA and LDL-C decrease substantially [42].
Our results in the next section, "Time-series analysis of laboratory features", support this association, showing that a very low blood cholesterol level is a warning sign for severe symptoms in COVID-19 cases.

We further tested the MHC system genes. The SNP (rs114398276) mapped to MHC genes in the BBJ database was removed during harmonization with the HGI results because of a large difference in allele frequency. As an alternative, we downloaded another set of summary results based on East Asian [...] provided in Supplementary Table S6 and showed four significant associations corresponding to four HGI phenotypes. The q-values (FDR) of these four associations are also below 0.05, suggesting a candidate pathway in which the MHC family affects COVID-19 by controlling WBC counts. Second, we performed the MR analysis based on all 81 associated SNPs. After [...] illness is consistently positive in both our dataset and the tested Asian database, suggesting that WBC count is likely a risk predictor of disease status.

We searched PubMed for the gene-trait pair "major histocompatibility complex" + COVID-19 [...] severity of infection symptoms. The genetic mechanism is illustrated in Figure 5D.

[...] EIGENSTRAT software [53, 54]. We set a genome-wide significance threshold of 5E-08 and a study-wide significance threshold of 6.41E-10 (= 5E-08/78) by applying [...]

Two-sample Mendelian randomization.
Several significant associations were identified from the GWAS analysis of laboratory measurements. Given the potential genetic correlation between these features and COVID-19 susceptibility and severity, we performed two-sample Mendelian randomization analyses to examine causal effects between them and uncover genetic variants that [...]

[...] and innate immunity, overview of interferons-mediated signaling pathway, and type I interferon receptor binding. A considerable number of studies have observed that low levels of IFN production are highly correlated with severe COVID-19. Most of these studies were based on bulk RNA-seq, scRNA-seq, or experimental designs, whereas our analysis is built on genomic data, supporting this well-established conclusion from a new perspective.

The authors declare no competing interests.

F.C. and X.J. conceived the study, designed the research program and managed the project.
Figure 2. The workflow of the main analyses performed in this study

Figure 5. The genetic mechanisms of MHC system determining severity by controlling WBC counts
[...] The pink, light blue, and orange lines indicate patients grouped into the mild, recovery, and death groups, respectively. The y-axis denotes the value of each trait. The bottom three panels show the regression associations of each trait with disease severity (green) and clinical outcome (yellow) over time. The y-axis denotes -log10(p-value) multiplied by the effect direction (1 for a positive effect and -1 for a negative effect).

Covid-19: risk factors for severe disease and death
Gender Differences in Patients With COVID-19: Focus on Severity and Mortality. Frontiers in public health 8
COVID-19 and Sex Differences: Mechanisms and Biomarkers
COVID-19 and comorbidities: Deleterious impact on infected patients
Initial whole-genome sequencing and analysis of the host genetic contribution to COVID-19 severity and susceptibility
The ABO blood group locus and a chromosome 3 gene cluster associate with SARS-CoV-2 respiratory failure in an Italian-Spanish genome-wide association analysis
Genetic mechanisms of critical illness in COVID-19
Trans-ancestry analysis reveals genetic and nongenetic associations with COVID-19 susceptibility and severity
The COVID-19 Host Genetics Initiative, a global initiative to elucidate the role of host genetic factors in susceptibility and severity of the SARS-CoV-2 virus pandemic
Genomewide Association Study of Severe Covid-19 with Respiratory Failure
Genetic mechanisms of critical illness in COVID-19
Mendelian randomization analysis identified genes pleiotropically associated with the risk and prognosis of COVID-19
Mendelian randomization: use of genetics to enable causal inference in observational studies. Nephrology, dialysis, transplantation: official publication of the European Dialysis and Transplant Association
The role of high cholesterol in age-related COVID19 lethality. bioRxiv: the preprint server for biology
Monocyte HLA-DR Measurement by Flow Cytometry in COVID-19 Patients: An Interim Review
Type I IFN immunoprofiling in COVID-19 patients
Antiviral activities of type I interferons to SARS-CoV-2 infection
Immunophenotyping of COVID-19 and influenza highlights the role of type I interferons in development of severe COVID-19
Diagnosis and Treatment Protocol for Novel Coronavirus Pneumonia (Trial Version 7)
Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72 314 Cases From the Chinese Center for Disease Control and Prevention
Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study
Risk Factors Associated With Mortality Among Patients With COVID-19 in Intensive Care Units in
Risk factors for predicting mortality in elderly patients with COVID-19: A review of clinical data in China
Does comorbidity increase the risk of patients with COVID-19: evidence from meta-analysis
rMVP: A Memory-efficient, Visualization-enhanced, and Parallel-accelerated tool for Genome-Wide Association Study
Genetic analysis of quantitative traits in the Japanese population links cell types to complex human diseases
Genetic analysis of a novel missense mutation (Gly542Ser) with factor XII deficiency in a Chinese patient of consanguineous marriage
A genome-wide association meta-analysis on lipoprotein (a) concentrations adjusted for apolipoprotein (a) isoforms
cDNA sequence of human apolipoprotein(a) is homologous to plasminogen
The Polygenic and Monogenic Basis of Blood Traits and Diseases
The fitting of straight lines if both variables are subject to error
Mendelian randomization analysis with multiple genetic variants using summarized data
Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression
Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases
An atlas of genetic influences on human blood metabolites
Apolipoprotein E in lipoprotein metabolism, health and cardiovascular disease
Modulation of plasma triglyceride levels by apoE phenotype: a meta-analysis
APOE e4 Genotype Predicts Severe COVID-19 in the UK Biobank Community Cohort. The journals of gerontology. Series A, Biological sciences and medical sciences 75
Does apolipoprotein E genotype predict COVID-19 severity?
COVID-19 enters the expanding network of apolipoprotein E4-related pathologies
Hypolipidemia is associated with the severity of COVID-19
Trans-ethnic and Ancestry-Specific Blood-Cell Genetics in 746,667 Individuals from 5 Global Populations
COVID-19 and the immune system
Functional exhaustion of antiviral lymphocytes in COVID-19 patients
Software for More Flexible Gene-Based Testing
Profiler--a web-based toolset for functional profiling of gene lists from large-scale experiments
Type I interferon: From innate response to treatment for COVID-19
Dysregulation of type I interferon responses in COVID-19
Robust relationship inference in genome-wide association studies
Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering
Second-generation PLINK: rising to the challenge of larger and richer datasets
Principal components analysis corrects for stratification in genome-wide association studies
Population structure and eigenanalysis
The MR-Base platform supports systematic causal inference across the human phenome
Orienting the causal relationship between imprecisely measured traits using GWAS summary data
The control of the false discovery rate in multiple testing under dependency
Gene ontology: tool for the unification of biology. The Gene Ontology Consortium
KEGG: kyoto encyclopedia of genes and genomes
Reactome: a knowledgebase of biological pathways
WikiPathways: pathway editing for the people
CNSA: a data repository for archiving omics data. Database: the journal of biological databases and curation
CNGBdb: China National GeneBank DataBase

Notes. AF indicates the allele frequency of the effect/alternate allele; R2 indicates the imputation score based on the EAS population from the 1000 Genomes Project; N is the sample size used in the GWAS analysis; "√" and "X" indicate that the corresponding association was or was not previously reported in a population by genomic studies, respectively.