key: cord-353283-rlvfk8w8
authors: Liu, D.; Yang, J.; Feng, B.; Lu, W.; Zhao, C.; Li, L.
title: Pleotropic association between risk and prognosis of COVID-19 and gene expression in blood and lung: A Mendelian randomization analysis
date: 2020-09-03
journal: medRxiv : the preprint server for health sciences
DOI: 10.1101/2020.09.02.20187179
sha:
doc_id: 353283
cord_uid: rlvfk8w8

Background: Coronavirus disease 2019 (COVID-19) has caused a large global pandemic. Patients with COVID-19 exhibited considerable variation in disease behavior. Previous genome-wide association studies (GWAS) have identified potential genetic factors involved in the risk and prognosis of COVID-19, but the underlying biological interpretation remains largely unclear.

Methods: We applied the summary data-based Mendelian randomization (SMR) method to identify genes that were pleiotropically/potentially causally associated with the risk and various outcomes of COVID-19, including severe respiratory confirmed COVID-19 and hospitalized COVID-19. The GWAS summarized data for COVID-19 were provided by the COVID-19 Host Genetics Initiative and the Severe Covid-19 GWAS Group. Analyses were done for blood and lung separately.

Results: In blood, we identified 2 probes, ILMN_1765146 and ILMN_1791057, both tagging IFNAR2, that showed pleiotropic association with hospitalized COVID-19 (beta [SE]=0.42 [0.09], P=4.75E-06 and beta [SE]=-0.48 [0.11], P=6.76E-06, respectively). Although no other probes were significant after correction for multiple testing in either blood or lung, multiple genes tagged by the top 5 probes were involved in inflammation or antiviral immunity, and several other tagged genes, such as PON2 and HPS5, were involved in blood coagulation.

Conclusion: We identified IFNAR2 and other potential genes that could be involved in the susceptibility or prognosis of COVID-19.
These findings provide important leads to a better understanding of the mechanisms of cytokine storm and venous thromboembolism in COVID-19 and potential therapeutic targets for effective treatment of COVID-19.

[...] a promising tool to search for pleiotropic/potentially causal effects of an exposure (e.g., gene expression) on an outcome (e.g., COVID-19 susceptibility). MR minimizes confounding and reverse causation that are commonly encountered in traditional association studies (8, 9), and has been successful in identifying gene expression sites or DNA methylation loci that are pleiotropically/potentially causally associated with various phenotypes, such as cardiovascular diseases, BMI, and rheumatoid arthritis (10-13). In this paper, we applied the summary data-based MR (SMR) method, integrating summarized GWAS data for COVID-19 and cis-eQTL (expression quantitative trait loci) data, to prioritize genes that are pleiotropically/potentially causally associated with the risk and prognosis of COVID-19.

In the SMR analysis, cis-eQTL genetic variants were used as the instrumental variables (IVs) for gene expression. We performed SMR analysis for gene expression in blood and lung separately. For blood, we used the CAGE eQTL summarized data (14), which included 2765 participants. For lung, we used the V7 release of the GTEx eQTL summarized data (15), which included 278 participants. The eQTL data for blood and lung can be downloaded at https://cnsgenomics.com/data/SMR/#eQTLsummarydata.

All rights reserved. No reuse allowed without permission. perpetuity. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in The copyright holder for this this version posted September 3, 2020. . https://doi.org/10.1101/2020.09.02.20187179 doi: medRxiv preprint

The GWAS summarized data were provided by the COVID-19 Host Genetics Initiative (6) and can be downloaded at https://www.covid19hg.org/results/.
Three phenotypes were examined: severe respiratory confirmed COVID-19, COVID-19, and hospitalized COVID-19. Details on the definitions of the phenotypes can be found in Table S1. The control groups were subjects from the general population without the specific phenotype, subjects who were COVID-19 negative based on prediction or self-report, or subjects who had COVID-19 without hospitalization, making a total of five comparisons: severe respiratory confirmed COVID-19 [...] participants from the general population without COVID-19 as the control (Table 1). The definition of severe respiratory confirmed COVID-19 in this study was different from the one defined by the COVID-19 Host Genetics Initiative (Table S1). The GWAS summarized data can be downloaded at https://ikmb.shinyapps.io/COVID-19_GWAS_Browser/.

MR was undertaken with cis-eQTL as the IV, gene expression as the exposure, and each of the three phenotypes as the outcome (in comparison with different control groups). MR analysis was done using the method implemented in the software SMR. Detailed information regarding the SMR method was reported in a previous publication (10). In brief, SMR applies the principles of MR to jointly analyze GWAS and eQTL summary statistics in order to test for pleiotropic association between gene expression and a trait due to a shared and potentially causal variant at a locus. The heterogeneity in dependent instruments (HEIDI) test was performed to evaluate the existence of linkage in the observed association.
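As a rough illustration of the SMR test outlined above, the following Python sketch computes the SMR effect estimate and its approximate test statistic for a single probe from the summary statistics of its top cis-eQTL. It follows the ratio estimate and approximate chi-square statistic described for the SMR method (10), with the standard error obtained by the delta method. The function name, argument names and numeric inputs are hypothetical, and the SMR software itself should be used for any real analysis.

```python
import math


def smr_test(b_zx, se_zx, b_zy, se_zy):
    """Approximate SMR estimate for one probe (illustrative sketch only).

    b_zx, se_zx: effect of the top cis-eQTL SNP on gene expression (eQTL data).
    b_zy, se_zy: effect of the same SNP on the trait (GWAS data).
    Returns (b_xy, se_xy, t_smr, p_smr).
    """
    if b_zx == 0:
        raise ValueError("the instrument must have a non-zero effect on expression")

    # Ratio (Wald-type) estimate of the effect of expression on the trait.
    b_xy = b_zy / b_zx

    # Delta-method variance of the ratio, ignoring covariance between the two
    # estimates (they come from independent samples).
    var_xy = (se_zy ** 2) / (b_zx ** 2) + (b_zy ** 2) * (se_zx ** 2) / (b_zx ** 4)
    se_xy = math.sqrt(var_xy)

    # Equivalent to z_zx^2 * z_zy^2 / (z_zx^2 + z_zy^2); approximately
    # chi-square distributed with 1 degree of freedom under the null.
    t_smr = (b_xy / se_xy) ** 2

    # Upper tail of a 1-df chi-square: P(X > t) = erfc(sqrt(t / 2)).
    p_smr = math.erfc(math.sqrt(t_smr / 2))
    return b_xy, se_xy, t_smr, p_smr


# Hypothetical inputs, not taken from the study.
b_xy, se_xy, t_smr, p_smr = smr_test(b_zx=0.5, se_zx=0.05, b_zy=0.1, se_zy=0.02)
print(f"beta={b_xy:.3f} SE={se_xy:.4f} T_SMR={t_smr:.1f} P={p_smr:.2e}")
```

With these made-up inputs, the eQTL z-score is 10 and the GWAS z-score is 5, so the SMR statistic is 100 × 25 / 125 = 20, which is smaller than either single-SNP chi-square statistic. The SMR test is therefore limited by the weaker of the two associations. The HEIDI test is not covered by this sketch.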
For the HEIDI test, rejection of the null hypothesis (i.e., PHEIDI < 0.05) indicates that the observed association might be due to two distinct genetic variants in high linkage disequilibrium with each other. We adopted the default settings in SMR (e.g., PeQTL < 5 × 10^-8, minor allele frequency [MAF] > 0.01, removing SNPs in very strong linkage disequilibrium [LD, r^2 > 0.9] with the top associated eQTL, and removing SNPs in low LD or not in LD [r^2 < 0.05] with the top associated eQTL), and used the false discovery rate (FDR) to adjust for multiple testing.

Annotations of transcripts were based on the Affymetrix exon array S1.0 platforms. To functionally annotate putative transcripts, we conducted functional enrichment analysis using the functional annotation tool "Metascape" (16) for the top tagged genes in blood and lung, separately. Gene symbols corresponding to putative genes (P<0.05) were used as the input of the gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis.

Data cleaning and statistical/bioinformatical analyses were performed using R version 4.0.0 (https://www.r-project.org/), PLINK 1.9 (https://www.cog-genomics.org/plink/1.9/) and SMR (https://cnsgenomics.com/software/smr/).

The number of cases and controls varied dramatically among different analyses. The number of probes was approximately 8,500 and 5,500 in the analyses for blood and lung, respectively. Detailed information is shown in Table 1. Information on the top 5 probes for each phenotype is presented in Table 2.
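The Methods state only that the false discovery rate was used to adjust for multiple testing. As a minimal sketch, assuming the Benjamini-Hochberg procedure (an assumption, since the specific FDR method is not named), the adjustment of a set of SMR P values can be written as follows. The function name and the example P values are hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order.

    Assumption: the study's FDR correction is of the Benjamini-Hochberg type;
    the source does not specify which FDR procedure was used.
    """
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical P values for four probes, not taken from the study.
example = [0.01, 0.04, 0.03, 0.005]
print(bh_adjust(example))
```

In practice, the adjustment would be applied across all probes tested for a given phenotype and tissue (approximately 8,500 in blood and 5,500 in lung), and probes whose adjusted value falls below the chosen FDR threshold would be reported as significant.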
Multiple probes tagging TRIM5 were also among the top 5 probes in the analysis of severe COVID-19 NEJM and hospitalized COVID-19 in blood. [...] (Table S2).

We did not identify any significant pleiotropic association after correction for multiple testing (Figures S6-11). Information on the top 5 probes for each phenotype is presented in Table 3. We found that 3 probes tagging AP006621 were among the top probes in the analysis of severe COVID-19 NEJM (Table 3). In addition, multiple genes, including C7orf25, ITGAD, FCER1G and ZNF589, were tagged by the top 5 probes in both blood and lung. The genes tagged by the top 5 probes were involved in two GO terms: stimulatory C-type lectin receptor signaling pathway (GO:0002223) and leukocyte activation involved in immune response (GO:0002366). Multiple genes tagged by the top 5 probes, including ARSA, FCER1G, XOSC6 and PSMD13, were involved in inflammation or antiviral immunity (Table S3).

In the present study, we integrated GWAS and eQTL data in the MR analysis to explore putative genes that showed pleiotropic/potentially causal association with the susceptibility/prognosis of COVID-19. We identified 2 probes tagging IFNAR2 showing pleiotropic association with hospitalized COVID-19 in blood.
Multiple genes tagged by the top 5 probes were involved in inflammation and antiviral immunity in both blood and lung. Several genes tagged by the top probes were involved in blood coagulation. Our findings provide important leads to a better understanding of the mechanism of cytokine storm and venous thromboembolism in COVID-19 and reveal potential therapeutic targets for the effective treatment of COVID-19.

Interferons (IFNs) are a group of signaling proteins made and released by host cells in response to viral invasion (17). There are three types of IFNs: type I IFNs (IFN-α/β), type II IFNs (IFN-γ) and type III IFNs (IFN-λ) (18-20). IFNAR2 (interferon alpha and beta receptor subunit 2), located at 21q22.11, encodes one of the two subunits of the type I IFN receptor (21). In the cascade of the host's response to coronavirus, IFNs play an essential role in establishing the antiviral state and in intensifying the antiviral response (22). It was found that IFNs could have both beneficial and detrimental effects on SARS-CoV-2 replication (23). A recent retrospective study of 77 adults found that IFN-α2b treatment with or without arbidol significantly reduced the duration of detectable virus in the upper respiratory tract of COVID-19 patients (24). However, the COVID-19 Treatment Guidelines Panel recommends against the use of interferons for the treatment of patients with severe and critical COVID-19, except in a clinical trial, due to insufficient data to support the beneficial or detrimental effects of interferons (25).
Because IFNAR2 was tagged by the top 5 probes for multiple phenotypes, we think that it is likely involved in determining COVID-19 severity and could be a potential therapeutic target for the treatment of COVID-19. More studies are needed to elucidate the mechanisms underlying the dual role of IFNs during SARS-CoV-2 infection and whether/how IFNAR2 is involved in this process.

We found that multiple genes tagged by the top 5 probes were involved in inflammation. ATF4 (activating transcription factor 4) is an endoplasmic reticulum stress sensor that defends lungs via induction of heme oxygenase 1 (26). ATF4 was decreased in inflamed intestinal mucosa from patients with active Crohn's disease or active ulcerative colitis, and ATF4 deficiency promotes intestinal inflammation in mice (27). ATF4 was downregulated in the alveolar type II cells of the elderly compared with the young (28). ATF4 has more than 20 candidate downstream factors, the majority of which were significantly downregulated in the elderly, who demonstrated a compromised ATF4-dependent ability to respond to endoplasmic reticulum stress. Lung-specific delivery of ATF4-related antioxidants has the potential to work in synergy with promising antiviral drugs to further improve COVID-19 outcomes in the elderly (28). Some of the tagged genes, such as TRIM5, NLRC5, MCTP1, PTPN1, ARSA and FCER1G, are also involved in antiviral immunity (29-35). These genes have not been reported to be associated with COVID-19 in previous studies.

It was estimated that approximately 40% of COVID-19 patients were considered [...]
Our study has some limitations. The GWAS analyses did not control for confounding factors that might affect the outcome. It is also unclear whether the subjects selected in the GWAS studies were representative of the exposure-outcome distributions in the overall population; therefore, the possibility of selection bias, which can affect estimation accuracy, could not be ruled out. The GWAS studies only examined the short-term effect of COVID-19 due to the limited duration of the COVID-19 pandemic, and we were unable to assess the long-term outcomes/lingering effects of COVID-19. Similarly, we could not analyze the genetic contribution of other interesting phenotypes, such as different disease behaviors among children/teens, adults and elderly patients, and asymptomatic COVID-19, due to a lack of the corresponding GWAS summarized data. We only performed analyses using blood and lung eQTL data; more studies are needed to explore tissue- and cell-type-specific genes involved in the host responses to COVID-19 infection. Due to a lack of individual eQTL data, we could not quantify the changes in gene expression in patients with COVID-19 in comparison with the control.

We identified IFNAR2 and other potential genes that could be involved in the susceptibility/prognosis of COVID-19. These findings provide important leads to a better understanding of the mechanism of cytokine storm in COVID-19 and reveal potential therapeutic targets for the effective treatment of COVID-19.
The authors confirmed that all authors have reviewed the contents of the article being submitted, approved its contents, and validated the accuracy of the data.

DL, JY and LL designed and registered the study. DL, BF and WL analyzed data and performed data interpretation. DL and JY wrote the initial draft and BF, WL, CZ and LL contributed writing to subsequent versions of the manuscript. All authors reviewed the study findings and read and approved the final version before submission.

All data generated or analyzed during this study are included in this published article and its supplementary information files.

No potential conflicts of interest were disclosed by the authors.

* The GWAS summarized data were provided by the COVID-19 Host Genetics Initiative (6) unless otherwise noted. The eQTL data for blood (14) and lung (15) can be downloaded at https://cnsgenomics.com/data/SMR/#eQTLsummarydata.
# The GWAS summarized data were provided by the Severe Covid-19 GWAS Group (5).
¶ The control is the non-predicted and non-self-reported COVID-19.
§ The control is COVID-19 without hospitalization, including laboratory confirmed or self-reported COVID-19.
COVID-19, coronavirus disease 2019; SMR, summary data-based Mendelian randomization

[...] (6) and the control is the population unless otherwise noted. The eQTL data for lung (15) can be downloaded at https://cnsgenomics.com/data/SMR/#eQTLsummarydata.
# Summary data were provided by the Severe Covid-19 GWAS Group (5).
¶ The control is the non-predicted and non-self-reported COVID-19.
§ The control is COVID-19 without hospitalization, including laboratory confirmed or self-reported COVID-19.
PeQTL is the P value of the top associated cis-eQTL in the eQTL analysis, and PGWAS is the P value for the top associated cis-eQTL in the GWAS analysis. Beta is the estimated effect size in SMR analysis, SE is the corresponding standard error, PSMR is the P value for SMR analysis, PHEIDI is the P value for the HEIDI test and Nsnp is the number of SNPs involved in the HEIDI test.
CHR, chromosome; COVID-19, coronavirus disease 2019; HEIDI, heterogeneity in dependent instruments; SNP, single-nucleotide polymorphism; SMR, summary data-based Mendelian randomization; QTL, quantitative trait loci

Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia
The COVID-19 Pandemic: A Global Natural Experiment. Circulation
An interactive web-based dashboard to track COVID-19 in real time. The Lancet Infectious diseases
The species Severe acute respiratory syndrome-related coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2. Nature microbiology
Genomewide Association Study of Severe Covid-19 with Respiratory Failure
The COVID-19 Host Genetics Initiative, a global initiative to elucidate the role of host genetic factors in susceptibility and severity of the SARS-CoV-2 virus pandemic. European journal of human genetics : EJHG
Beyond GWASs: illuminating the dark road from association to function
[...] of Gene Expression in Peripheral Blood
Genetic effects on gene expression across human tissues
Metascape provides a biologist-oriented resource for the analysis of systems-level datasets
The interferon system: an overview
Functional role of type I and type II interferons in antiviral defense
Type I interferons in host defense
Type III interferons: Balancing tissue tolerance and resistance to pathogen invasion. The Journal of experimental medicine
Human IFNAR2 deficiency: Lessons for antiviral immunity. Science translational medicine
[...] from interferons to cytokines. The Journal of biological chemistry
Is Low Alveolar Type II Cell SOD3 in the Lungs of Elderly Linked to the Observed Severity of COVID-19?
A role for the human nucleotide-binding domain, leucine-rich repeat-containing family member NLRC5 in antiviral responses. The Journal of biological chemistry
TRIM5 acts as more than a retroviral restriction factor
NK cell-intrinsic FcεRIγ limits CD8+ T-cell expansion and thereby turns an acute into a chronic viral infection
Rapid and Efficient Stable Gene Transfer to Mesenchymal Stromal Cells Using a Modified Foamy Virus Vector. Molecular therapy : the journal of the American Society of Gene Therapy
[...] Journal of Huazhong University of Science and Technology Medical sciences = Hua zhong ke ji da xue xue bao Yi xue Ying De wen ban = Huazhong keji daxue xuebao Yixue Yingdewen ban
Leptin and leptin-related gene polymorphisms, obesity, and influenza A/H1N1 vaccine-induced immune responses in [...]

11 ILMN_2060652 BET1L rs4980320 2

* The GWAS summarized data were provided by the COVID-19 Host Genetics Initiative (6) and the control is the population unless otherwise noted.
The eQTL data for blood (14) can be downloaded at [...]
¶ The control is the non-predicted and non-self-reported COVID-19.
§ The control is COVID-19 without hospitalization, including laboratory confirmed or self-reported COVID-19.
PeQTL is the P value of the top associated cis-eQTL in the eQTL analysis, and PGWAS is the P value for the top associated cis-eQTL in the GWAS analysis. Beta is the estimated effect size in SMR analysis, SE is the corresponding standard error, PSMR is the P value for SMR analysis [...]
CHR, chromosome; COVID-19, coronavirus disease 2019; HEIDI, heterogeneity in dependent instruments; SNP, single-nucleotide polymorphism; SMR, summary data-based Mendelian randomization; QTL, quantitative trait loci
Bold font means statistical significance after correction for multiple testing using false discovery rate.