key: cord-1038581-tn2q5wta authors: Andolfo, I.; Russo, R.; Lasorsa, A. V.; Cantalupo, S.; Rosato, B. E.; Bonfiglio, F.; Frisso, G.; Abete, P.; Cassese, G. M.; Servillo, G.; Esposito, G.; Gentile, I.; Piscopo, C.; Villani, R.; Fiorentino, G.; Cerino, P.; Buonerba, C.; Pierri, B.; Zollo, M.; Iolascon, A.; Capasso, M. title: Common variants at 21q22.3 locus influence MX1 gene expression and susceptibility to severe COVID-19 date: 2020-12-20 journal: nan DOI: 10.1101/2020.12.18.20248470 sha: 9215fa27618b86ab3e8129dfaf41a575d78f6b34 doc_id: 1038581 cord_uid: tn2q5wta The COVID-19 disease, caused by the SARS-Cov-2, presents a heterogeneous clinical spectrum. The risk factors do not fully explain the wide spectrum of disease manifestations, so it is possible that genetic factors could account for novel insights into its pathogenesis. In our previous study, we hypothesized that common variants on chromosome 21, near TMPRSS2 and MX1 genes, may be genetic risk factors associated to the different clinical manifestations of COVID-19. Here, we performed an in-depth genetic analysis of chromosome 21 exploiting the genome-wide association study data including 6,406 individuals hospitalized for COVID-19 and 902,088 controls with European genetic ancestry from COVID-19 Host Genetics Initiative. We found that five single nucleotide polymorphisms (SNPs) within TMPRSS2 and near MX1 gene show suggestive associations (P[≤]1x10-5) with severe COVID-19. All five SNPs replicated the association in two independent cohorts of Asian subjects while two and one out of the 5 SNPs replicated in African and Italian populations, respectively (P[≤]0.05). The minor alleles of these five SNPs correlated with a reduced risk of developing severe COVID-19 and increased level of MX1 expression in blood. Our findings provide further evidence that host genetic factors can contribute to determine the different clinical presentations of COVID-19 and that MX1, an antiviral effector of type I and III interferon pathway, may be a potential therapeutic target. The COVID-19 disease, caused by the SARS-Cov-2, presents a heterogeneous clinical spectrum. The risk factors do not fully explain the wide spectrum of disease manifestations, so it is possible that genetic factors could account for novel insights into its pathogenesis. In our previous study, we hypothesized that common variants on chromosome 21, near TMPRSS2 and MX1 genes, may be genetic risk factors associated to the different clinical manifestations of COVID-19. Here, we performed an in-depth genetic analysis of chromosome 21 exploiting the genome-wide association study data including 6, 406 individuals hospitalized for COVID-19 and 902,088 controls with European genetic ancestry from COVID-19 Host Genetics Initiative. We found that five single nucleotide polymorphisms (SNPs) within TMPRSS2 and near MX1 gene show suggestive associations (P≤1x10 -5 ) with severe COVID-19. All five SNPs replicated the association in two independent cohorts of Asian subjects while two and one out of the 5 SNPs replicated in African and Italian populations, respectively (P≤0.05). The minor alleles of these five SNPs correlated with a reduced risk of developing severe COVID-19 and increased level of MX1 expression in blood. Our findings provide further evidence that host genetic factors can contribute to determine the different clinical presentations of COVID-19 and that MX1, an antiviral effector of type I and III interferon pathway, may be a potential therapeutic target. The recent SARS-Cov-2 pandemic has caused so far more than one million of deaths (https://covid19.who.int/). The COVID-19, caused by the SARS-Cov-2, has several clinical conditions ranging from no or mild symptoms to rapid progression to respiratory failure (1, 2) . Advanced age is a major risk factor, as well as male sex and some co-morbidities such as hypertension and diabetes (3) . Since clinical risk factors do not fully explain the wide spectrum of disease manifestations, it is conceivable that dissecting the genetics of the disease may provide novel insights into its pathogenesis. A recent genome-wide association study (GWAS) identified two susceptibility loci of severe COVID-19: the first locus on chromosome 3 harbors multiple genes (SLC6A20, LZFTL1, CCR9, CXCR6, XCR1, FYCO1) that could be functionally implicated in COVID-19 pathology; the second on chromosome 9 that defines the ABO blood groups (4) . Another study showed that inactivating rare mutations in genes belonging to type I interferon pathway predispose to life-threatening COVID-19 pneumonia (5) . Other very recent papers reported the results from the analysis of two large independent GWASs that validated the two previous risk loci and found novel variants at chromosome 19p13.3, 12q24.13 and 21q22.1 associated with severe COVID-19 (6, 7) . In our previous opinion article, based on allele frequencies across different populations and expression quantitative loci (eQTLs) data, we hypothesized that common variants on chromosome 21 near TMPRSS2 and MX1 genes may be genetic risk factors associated to the COVID-19 different clinical manifestations (8) . In this study, to further support our hypothesis, we exploited GWAS data from the COVID-19 Host Genetics Initiative (9) and performed an in-depth mapping of chromosome 21 using summary statistics where common variants at this chromosome were associated with severe COVID-19 at genome-wide significance level (P ≤ 5×10 -8 ). Using the cohort of 908,494 subjects with European origins, we found 5 single nucleotide polymorphisms (SNPs) at the TMPRSS2/MX1 locus showing suggestive association with the disease. All 5 SNPs replicated the association in two independent cohorts of Asian subjects whereas two SNPs confirmed the association in African and one SNP in Italian cohort. Significant eQTLs signals were found for MX1 in blood. All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this this version posted December 20, 2020. ; https://doi.org/10.1101/2020.12.18.20248470 doi: medRxiv preprint The summary statistics of chromosome 21 were obtained from the GWAS dataset "B2_ALL_eur_leave_23andme" deposited in the COVID-19 Host Genetics Initiative website (9) . It includes 6,406 laboratory confirmed SARS-CoV-2 infection and hospitalized for COVID-19 cases and 902,088 controls from general population with European genetic ancestry. The summary statistics of the SNPs used for the replication study were retrieved from the GenOMICC study (6) . Three independent cohorts of cases and controls with different Using the 74 significant SNPs with P≤1x10 -5 of chromosome 21, we defined three independent associated loci by the following computational process. The SNPs were first sorted according to their association P value. Then, the lead SNP, considered as the most significant SNP in a given genomic locus, was removed from this list and assigned to an independent locus together with all other SNPs which have a r 2 value less than or equal to 0.01 with this SNP. This procedure was recursively applied to the remaining SNPs in the list, so that each SNP could be assigned to a locus and no SNPs were left in the original list. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this this version posted December 20, 2020. ; https://doi.org/10.1101/2020.12.18.20248470 doi: medRxiv preprint (https://cadd.gs.washington.edu/) (12) . The distribution of the eQTLs was explored by Genotype Tissue Expression (GTEx) database (https://www.gtexportal.org/home). Allele frequencies were compared using the Chi-square test. A two-sided P≤.05 was considered statistically significant. All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this this version posted December 20, 2020. ; https://doi.org/10.1101/2020.12.18.20248470 doi: medRxiv preprint To prove that common variants at TMPRSS2/MX1 (21q22.3) locus may affect the susceptibility to severe COVID-19 onset, we analyzed the summary statistics of a large available GWAS dataset released from COVID-19 Host Genetics Initiative. The database includes 6,406 hospitalized cases and 902,088 controls with European ancestry. The region on chromosome 21 appears to be significantly associated with severe COVID-19 at the genome-wide level (https://www.covid19hg.org/results/) (6) . To investigate whether more than one association signal may exist at chromosome 21, we selected 74 SNPs showing a P≤1×10 -5 and among these we identified 3 independent loci (Supplementary Table 1 Figure 1c) , respectively. The rs3787946 maps in an intronic region of TMPRSS2 and the first closest gene was MX1 (Figure 1c) ; herein we named this locus as "TMPRSS2/MX1". An in-depth inspection of TMPRSS2/MX1 locus showed that 13 SNPs were in LD with the lead rs3787946 (r 2 >0.8, Table 1 ) and that the 5 most significant SNPs (P-values ranged from 2.7×10 -6 to 5.8×10 -6 , Table 1 ) were in strong LD with each other (r 2 >=0.90, Supplementary Figure 1 ). The other 9 SNPs showed an LD with the lead SNP rs3787946 ranging from 0.8 to 0.9 and P-values ranging from 6×10 -4 to 0.04 ( Table 1) . We then sought to replicate the associations of the 14 SNPs in three independent cohorts of cases and controls of GenOMMIC GWAS (6) with non-European ancestry. All the 11 available SNPs replicated in the EAS population; the top 5 SNPs replicated in SAS population whereas two out five SNPs in the AFR one ( We tested if the 14 SNPs (Table 1) and their proxy SNPs (r 2 >0.8) were significantly overrepresented in active enhancers and promoters in multiple cell types and tissues by using HaploReg v4.1. These SNPs were enriched in the regulatory regions of several tissues All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this this version posted December 20, 2020. ; https://doi.org/10.1101/2020.12.18.20248470 doi: medRxiv preprint (Supplementary Table 3 ), but the best enrichment was found in induced pluripotent stem cell and thymus (Figure 2a) . We then investigated the predicted functional role of the 14 SNPs by GWAVA and CADD tools. We found that 2 (rs12329760 and rs9983330) out of 5 of the most significant SNPs showed the second and third most significant score ( Table 1) . The rs12329760 was classified as coding variant (p.Val197Met) localized in the exon 6 of TMPRSS2 gene and was predicted to be pathogenic (PolyPhen=probably damaging and SIFT=deleterious). We verified if the top 5 SNPs ( Table 1 ) might cause gene expression alterations interrogating the GTEx portal. We found that all the top 5 SNPs had eQTL signals for MX1 exclusively in blood tissue. Particularly, the minor alleles of these SNPs correlated with higher expression of MX1 compared to the major alleles (Figure 2b, Supplementary Figure 2a) . Of note, all the 9 other SNPs, except for rs2298660, did not have eQTL signals for MX1 in blood (Supplementary Table 4 ). The two SNPs, rs12329760 and rs2298660, were confirmed as eQTLs for MX1 in blood (P=1.79×10 -6 and 2.8×10 -6 , minor alleles correlated with a higher expression compared to the major alleles) by interrogation of another independent publicly available dataset (13) . TMPRSS2 is highly expressed in lung (8), so we investigated if the top 5 SNPs were eQTL for TMPRSS2 in lung tissues at nominally statistically significant level (P≤0.05). We found that the minor alleles of 4 out 5 SNPs correlated with lower expression of TMPRSS2 compared to the major alleles (Figure 2c and Supplementary Figure 2b) . Notably, rs12329760 is also an eQTL for TMPRSS2 in osteoblasts treated with dexamethasone (14) . All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this this version posted December 20, 2020. ; https://doi.org/10.1101/2020.12.18.20248470 doi: medRxiv preprint Despite the substantial advances made in the recent months in the field of the SARS-CoV-2 infection, the major question remains about the identification of the factors that modulate the variable clinical spectrum of COVID-19. Host genetic risk factors are emerging as a potential explanation for clinical heterogeneity of COVID-19 and are also crucial to find new druggable therapeutic targets (15) . The main host cell entry factors of SARS-CoV-2 are ACE2 and TMPRSS2 (16, 17) . The transmembrane spike (S) glycoprotein of virus binds to the ACE2 making it essential for the invasion of the virus into the host cell, followed by attachment of the virus to the target cells. S-protein priming by TMPRSS2 allows the binding of viral and cellular membranes, resulting in virus entry and replication in the host cells (18) . In our previous study, we hypothesized that common variants at chromosome 21, driving TMPRSS2 and MX1 expression, might have a mild-to-moderate effect in the susceptibility to SARS-CoV-2 infection. Particularly, genetic variants associated with reduced TMPRSS2 and elevated MX1 expression might confer less individual susceptibility to SARS-CoV-2 infection and favor a better outcome (8) . Here, to further support our hypothesis, we exploited GWAS data of a cohort of 908,494 subjects with European origins from the COVID-19 Host Genetics Initiative (9) and performed an in-depth genetic analysis of chromosome 21. We identified five common variants (rs3787946, rs9983330, rs12329760, rs2298661 and rs9985159) at locus 21q22.3 within TMPRSS2 and near MX1 gene that showed suggestive associations (P≤1x10 -5 ) with severe COVID-19. In particular, we found that the alleles with minor frequency were less recurrent among the hospitalized patients when compared to the control individuals, suggesting their protective role against the progression of disease. Interestingly, all the five SNPs replicated in two cohorts of Asian origin, whereas two SNPs replicated in a case series of African ancestry. Additionally, we replicated the association of the rs12329760 SNP in an independent case-control cohort of Italian origins. These results strongly indicate 21q22.3 as novel susceptibility locus to unfavorable outcome of COVID-19 and suggest that molecular mechanisms underlying this genetic predisposition may be common among individuals with different ethnicity. All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this this version posted December 20, 2020. ; https://doi.org/10.1101/2020. 12.18.20248470 doi: medRxiv preprint Of note, the SNPs at this locus were enriched in the regulatory regions of induced pluripotent stem cell and thymus. Thymus plays a significant role in the regulation of adaptive immune responses. The effect of aging on the thymus and immune senescence is well established, and the resulting inflammaging is found to be implicated in the development of many chronic diseases. Both aging and diseases of inflammaging are associated with severe COVID-19, and a dysfunctional thymus may be implicated in the unfavorable outcome of disease (19, 20) . Thus, our finding suggests that the identified SNPs could be severe COVID-19 risk factors also for their implications in the thymus function. The five SNPs, here identified, had eQTL signals for MX1 exclusively in blood tissue. Particularly, the minor allele of these SNPs correlated with higher expression of MX1 and associated with a minor risk of developing severe COVID-19. These results support the evidence that MX1 can play a relevant role in determining less severe forms of disease and are in line with a recent study that suggests MX1 as antiviral effector against SARS-CoV-2 (21) . Indeed, the expression of MX1 was found to be higher in SARS-CoV-2 positive subjects, negatively correlated with age and independently associated with increased viral load (21) . MX1 is part of the antiviral response induced by type I and III interferons (IFN) (22) . Inactivating mutations in genes belonging to type I IFN pathway and the consequent decreased levels of proteins have been shown to recur in patients with severe COVID-19 (5 (21) . Therefore, it is highly relevant to spawn new drugs that have the capacity to boost the host immune response while controlling tissue damage as a consequence of these infections. We also report that the minor allele of 4 of the top 5 SNPs might reduce the expression of TMPRSS2 in lung tissues and that the rs12329760 coding variant (p.Val197Met) correlate with lower expression of TMPRSS2 in osteoblast treated with dexamethasone (14) , a drug currently used to inhibit an excessive inflammation response (23) . Of note, the rs12329760 All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this this version posted December 20, 2020. ; https://doi.org/10.1101/2020.12.18.20248470 doi: medRxiv preprint All rights reserved. No reuse allowed without permission. preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this this version posted December 20, 2020. ; https://doi.org/10.1101/2020.12.18.20248470 doi: medRxiv preprint Presenting Characteristics, Comorbidities, and Outcomes Among 5700 Patients Hospitalized With COVID-19 in the New York City Area Baseline Characteristics and Outcomes of 1591 Patients Infected With SARS-CoV-2 Admitted to ICUs of the Lombardy Region Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study Genomewide Association Study of Severe Covid-19 with Respiratory Failure Inborn errors of type I IFN immunity in patients with life-threatening COVID-19 Genetic mechanisms of critical illness in Covid-19 Trans-ethnic analysis reveals genetic and non-genetic associations with COVID-19 susceptibility and severity Genetic Analysis of the Coronavirus SARS-CoV-2 Host Protease TMPRSS2 in Different Populations The COVID-19 Host Genetics Initiative, a global initiative to elucidate the role of host genetic factors in susceptibility and severity of the SARS-CoV-2 virus pandemic HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants Functional annotation of noncoding sequence variants CADD: predicting the deleteriousness of variants throughout the human genome Systematic identification of trans eQTLs as putative drivers of known disease associations Global analysis of the impact of environmental perturbation on cis-regulation of gene expression Susceptibility to severe COVID-19 ACE2 gene variants may underlie interindividual variability and susceptibility to COVID-19 in the Italian population ACE2 and TMPRSS2 variants and expression as candidates to sex and country differences in COVID-19 severity in Italy ACE2 and TMPRSS2 polymorphisms in various diseases with special reference to its impact on COVID-19 disease The role of the thymus in COVID-19 disease severity: implications for antibody treatment and immunization What chances do children have against COVID-19? Is the answer hidden within the thymus? SARS-CoV-2 Infection Boosts MX1 Antiviral Effector in COVID-19 Interferon-Inducible Myxovirus Resistance Proteins: Potential Biomarkers for Differentiating Viral from Bacterial Infections Dexamethasone in Hospitalized Patients with Covid-19 -Preliminary Report Initial whole-genome sequencing and analysis of the host genetic contribution to COVID-19 severity and susceptibility Genetic variants in TMPRSS2 and Structure of SARS-CoV-2 spike glycoprotein and TMPRSS2 complex In bold the SNPs that replicated in at least one cohort