key: cord-0843702-hfbi7sk5 authors: Ellinghaus, D.; Degenhardt, F.; Bujanda, L.; Buti, M.; Albillos, A.; Invernizzi, P.; Fernandez, J.; Prati, D.; Baselli, G.; Asselta, R.; Grimsrud, M. M.; Milani, C.; Aziz, F.; Kassens, J.; May, S.; Wendorff, M.; Wienbrandt, L.; Uellendahl-Werth, F.; Zheng, T.; Yi, X.; de Pablo, R.; Chercoles, A. G.; Palom, A.; Garcia-Fernandez, A.-E.; Rodriguez-Frias, F.; Zanella, A.; Bandera, A.; Protti, A.; Aghemo, A.; Lleo de Nalda, A.; Biondi, A.; Caballero-Garralda, A.; Gori, A.; Tanck, A.; Latiano, A.; Fracanzani, A. L.; Peschuck, A.; Julia, A.; Pesenti, A.; Voza, A.; Jimenez, D.; Mateos, B.; Jimenez, B. title: The ABO blood group locus and a chromosome 3 gene cluster associate with SARS-CoV-2 respiratory failure in an Italian-Spanish genome-wide association analysis date: 2020-06-02 journal: nan DOI: 10.1101/2020.05.31.20114991 sha: 8187ff29e1ce2f642f2a2b7a9ee84fb193d94f64 doc_id: 843702 cord_uid: hfbi7sk5 Background. Respiratory failure is a key feature of severe Covid-19 and a critical driver of mortality, but for reasons poorly defined affects less than 10% of SARS-CoV-2 infected patients. Methods. We included 1,980 patients with Covid-19 respiratory failure at seven centers in the Italian and Spanish epicenters of the SARS-CoV-2 pandemic in Europe (Milan, Monza, Madrid, San Sebastian and Barcelona) for a genome-wide association analysis. After quality control and exclusion of population outliers, 835 patients and 1,255 population-derived controls from Italy, and 775 patients and 950 controls from Spain were included in the final analysis. In total we analyzed 8,582,968 single-nucleotide polymorphisms (SNPs) and conducted a meta-analysis of both case-control panels. Results. 
We detected cross-replicating associations with rs11385942 at chromosome 3p21.31 and rs657152 at 9q34, which were genome-wide significant (P<5×10⁻⁸) in the meta-analysis of both study panels: odds ratio [OR], 1.77; 95% confidence interval [CI], 1.48 to 2.11; P=1.14×10⁻¹⁰ and OR, 1.32; 95% CI, 1.20 to 1.47; P=4.95×10⁻⁸, respectively. Among six genes at 3p21.31, SLC6A20 encodes a known interaction partner of angiotensin converting enzyme 2 (ACE2). The association signal at 9q34 was located at the ABO blood group locus, and a blood-group-specific analysis showed higher risk for A-positive individuals (OR=1.45; 95% CI, 1.20 to 1.75; P=1.48×10⁻⁴) and a protective effect for blood group O (OR=0.65; 95% CI, 0.53 to 0.79; P=1.06×10⁻⁵). Conclusions. We herein report the first robust genetic susceptibility loci for the development of respiratory failure in Covid-19. Identified variants may help guide targeted exploration of severe Covid-19 pathophysiology.

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was discovered in Wuhan, China, in late 2019 and rapidly evolved into a global pandemic. [1] As of May 28th, 2020, there are over 5.1 million confirmed cases worldwide, with total deaths exceeding 355,000 (data from Johns Hopkins University). In Europe, Italy and Spain were severely affected early, with epidemic peaks starting in the second half of February 2020 (Figure 1) and 60,189 fatal cases reported by May 28th, 2020. Coronavirus disease 2019 (Covid-19) has variable behavior, [2] with the vast majority of infected individuals experiencing only mild or even no symptoms. [3] Mortality rates are predominantly driven by the subset of patients developing severe respiratory failure secondary to bilateral interstitial pneumonia and acute respiratory distress syndrome. [4] Severe Covid-19 with respiratory failure requires early and prolonged support by mechanical ventilation.
[5] The pathogenesis of respiratory failure in Covid-19 is poorly understood, but mortality consistently associates with older age and male gender. [6] [7] [8] Clinical associations have also been reported for obesity and cardiovascular disease traits, hypertension and diabetes in particular, but the relative role of these risk factors in determining Covid-19 severity has not yet been clarified. [6] [7] [8] [9] Observations on lymphocytic endotheliitis and diffuse microvascular and macrovascular thromboembolic complications may suggest that Covid-19 is a systemic disease that primarily injures the vascular endothelium, but provide mostly hypothetical insights into the underlying pathogenesis of severe Covid-19. [10] [11] [12] Against this background, at the peak of the epidemic in Italy and Spain, we performed a genome-wide association study (GWAS) to delineate host genetic factors contributing to respiratory failure in Covid-19. The relatively low Covid-19 disease burden in Norway and Germany allowed for a complementary team to be set up, whereby rapid analysis could occur in parallel with rapid patient recruitment in the affected Italian and Spanish epicenters.

We recruited in total 1,980 patients with severe Covid-19, defined by hospitalization with respiratory failure and confirmed SARS-CoV-2 viral replication from nasopharyngeal swabs or other relevant biological fluids, cross-sectionally from intensive care units and general wards of seven hospitals in five cities in the pandemic epicenters in Italy and Spain ( genotyped for the purpose of the present study. We also included two control panels with genotype data derived from previous studies using the same genotyping array: from Italy, n=396 controls from reference 13, and from Spain, n=987 controls recruited from blood donors (San Sebastian).
The project protocol outlined rapid patient inclusion with principally no additional project-related procedures (material from clinically indicated venipunctures) and with the opportunity of complete anonymity with only minimal data collected. Differences in recruitment and consent procedures between centers were determined by 1) some centers integrating the project into larger Covid-19 biobanking efforts while others performed dedicated inclusion for this project, and 2) variability in how local ethical committees handled anonymization vs. deidentification as well as consent procedures. Written informed consent was obtained from all study subjects at each center when possible; alternatively, where an exemption applied, delayed consent, oral consent or consent via next of kin was collected, depending on local ethical committee regulations. For some severely ill patients, where this was not possible, an exemption from informed consent was granted by the local ethical committee or per local regulations during the Covid-19 pandemic to allow the use of completely anonymized surplus material from diagnostic venipuncture.

(which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 2, 2020.

We performed DNA extraction from all 1,980 cases and 1,394 Italian controls using a

To take imputation uncertainty into account, we tested for phenotypic associations with allele dosage data separately for the Italian and Spanish case-control panels using PLINK's logistic regression framework for dosage data (PLINK v1.9). [15] Two adjusted
association analyses including covariates from principal component analysis (PCA) were conducted (analysis I and II) to control for (I) potential population stratification and (II) potential population stratification as well as age and gender bias. A fixed-effects meta-analysis was conducted using the meta-analysis tool METAL [16] on variants overlapping between both studies, using the BETA and its standard error (SE) from the study-specific association analyses. We used the commonly accepted threshold of 5×10⁻⁸ for joint P-values to define statistical significance.

Based on results from TOPMed genotype imputation, we utilized three ABO SNPs Table 1). A similar assessment was made for lead SNPs rs11385942 and rs657152, and at these broader loci (3p21.31 and 9q34.2) we also performed Bayesian fine-mapping analysis (see Supplementary Methods).

The milestones of the study in the context of the peak outbreaks in Italy and Spain are shown in Figure 1. Age, gender and maximum respiratory support up until time of blood sampling for patients included in the final analysis are given in Table 1 and Supplementary Table 1. By utilizing GSA-only data, we were able to perform a uniform quality control of merged Italian and Spanish batches, thus reducing potential batch effects, and conducted Italian and

We found two loci to be associated with Covid-19-induced respiratory failure with genome-wide significance (P<5×10⁻⁸) in the meta-analysis (analysis I) (Figure 2 and
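The fixed-effects meta-analysis described in the Methods combines a per-study effect estimate (BETA) with its standard error (SE). The following is a minimal sketch of that idea, assuming the classical inverse-variance weighting that a standard-error-based fixed-effects analysis uses. It is not METAL's code, the function name `fixed_effects_meta` is illustrative, and the per-panel values are hypothetical placeholders rather than results from this study.

```python
import math

GENOME_WIDE_P = 5e-8  # significance threshold stated in the Methods


def fixed_effects_meta(estimates):
    """Pool (beta, se) pairs with inverse-variance weights.

    Returns the pooled beta, its standard error, the z-score and a
    two-sided p-value under a normal approximation.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return pooled_beta, pooled_se, z, p


# Hypothetical log-odds ratios and standard errors for one variant
# in two case-control panels (assumed values, not taken from the study).
panels = {"panel_A": (0.60, 0.13), "panel_B": (0.54, 0.13)}

beta, se, z, p = fixed_effects_meta(list(panels.values()))

# Convert the pooled log-odds ratio back to an odds ratio with a 95% CI.
odds_ratio = math.exp(beta)
ci_low = math.exp(beta - 1.96 * se)
ci_high = math.exp(beta + 1.96 * se)

print(f"OR={odds_ratio:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f}), P={p:.2e}")
print("genome-wide significant:", p < GENOME_WIDE_P)
```

Because the weights are the inverse variances, a panel with a smaller standard error contributes more to the pooled estimate; with equal standard errors, as in this toy input, the pooled beta is simply the mean of the two.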
Association signals at 3p21.31 and 9q34.2 were fine-mapped to 22 and 38 variants, respectively, with greater than 95% certainty (Figure 3A and 3B and Supplementary Table 3). The association signal at 3p21.31 comprised six genes (Figure 3A and Table 4 Table 5). Since several viral infections are known to be controlled by genetic variation at the HLA complex at chromosome 6p21, we scrutinized the extended HLA region (chr6:25-34Mb; Supplementary Figure 9). There were no SNP or allele association signals at the HLA complex meeting either the genome-wide or the suggestive (P=1×10⁻⁵) significance threshold (Supplementary Table 6). Furthermore, we found no significant differences in allelic distribution between patients with oxygen supplementation only and those with mechanical ventilation of any kind (assessed by direct HLA typing, data not shown).

in about two months. We detected cross-replicating findings at chromosome 3 and chromosome 9, which achieved genome-wide significance in the meta-analysis of both study panels. On chromosome 3p21 the peak association signal covers a cluster of several genes with functions potentially relevant to severe Covid-19. One notable candidate is SLC6A20, which encodes the Sodium/Imino-acid (proline) Transporter 1 (SIT1) that functionally interacts with angiotensin converting enzyme 2 (ACE2), the SARS-CoV-2 cell surface receptor.
[19,20] SIT1 expression in the lungs is mainly present in pneumocytes [21], where SIT1 should be scrutinized for involvement in SARS-CoV-2 viral entry. However, the relevant locus also contains a cluster of genes encoding chemokine receptors, including the CC-motif chemokine receptor 9 (CCR9) and the C-X-C motif chemokine receptor 6 (CXCR6), the latter of which has been shown to regulate the partitioning of lung-resident memory CD8 T-cells throughout the sustained immune response to airway pathogens, including influenza viruses. [22] In the publicly available results from the Covid-19 Host Genetics Consortium [23], a similar association has been observed in an analysis of Covid-19 affected cases vs. a population-based sample; although not genome-wide significant, it corroborates our observations. These parallel observations with our analysis, which focused on severe cases with pulmonary failure only, point to the relevance of ascertainment bias in genetic studies of Covid-19, as clinically significant Covid-19 patients are more likely to be included in research projects than asymptomatic cases. The significantly higher frequency of the risk allele at the chromosome 3 locus found in the present study in patients requiring mechanical ventilation compared with oxygen only provides further support for a role of this genetic region in modifying Covid-19 severity.

Preliminary clinical reports have suggested the involvement of ABO blood groups in Covid-19 susceptibility (preprints by Zhao et al. [24] and Zietz et al. [25]). Similar reflections can thus be made for case ascertainment as for the chromosome 3 locus, and ABO blood groups have also been implicated in SARS-CoV-1 susceptibility.
[26] Our data thus align with the suggestions that blood group O is associated with lower risk compared with non-O blood groups, whereas blood group A is associated with higher risk of acquiring Covid-19 compared with non-A blood groups. [24,25] Unlike for chromosome 3, we found no difference between patients receiving oxygen supplementation only and those with mechanical ventilation of any kind. [25] However, it should be noted that the lead SNP at the ABO locus in our study (rs657152) has been associated with elevated interleukin-6 (IL-6) levels in childhood obesity in a previous GWAS [27], providing a hypothetical link to the established association of elevated IL-6 with severity and mortality of Covid-19. [28] Furthermore, genetic variation at the ABO locus has previously been associated with a number of procoagulant markers such as von Willebrand factor and Factor VIII, and the potential relationship between our genetic findings and the significant coagulopathy that is observed in severe Covid-19 warrants further attention.

We are fully aware that the pragmatic aspects that made this massive undertaking feasible in a very short period of time, during the extreme clinical circumstances of the pandemic, led to certain limitations that will be important to explore in follow-up studies.

ethnicities (genetically population outliers). That said, we took great care to minimize variability between cases and controls arising from such sources and from differences between genotyping platforms [29], e.g. by limiting our inclusion of controls to those genotyped on the Illumina Global Screening Array, despite thereby reducing our statistical power.
Further exploration of current findings, both as to their utility in clinical risk profiling of Covid-19 patients and mechanistic understanding of the underlying pathophysiology, is now warranted.

IJC2018-035131-I. We would also like to thank Goncalo Abecasis and his team for providing the Michigan imputation server.

ECMO: Extracorporeal membrane oxygenation; IQR: interquartile range

All association test statistics were adjusted for the top 10 principal components from principal component analysis.
*Two analyses were performed: "main", correcting only for principal components, and "corr. age, gender", correcting for age and gender in addition to 10 principal components. In the corrected analysis, 25 controls are excluded from the Spanish and meta-analysis due to missing covariate data.

Patient samples from three Italian and four Spanish hospitals were collected around the peak of the local epidemics, and ethics approvals were quickly obtained through fast-track procedures, i.e. every local ethical review board supported Covid-19 studies with rapid turn-around times, facilitating this fast de novo data generation. Within 6 weeks, all collected blood samples were centrally isolated, genotyped and analysed. The rapid workflow from patients to target identification illustrates the utility of GWAS, a standardized tool in research that often relies on international and interdisciplinary cooperation. One centre alone could not have completed this study, not to mention the increase in statistical power through multi-centre patient contribution. Speed of data production depended heavily on lab automation, and speed of analyses reflects existing analytical pipelines and the generous support of public so-called "imputation servers" (here, the Michigan imputation server of the Abecasis group).

A Novel Coronavirus from Patients with Pneumonia in China
Coronaviridae Study Group of the International Committee on Taxonomy of V.
The species Severe acute respiratory syndrome-related coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2
Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72314 Cases From the Chinese Center for Disease Control and Prevention
Severe Covid-19
Management of COVID-19 Respiratory Distress
Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study
Risk factors for severity and mortality in adult COVID-19 inpatients in Wuhan
Cardiovascular Disease, Drug Therapy, and Mortality in Covid-19
Risk Factors of Fatal Outcome in Hospitalized Subjects With Coronavirus Disease 2019 From a Nationwide Analysis in China
Coagulation abnormalities and thrombosis in patients with COVID-19
Endothelial cell infection and endotheliitis in COVID-19
Pulmonary Vascular Endothelialitis, Thrombosis, and Angiogenesis in Covid-19
Genome-wide meta-analysis increases to 71 the number of confirmed Crohn's disease susceptibility loci
Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program
Second-generation PLINK: rising to the challenge of larger and richer datasets
METAL: fast and efficient meta-analysis of genomewide association scans
Blood Group ABO Genotyping in Paternity Testing
IPD-IMGT/HLA Database
Human intestine luminal ACE2 and amino acid transporter expression increased by ACE-inhibitors
Trilogy of ACE2: a peptidase in the renin-angiotensin system, a SARS receptor, and a partner for amino acid transporters
Proteomics. Tissue-based map of the human proteome
CXCR6 regulates localization of tissue-resident memory CD8 T cells to the airways
The COVID-19 Host Genetics Initiative, a global initiative to elucidate the role of host genetic factors in susceptibility and severity of the SARS-CoV-2 virus pandemic