key: cord-1055743-gqetghc6 authors: Ong, Jue-Sheng; Gharahkhani, Puya; Vaughan, Thomas L; Whiteman, David; Kendall, Bradley J; MacGregor, Stuart title: Assessing the genetic relationship between gastro-esophageal reflux disease and risk of COVID-19 infection date: 2021-09-06 journal: Hum Mol Genet DOI: 10.1093/hmg/ddab253 sha: 04022a5b9ccaf89c99f042f15ab18c7f6d663929 doc_id: 1055743 cord_uid: gqetghc6

BACKGROUND: Symptoms related to gastro-esophageal reflux disease (GERD) were previously shown to be linked with increased risk of the 2019 coronavirus disease (COVID-19). We aim to interrogate the possibility of a shared genetic basis between GERD and COVID-19 outcomes.

METHODS: Using published GWAS data for GERD (78 707 cases; 288 734 controls) and COVID-19 susceptibility (up to 32 494 cases; 1.5 million controls), we examined the genetic relationship between GERD and three COVID-19 outcomes: risk of developing severe COVID-19, COVID-19 hospitalization and overall COVID-19 risk. We estimated the genetic correlation between GERD and COVID-19 outcomes, followed by Mendelian randomization (MR) analyses to assess genetic causality. Conditional analyses were conducted to examine whether known COVID-19 risk factors (obesity, smoking, type-II diabetes, coronary artery disease) can explain the relationship between GERD and COVID-19.

RESULTS: We found small to moderate genetic correlations between GERD and COVID-19 outcomes (rg between 0.06–0.24). MR analyses revealed an OR of 1.15 (95% CI: 0.96–1.39) for severe COVID-19; 1.16 (1.01–1.34) for risk of COVID-19 hospitalization; and 1.05 (0.97–1.13) for overall risk of COVID-19 per doubling of odds of developing GERD. The genetic correlations/associations between GERD and COVID-19 showed mild attenuation towards the null when obesity and smoking were adjusted for.

CONCLUSIONS: Susceptibility for GERD and risk of COVID-19 hospitalization were genetically correlated, with MR findings supporting a potential causal role between the two.
The genetic association between GERD and COVID-19 was partially attenuated when obesity was accounted for, consistent with obesity being a major risk factor for both diseases.

More recently, symptoms related to gastro-esophageal reflux disease have been reported to be associated with COVID-19 in several epidemiological studies (3, 4). Gastro-esophageal reflux is a chronic disease characterised by the frequent regurgitation of acid from the stomach into the esophagus. Prior in vivo studies suggest that the digestive system plays a role in the pathogenesis of SARS-CoV-2, as the key receptors ACE2 and TMPRSS2 were shown to be co-expressed in both the upper epithelial and gland cells along the esophagus and the colon (5). On the other hand, the observational link between GERD and COVID-19 is unsurprising, since both diseases share some major risk factors and present common symptoms (Figure 1) (6–10). For instance, obesity and smoking are established risk factors for both GERD and COVID-19 (9, 11). However, most of these studies were observational in nature and causality cannot be assumed.

Genetic data offer an interesting avenue to validate these associations, given the availability of large-scale genome-wide association study (GWAS) data on both COVID-19 susceptibility and gastro-esophageal reflux disease (GERD). If there is a direct causal effect between GERD and COVID-19, we would expect that genetic variants which increase GERD risk will also increase COVID-19 risk. Genetically derived findings can hence provide a complementary perspective on the biological mechanisms linking both diseases (12). In this study, we interrogate the possibility of a shared genetic architecture between COVID-19 and GERD in two ways.
We first estimated the genetic correlation between GERD diagnosis and several COVID-19 related outcomes, followed by a genetic instrumental variable analysis to evaluate whether genetically predicted GERD diagnosis is linked with risk of COVID-19 infection. We further implemented a conditional analysis on common risk factors between GERD and COVID-19 to evaluate potential mediation mechanisms.

The sample size for each COVID-19 outcome is shown in Table 1.

The genetic correlation between GERD and COVID-19 outcomes after conditioning on known

The correlation matrix in Figure 3(a) shows the estimated genetic correlations between GERD, COVID-19 risk factors and COVID-19. Each of these risk factors showed strong evidence of genetic overlap with GERD (rg between 0.2 and 0.4). We also found moderate genetic correlations of BMI, CAD, cigarettes smoked per day and risk of type-2 diabetes with risk of COVID-19 hospitalization (COVID-A2) and risk of severe COVID-19 infection (COVID-B2) (Figure 3(a)). However, the estimated correlations of these risk factors with overall COVID-19 susceptibility (COVID-C2) were much weaker. We hence re-estimated the genetic correlation between GERD and COVID-19 outcomes after adjusting for these risk factors. Apart from the obesity-adjusted and smoking-adjusted COVID-19 models, where our revised rg estimates between GERD and COVID-19 outcomes showed evidence of attenuation towards the null, findings for the other covariate-adjusted models remained broadly consistent with the original unadjusted rg estimates (see

To assess our genetic instruments for potential horizontal pleiotropic association via known COVID-19 risk factors, we evaluated the association of each of the 88 variants with BMI, smoking phenotypes, risk of T2D and risk of CAD using publicly available summary statistics (13–16). In our analysis, at least 50 SNPs showed evidence of pleiotropy (i.e.
absolute Z-scores > 5) on one or more of the aforementioned COVID-19 risk factors. The heatmap in Figure 4 illustrates the distribution of Z-scores across these risk factors, indicating pervasive pleiotropic association between GERD instruments and established COVID-19 risk factors. To control for potential genetic confounding between GERD and COVID-19 outcomes, we performed a

Our genetic analyses reveal strong genetic evidence supportive of a potentially causal relationship between GERD and COVID-19 susceptibility. Genetically higher odds of having GERD were associated with a ~15% increase in risk of severe COVID-19 and COVID-19 hospitalization, consistent with retrospective observational findings (3, 17). Adjustment for known COVID-19 risk factors, most notably obesity, partially weakened the association between GERD and COVID-19, although with 95% confidence intervals that overlap the unadjusted estimates.

Drawing a direct causal inference between GERD and COVID-19 can be difficult, as both diseases share common risk factors such as smoking, diabetes and obesity. For instance, obesity was previously shown to be causally associated with both GERD and COVID-19 outcomes in earlier MR findings (18, 19). Genetic correlation analyses further revealed strong genetic overlap between these risk factors and both GERD and COVID-19 risk. We controlled for these risk factors via mediation analyses, which typically reduced (but did not completely eliminate) the genetic correlations between GERD and COVID-19 outcomes (Figure 3). Similarly, our MR estimates on COVID-19 outcomes after adjusting for these risk factors showed partial attenuation towards the null, but remained broadly consistent with a moderate adverse effect (Figure 5).

Our analysis primarily focused on GERD susceptibility, as opposed to previous findings that evaluated the use of PPIs specifically (3). We showed in our previous study that GERD diagnoses attained through self-report and medication use (i.e.
GERD cases inferred through individuals using GERD-related medications such as PPIs) were genetically very similar (20), and hence our genetics-based approach cannot reliably separate the cause and effect of PPI use. That is, our data cannot clarify whether increased risk of COVID-19 relates to GERD and its complications per se, or to associated treatments such as PPIs. Most of our GERD instruments showed strong evidence of being replicated in an independent cohort (21), reducing the chance of winner's curse bias (22). Our analysis also attempted to control for genetic confounding arising from comorbidity with obesity, smoking behaviour, diabetes and risk of cardiovascular complications, after which the association with COVID-19 was unchanged apart from in the model adjusting for obesity. While these findings are consistent with PPIs promoting greater risk of COVID-19 through a mechanism independent of obesity and/or detection bias (5), further validation is required.

Several limitations ought to be considered. Firstly, our study focused chiefly on evaluating COVID-19 susceptibility and severity/hospitalization relative to the non-infected population; while GWAS data were available for severe vs non-severe COVID-19 within COVID-19 cases, the sample sizes were smaller and our power was limited. To obtain heritability estimates on the genetic liability scale, one is required to provide an approximate prevalence (heritability does not change dramatically if this is misspecified); we tentatively assumed a population-wide COVID-19 prevalence of 10% based on availability of data (23). We note, however, that mis-specification of the prevalence has only a negligible impact on the genetic correlation estimates (24). Instruments used for GERD susceptibility were derived from a symptom-based GERD definition (20).
Even though we previously showed that the genetic architecture of our broad GERD definition is similar to those obtained via more robust clinical diagnosis, these instruments are highly heterogeneous in nature. We attempted to minimize biases arising from instrument misspecification by adopting MR models such as median- and mode-based estimators, though these models are typically less powered. Whilst the sample size for the analyses on the overall COVID-19 infection outcome was the largest, we did not observe strong evidence for an association with GERD. One possible explanation is that the overall COVID-19 phenotype might be heterogeneous, which is reflected in the relatively lower estimated heritability. We performed post-hoc analyses using an earlier dataset (based on the COVID-19 HGI Release 3 in September 2020), from before the availability of vaccines across the US and Europe, in an attempt to capture a better phenotype. We did observe evidence for an association with GERD, though the estimated effect sizes for overall risk of COVID-19 were still lower than those derived for COVID-19 severity and risk of COVID-19 hospitalization (Supplementary Table 14). Lastly, our analysis found very little evidence for reverse causality from COVID-19 susceptibility to GERD, though our power was limited as there are only a handful of COVID-19 associated variants. Due to the nature of our study, we were also unable to directly compare our findings to those evaluating the effect of duration of PPI use on COVID-19 severity (3). Genetically derived findings are conceivably less biased by confounding and reverse causality; however, issues of residual pleiotropy cannot be completely

We obtained the largest GWAS summary data for GERD susceptibility from a study of 71,522 GERD cases and 261,039 controls of European ancestry from the UK Biobank and the Australian QSKIN cohorts (20). Both UK Biobank and QSKIN are population-based cohorts with predominantly middle-aged participants.
(26, 27) Details on the genotyping, genetic QC and imputation for these studies have been previously described (26, 27). GERD cases in both studies were derived through a combination of self-report, ICD-10 codes (K21), GERD-related medication or heartburn status (QSKIN only), although we previously found strong genetic correlations between broad and clinical GERD definitions (20). For the instrumental variable analyses, 88 GERD-associated variants were selected based on SNPs which reached genome-wide significance in a multi-trait GERD GWAS meta-analysis (21).

Initiative website (https://www.covid19hg.org/results/) (28, 29). For the genetic analyses, we used the genetic summary statistics for each COVID-19 trait excluding the participants from both 23andMe and UK Biobank (January 2021 data release) to prevent bias in the genetic causal inference analyses.

initiative are available in the COVID-19 HGI partners interactive website (https://www.covid19hg.org/partners/). In total, GWAS data from 26 individual studies were meta-analysed to generate the combined GWAS summary statistics, including up to 32,494 cases and more than 1 million controls. Information on the analysis protocol and how the genetic data from each study were quality controlled is provided in Supplementary Notes. We manually mapped the chromosome:basepair variant notation in the GWAS summary statistics back to RSID (under build 37) using the 1000G European reference panel for all our analyses. The three primary COVID-19 related outcomes evaluated in this study are provided below.

Genetically predicted susceptibility towards GERD, derived through a combination of self-report status, ICD-10 codes, hospital records and use of GERD-related medications such as omeprazole. Note that we have previously shown that the GERD phenotypes defined through these definitions were highly consistent, i.e.
the genetic correlations between self-report, medication-inferred and clinical GERD definitions were very close to one (20).

Genetic Table 1. Genetic correlation analysis: The LD-score regression (30) (32). To avoid over-inflation of effect sizes from the multi-trait model, we adopted genetic effect size estimates for the 88 SNPs from the published (univariate) GERD GWAS (20). Harmonisation of effect alleles was performed to remove variants with palindromic alleles that cannot be under weak violation of key MR assumptions. Technical details of these alternative models have been previously described (35, 37, 38).

To aid the interpretation of our association estimates, we scaled our results to reflect the log(OR) on COVID-19 outcomes per doubling of odds of developing GERD. This was done by multiplying the resultant genetically derived log(OR) estimate from the IVW model by log(2) = 0.693 (natural logarithm) (25). We also checked for reverse causality by performing an MR analysis of COVID-19 outcomes on GERD susceptibility (see Supplementary Notes). All statistical analyses were performed using the statistical software R v4.0.3. Genetic instrumental variable analyses were conducted using the TwoSampleMR and MendelianRandomization R packages (39, 40).

To evaluate potential horizontal pleiotropy between GERD and COVID-19 susceptibility, we first estimated the genetic effect sizes of the 88 GERD SNPs on a series of established risk factors (9) for COVID-19 susceptibility. These risk factors include: body mass index (BMI), type-2 diabetes (T2D), smoking traits (cigarettes per day, smoking initiation) and coronary artery disease (CAD). The largest and most recent GWAS summary statistics for these risk factors were obtained from publicly available repositories (13–16). Details of the data sources and curation of GWAS datasets for these phenotypes are available in the Supplementary Note (sources in Supplementary Table 2).
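The IVW estimate and the per-doubling rescaling described in this section can be illustrated with a minimal sketch. It assumes a fixed-effect inverse-variance weighted estimator computed from per-SNP summary statistics; it is not the authors' pipeline (which used the TwoSampleMR and MendelianRandomization R packages), and the function names `ivw_estimate` and `per_doubling_or`, as well as all numbers in the toy data, are hypothetical and chosen only for illustration.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate (log OR per unit log-odds of exposure) and its SE.

    Assumption: uncorrelated instruments and first-order weights 1 / se_outcome^2.
    """
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    total_weight = sum(bx ** 2 / se ** 2
                       for bx, se in zip(beta_exposure, se_outcome))
    return numerator / total_weight, math.sqrt(1.0 / total_weight)

def per_doubling_or(log_or, se):
    """Rescale a log(OR) to 'per doubling of odds of exposure' by multiplying by ln(2)."""
    scale = math.log(2)  # approximately 0.693
    centre = log_or * scale
    half_width = 1.96 * se * scale
    return math.exp(centre), math.exp(centre - half_width), math.exp(centre + half_width)

# Hypothetical toy data (NOT from the study): four SNP effects on GERD (log odds)
# and on a COVID-19 outcome, constructed so the true ratio is exactly 0.2.
beta_gerd = [0.020, 0.030, 0.025, 0.040]
beta_covid = [0.2 * b for b in beta_gerd]
se_covid = [0.010, 0.012, 0.011, 0.015]

log_or, se = ivw_estimate(beta_gerd, beta_covid, se_covid)
or_doubling, ci_low, ci_high = per_doubling_or(log_or, se)
print(f"IVW log(OR) = {log_or:.3f}; OR per doubling of GERD odds = {or_doubling:.3f} "
      f"(95% CI {ci_low:.3f} to {ci_high:.3f})")
```

With only four made-up instruments the confidence interval is very wide; the sketch is meant to show the arithmetic of the rescaling, not to reproduce any estimate reported here.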
Genetic effect size estimates for each risk factor were aligned based on the GERD-increasing allele. We then explored whether the pattern of genetic correlation/causality changed when risk factors such as body mass index and smoking were adjusted for in the model. We implemented the GCTA-mtCOJO

We repeated our MR analyses between GERD and COVID-19 outcomes after adjusting for each COVID-19 risk factor in turn. This was done by regressing the GERD instrumental variables against the genetic estimate on COVID-19 susceptibility after conditioning on the aforementioned risk factors via mtCOJO. Similarly, MR estimates of the covariate-adjusted model derived from alternative MR techniques were used to validate the robustness of the IVW findings. Given that obesity is potentially a major risk factor for both GERD and COVID-19 susceptibility, we performed a leave-one-SNP-out MR analysis to assess whether the genetic association between these traits is potentially driven by specific variants, such as those in BMI-associated genes (e.g. the FTO gene).

BMI: Body mass index; CAD: Coronary artery disease; T2D: Type-II diabetes. For every COVID-19 outcome, each row represents the estimated IVW OR on risk of COVID-19 based on the unadjusted (original) model or a model adjusted for known COVID-19 risk factors using GCTA-mtCOJO. Error bars represent the 95% CI around the derived OR estimate. The smoking-adjusted model includes adjustment for both cigarettes smoked per day and smoking initiation. Among all the aforementioned risk factors, the BMI-adjusted model showed the largest attenuation of effect towards the null. However, none of these covariate-adjusted estimates were meaningfully different from the unadjusted model.

COVID-19 pathophysiology: A review
Vaccines for COVID-19
Increased Risk of COVID-19 Among Users of Proton Pump Inhibitors
Proton Pump Inhibitors are Risk Factors for Viral Infections: Even for COVID-19?
COVID-19 and the gastrointestinal tract: more than meets the eye
The prevalence of symptoms in 24 COVID-19): A systematic review and meta-analysis of 148 studies from 9 countries
Risk factors for gastroesophageal reflux disease, reflux esophagitis and non-erosive reflux disease among Chinese patients undergoing upper gastrointestinal endoscopic examination
Risk factors for Covid-19 severity and fatality: a structured literature review
Gastroesophageal Reflux Disease (GERD)
Risk factors for gastroesophageal reflux disease and analysis of genetic contributors
Mendelian Randomization: New Applications in the Coming Age of Hypothesis-Free Causality
Meta-analysis of genome-wide association studies for height and body mass index in ∼700000 individuals of European ancestry
Genome-wide association analyses identify 143 risk variants and putative regulatory mechanisms for type 2 diabetes
Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use
Association analyses based on false discovery rate implicate new loci for coronary artery disease
Severe clinical outcomes of COVID-19 associated with proton pump inhibitors: a nationwide cohort study with propensity score matching
Causal Inference for Genetic Obesity, Cardiometabolic Profile and COVID-19 Susceptibility: A Mendelian Randomization Study
Genetic evidence that higher central adiposity causes gastro-oesophageal reflux disease: a Mendelian randomization study
Gastroesophageal reflux GWAS identifies risk loci that also associate with subsequent severe esophageal diseases
Multitrait genetic association analysis identifies 50 new risk loci for gastro-oesophageal reflux, seven new loci for Barrett's oesophagus and provides insights into clinical heterogeneity in reflux diagnosis
Statistical correction of the Winner's Curse explains replication variability in quantitative trait genome-wide association studies
A systematic review and meta-analysis of published research data on COVID-19 infection fatality rates
Estimation of pleiotropy between complex diseases using single-nucleotide polymorphism-derived genomic relationships and restricted maximum likelihood
Mendelian randomization with a binary exposure variable: interpretation and presentation of causal estimates
Cohort profile: the QSkin Sun and Health Study
The UK Biobank resource with deep phenotyping and genomic data
Genomewide Association Study of Severe Covid-19 with Respiratory Failure
The COVID-19 Host Genetics Initiative, a global initiative to elucidate the role of host genetic factors in susceptibility and severity of the SARS-CoV-2 virus pandemic
LD Score regression distinguishes confounding from polygenicity in genome-wide association studies
Estimating missing heritability for disease from genome-wide association studies
Calculating statistical power in Mendelian randomization studies
Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression
Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption
Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases
Robust methods in Mendelian randomization via penalization of heterogeneous causal estimates
A Primer in Mendelian Randomization Methodology with a Focus on Utilizing Published Summary Association Data
Evaluating the potential role of pleiotropy in Mendelian randomization studies
The MR-Base platform supports systematic causal inference across the human phenome
MendelianRandomization: an R package for performing Mendelian randomization analyses using summarized data
Causal associations between risk factors and common diseases inferred from GWAS summary data
Adjusting for heritable covariates can bias effect estimates in genome-wide association studies

JSO has full access to all of the data in the study and takes responsibility
for the integrity of the data and the accuracy of the data analysis.

The funding bodies for our study had no role in the design or conduct of the study; collection, management, analysis and/or interpretation of the data; preparation, review or approval of the manuscript; or the decision to submit the manuscript for publication.

The summary statistics for the GERD GWAS meta-analysis can be downloaded from https://doi.org/10.6084/m9.figshare.8986589. GWAS summary statistics for the COVID-19 outcomes provided by the COVID-19 HGI (Release 5) can be accessed at https://www.covid19hg.org/results/r5/. The individual GWAS summary statistics for traits used in the mediation analyses are publicly available, with data sources provided in Supplementary Notes.

All authors declare no conflicts of interest.