key: cord-0813842-xovapqi9 authors: Chen, W.; Zeng, Y.; Suo, C.; Yang, H.; Chen, Y.; Hou, C.; Hu, Y.; Ying, Z.; Sun, Y.; Qu, Y.; Lu, D.; Fang, F.; Valdimarsdóttir, U. A.; Song, H. title: Genetic predisposition to psychiatric disorders and risk of COVID-19 date: 2021-02-25 journal: nan DOI: 10.1101/2021.02.23.21251866 sha: fb72b681a587d8a91831fb86755d9ae78ca2a3c5 doc_id: 813842 cord_uid: xovapqi9
Background Pre-pandemic psychiatric disorders have been associated with an increased risk of COVID-19. However, the underlying mechanisms remain unknown, e.g., to what extent genetic predisposition to psychiatric disorders contributes to the observed association.
Methods The analytic sample consisted of white British participants of UK Biobank who were registered in England, had available genetic data, and were alive on Jan 31, 2020 (i.e., the start of the COVID-19 outbreak in the UK) (n=346,554). We assessed individuals' genetic predisposition to different psychiatric disorders, including substance misuse, depression, anxiety, and psychotic disorder, using polygenic risk scores (PRSs). Diagnoses of psychiatric disorders were identified through the UK Biobank hospital inpatient data. We performed a genome-wide association study (GWAS) for each psychiatric disorder in a randomly selected half of the study population who were free of COVID-19 (i.e., the base dataset). For the other half (i.e., the target dataset), a PRS was calculated for each psychiatric disorder using the genetic variants discovered in the base dataset. We then examined the association between the PRS of each psychiatric disorder and the risk of COVID-19, or severe COVID-19 (i.e., hospitalization or death), using logistic regression models. COVID-19 was ascertained through the Public Health England dataset, the UK Biobank hospital inpatient data and death registers, updated until July 26, 2020. For validation, we repeated the PRS analyses based on publicly available GWAS summary statistics.
Results A total of 155,988 participants (including 1,451 COVID-19 cases), with a mean age of 68.50 years at the COVID-19 outbreak, were included in the PRS analysis. Higher genetic liability towards psychiatric disorders was associated with an increased risk of both any COVID-19 and severe COVID-19, especially genetic risk for substance misuse and depression. The adjusted odds ratios (ORs) for any COVID-19 were 1.15 (95% confidence interval [CI] 1.02-1.31) and 1.26 (1.11-1.42) among individuals with a high genetic risk (above the upper tertile of PRS) for substance misuse and depression, respectively, compared with individuals with a low genetic risk (below the lower tertile). Largely similar ORs were noted for severe COVID-19, and similar albeit slightly lower estimates were obtained using PRSs generated from GWAS summary statistics from independent samples.
Conclusion In the UK Biobank, genetic predisposition to psychiatric disorders was associated with an increased risk of COVID-19, including a severe course of the disease. These findings suggest a potential role of genetic factors in the observed phenotypic association between psychiatric disorders and COVID-19, underscoring the need for increased medical surveillance of this vulnerable population during the pandemic.
With over 110 million infected people and 2.4 million related deaths, the coronavirus disease 2019 (COVID-19) pandemic, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has led to an unprecedented crisis worldwide [1]. Evidence suggests that individuals vary in their propensity for being infected with COVID-19 and that infected patients demonstrate heterogeneous outcomes [2]. Identification of populations with increased susceptibility to the disease, especially those prone to a severe disease course, is therefore critical for optimizing preventive measures. Previous studies have reported an increased risk of infections, including life-threatening infections, among individuals with psychiatric disorders [3,4]. Likewise, after the COVID-19 outbreak, accumulating evidence revealed that psychiatric disorders [5], such as depression [2,6], schizophrenia [7], and substance abuse [8], were also associated with an elevated risk of COVID-19, possibly through mechanisms similar to those leading to other infections [9]. Besides the immune dysfunction widely observed among individuals with psychiatric illness [10], other explanations might include an unfavorable lifestyle, such as smoking and physical inactivity [11]. Furthermore, one recent investigation suggested a shared genetic vulnerability to both psychiatric disorders and infection, reporting a strong genetic association between having at least one psychiatric diagnosis and the occurrence of infection [12]. However, to our knowledge, no study has so far explored whether genetic predisposition to psychiatric disorders contributes to susceptibility to COVID-19 infection and a severe disease course.
Based on the results of genome-wide association studies (GWAS), a polygenic risk score (PRS), or the sum of all risk alleles weighted by the effect size of each variant, can be generated to represent an individual's overall genetic risk for a given disease, such as a psychiatric disorder. It can further be used to predict the risk of developing a second disease outcome, and thereby illustrate the genetic association between a disease pair [13-15]. As a continuation of our previous study that demonstrated a robust association between pre-pandemic psychiatric disorders and COVID-19 risk [9], we here aimed to explore potential underlying mechanisms by testing whether genetic predisposition to psychiatric disorders is associated with the risk of SARS-CoV-2 infection and progressive COVID-19 illness using the UK Biobank databases.
medRxiv preprint. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Our study is based on data from the large-scale prospective cohort of UK Biobank, which enrolled 502,507 individuals aged between 40 and 69 years across the UK during 2006-2010. The genotyping data were obtained from 488,377 blood samples collected from participants at baseline, which were assayed using the Applied Biosystems UK BiLEVE and UK Biobank Axiom Arrays [16]. After quality control following the UK Biobank pipeline, genotype imputation was completed using the Haplotype Reference Consortium (HRC) and UK10K haplotype resources as reference panels [16]. Kinship coefficients and principal components (PCs), calculated using the KING tool, were also provided by the UK Biobank. Details of the UK Biobank quality control pipeline and imputation methods have been described previously [16].
The final quality-controlled and imputed genotype dataset, containing more than 93 million autosomal SNPs for 346,554 individuals, was the basis of the present analysis. Phenotypic data such as sex and birth year were collected at recruitment using a questionnaire. Health-related outcomes were obtained through periodically linked data from multiple national datasets, including death registries and inpatient hospital data from across England, Scotland, and Wales [16]. To keep consistent with our previous analysis of the phenotypic association [9], we used the same approach for the ascertainment of psychiatric disorders and COVID-19 in the present study. Briefly, we defined five broad diagnostic categories of psychiatric disorders, including substance misuse, depressive disorders, anxiety disorders, psychotic disorders, and stress-related disorders, based on hospital admissions with a diagnosis of these disorders according to [...] cases from the latter two resources, i.e., hospitalization or death records, were considered 'severe cases'. The genetic analysis was conducted among white British UK Biobank participants who were registered in England at recruitment, were alive and trackable on Jan 31, 2020, and had available genetic data (n=346,554). Standard GWAS quality control was performed. Briefly, we restricted our analysis to autosomal biallelic SNPs and removed variants with a call rate <98%, a minor allele frequency <0.01, or deviation from Hardy-Weinberg equilibrium (P < 10⁻⁶).
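The variant-level filters just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline (which used PLINK): the function names and the list-based genotype encoding are hypothetical, and the Hardy-Weinberg check here is a simple 1-df chi-square test, whereas the source does not say which test was applied.

```python
import math

def hwe_chi2_p(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    Assumption: a chi-square test is used for illustration only; the
    source does not specify the test behind its P < 1e-6 cut-off.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))

def passes_variant_qc(genotypes, min_call_rate=0.98, min_maf=0.01, hwe_p=1e-6):
    """Apply the three variant filters from the text to one biallelic SNP.

    genotypes: list of allele counts (0, 1 or 2), with None for a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called or len(called) / len(genotypes) < min_call_rate:
        return False                          # call rate below 98%
    freq = sum(called) / (2 * len(called))
    if min(freq, 1.0 - freq) < min_maf:
        return False                          # minor allele frequency below 0.01
    counts = [called.count(k) for k in (0, 1, 2)]
    return hwe_chi2_p(*counts) >= hwe_p       # drop strong HWE deviations
```

For example, a SNP with genotype counts 25/50/25 passes all three filters, while one with 500/0/500 (no heterozygotes at all) fails the Hardy-Weinberg check.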
We then removed individuals with a genotyping rate <98% and outlier samples based on an abnormal heterozygosity level, leaving 340,632 participants for further analysis. Details of the quality control are summarized in Supplementary Figure 1.
Because human heterogeneity may limit the portability of PRS between populations, even among those with similar ancestries [18], we performed a GWAS followed by a PRS analysis for each type of the studied psychiatric disorders by splitting the UK Biobank data into a base and a target dataset (study design in Figure 1). To avoid the influence of the phenotypic association (e.g., between depression and COVID-19) on the identification of the genetic background of the exposure trait (e.g., depression), in the first step, we removed all individuals with confirmed COVID-19 (n=1,451). We then performed a GWAS for each trait in a subsample of the study population, namely 50% of the participants randomly selected from the study population (i.e., the base dataset). Second, we calculated a PRS for each exposure trait for the remaining participants (i.e., the target dataset) [19], as the weighted sum of the risk alleles based on the summary statistics derived from the GWAS results of the base dataset. Here, the summary statistics refer to the effect sizes and standard errors of the variants. We computed the PRS under ten p value thresholds (i.e., 5 × 10⁻⁸, 1 × 10⁻⁶, 1 × 10⁻⁴, 1 × 10⁻³, 0.05, 0.1, 0.2, 0.3, 0.4 and 0.5).
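As a rough illustration of the weighted-sum definition and the ten p value thresholds, the following Python sketch scores one individual. It is a simplified sketch under assumptions, not the PLINK-based procedure used in the study: the function names, the dictionary layout and the variant identifiers are hypothetical, variants are included when their p value is at or below the threshold, and steps such as pruning correlated variants are not shown because the text does not describe them.

```python
# The ten p value thresholds listed in the text.
P_THRESHOLDS = (5e-8, 1e-6, 1e-4, 1e-3, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5)

def polygenic_risk_score(dosages, summary_stats, p_threshold):
    """Weighted sum of risk-allele counts over variants passing a threshold.

    dosages: dict variant_id -> risk-allele count (0, 1 or 2) for one person.
    summary_stats: dict variant_id -> (effect_size, p_value) from the base GWAS.
    Assumption: variants with p_value <= p_threshold are included.
    """
    score = 0.0
    for variant, (effect_size, p_value) in summary_stats.items():
        if p_value <= p_threshold and variant in dosages:
            score += effect_size * dosages[variant]
    return score

def scores_by_threshold(dosages, summary_stats, thresholds=P_THRESHOLDS):
    """One PRS per p value threshold, keyed by the threshold."""
    return {t: polygenic_risk_score(dosages, summary_stats, t) for t in thresholds}

# Toy input with made-up variant identifiers and effect sizes.
toy_stats = {"v1": (0.2, 1e-9), "v2": (0.1, 0.03), "v3": (-0.05, 0.45)}
toy_dosages = {"v1": 2, "v2": 1, "v3": 1}
toy_scores = scores_by_threshold(toy_dosages, toy_stats)
```

In the toy input, only `v1` contributes at the strictest threshold, and more variants enter the sum as the threshold is relaxed.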
Related individuals, up to the third degree (i.e., kinship coefficient >0.044) [20], were removed prior to each GWAS or PRS analysis, preferentially retaining individuals with the corresponding phenotype, if any. Furthermore, in a validation dataset including all eligible and unrelated participants (n=287,240), we also generated PRSs for the above psychiatric disorders based on summary statistics from publicly available GWAS [21-24]. PLINK (version 1.9) was used for GWAS and PRS calculation.
First, to validate the ability of each PRS to predict its corresponding psychiatric phenotype, we used logistic regression models to measure the association between the PRS and each psychiatric disorder category in the base dataset, adjusting for sex, birth year, genotyping batch and significant (i.e., p<0.05) PCs for population heterogeneity. Second, in the target dataset, we examined the association between the PRS of a specific psychiatric disorder category and the risk of COVID-19, as well as severe COVID-19, using odds ratios (ORs) with 95% CIs derived from logistic regression models, adjusting for the covariates mentioned above. In addition to considering the standardized PRS as a continuous variable, we also divided participants into low, moderate and high genetic risk groups based on the tertile distribution of the PRS and compared the risk of COVID-19 outcomes using the low genetic risk group (i.e., below the lower tertile of PRS) as the reference. The variance explained by the PRS was assessed as the difference in Nagelkerke's R² (R square) between the full model including the PRS and the basic model adjusting for sex, birth year, genotyping batch and significant PCs.
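The tertile comparison can be illustrated with a small Python sketch. It is a deliberately simplified stand-in: the study derived covariate-adjusted ORs from logistic regression, whereas this example computes an unadjusted OR with a Wald 95% CI from a 2x2 table. The function names and the rank-based tertile cut-offs are assumptions made for illustration, and the counts in the example are invented.

```python
import math

def tertile_groups(scores):
    """Label each score 'low', 'moderate' or 'high' by sample tertile.

    Assumption: cut-offs are taken from the sorted scores at ranks n/3 and 2n/3.
    """
    ranked = sorted(scores)
    n = len(ranked)
    lower_cut = ranked[n // 3]
    upper_cut = ranked[(2 * n) // 3]
    labels = []
    for s in scores:
        if s < lower_cut:
            labels.append("low")
        elif s < upper_cut:
            labels.append("moderate")
        else:
            labels.append("high")
    return labels

def odds_ratio_ci(cases_exposed, controls_exposed, cases_ref, controls_ref, z=1.96):
    """Unadjusted odds ratio and Wald 95% CI versus the reference group."""
    odds_ratio = (cases_exposed * controls_ref) / (controls_exposed * cases_ref)
    se_log_or = math.sqrt(1 / cases_exposed + 1 / controls_exposed
                          + 1 / cases_ref + 1 / controls_ref)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Invented counts: high-risk group (20 cases, 80 non-cases) versus
# low-risk reference group (10 cases, 90 non-cases).
example_or, example_lower, example_upper = odds_ratio_ci(20, 80, 10, 90)
```

An adjusted analysis would replace `odds_ratio_ci` with a logistic regression that includes sex, birth year, genotyping batch and PCs alongside the group indicator.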
To test the robustness of our results, we repeated all the analyses in the validation dataset using PRSs based on summary statistics from publicly available GWAS [21-24]. A 2-sided p<0.05 was considered statistically significant. All the analyses were done with R software, version 4.0.
In total, 340,632 participants [...] Based on the base dataset, the GWAS results were summarized using Manhattan plots in Supplementary Figure 2. In brief, except for stress-related disorders, the PRSs (as continuous variables) were significantly associated with an increased risk of the corresponding psychiatric disorder in the target dataset (Supplementary Tables 2-6). However, the variance explained by the PRSs was moderate, with ORs per one standard deviation increase in PRS ranging between 1.12 (95% CI 1.09-1.15) and 1.17 (95% CI 1.14-1.20).
Using the PRS as a proxy of genetic predisposition to a given psychiatric disorder, we examined the association between the genetic risk of a psychiatric disorder and the risk of COVID-19, and severe COVID-19, in the target dataset. Given the relatively poor accuracy of the PRS for stress-related disorders, we did not proceed with that PRS in this analysis (Supplementary Table 6). While different p value thresholds were used for PRS calculation, the PRS models with the highest R square, interpreted as those explaining the largest share of the variance in the specific psychiatric disorder, were selected as the main models for further analyses (Figure 2).
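The model selection step can be sketched as follows, assuming the log-likelihoods of the fitted logistic models are already available. This is a minimal illustration of the standard Nagelkerke formula and of choosing the threshold with the largest R square gain. The function names are hypothetical and the log-likelihood values in the example are invented, not taken from the study.

```python
import math

def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke's R square from log-likelihoods.

    ll_null: log-likelihood of the intercept-only model.
    ll_model: log-likelihood of the fitted model.
    n: number of observations.
    """
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell

def incremental_r2(ll_null, ll_covariates, ll_full, n):
    """R square gained by adding the PRS to the covariates-only model."""
    return nagelkerke_r2(ll_null, ll_full, n) - nagelkerke_r2(ll_null, ll_covariates, n)

def best_threshold(ll_null, ll_covariates, ll_full_by_threshold, n):
    """P value threshold whose PRS model yields the largest incremental R square."""
    return max(
        ll_full_by_threshold,
        key=lambda t: incremental_r2(ll_null, ll_covariates, ll_full_by_threshold[t], n),
    )

# Invented log-likelihoods for three thresholds, 200 observations.
chosen = best_threshold(-100.0, -95.0, {5e-8: -94.8, 0.05: -94.0, 0.5: -93.0}, 200)
```

Because the null and covariates-only models are shared across thresholds, the selected threshold is simply the one whose full model has the highest log-likelihood. The explicit R square difference is kept to mirror the quantity described in the text.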
Despite the only marginally significant association for anxiety, we observed elevated risks of any COVID-19 in relation to a one standard deviation increase in the PRS of all studied psychiatric disorders (Table 1). After adjusting for all covariates, the most pronounced OR was observed for depression, indicating a 10% increased risk of any COVID-19 per standard deviation increase in the depression PRS. Analysis of the categorized PRS revealed similar results (Figure 3). Notably, for substance misuse and depression, we also observed a dose-response relationship. Compared to individuals with a low genetic risk of depression (PRS