key: cord-0793615-1d07b4y0 authors: Rescenko, R.; Peculis, R.; Ustinova, M.; Ansone, L.; Litvina, H. D.; Terentjeva, A.; Birzniece, L.; Megnis, K.; Kolesova, O.; Rozentale, B.; Viksna, L.; Rovite, V.; Klovins, J. title: Replication of LZTFL1 gene region as a susceptibility locus for COVID-19 in Latvian population. date: 2021-04-06 journal: nan DOI: 10.1101/2021.03.31.21254708 sha: 9af18c543b189e33cbf1797791c0061d08912028 doc_id: 793615 cord_uid: 1d07b4y0
The severity of COVID-19 disease is partly determined by host genetic factors that have been reported by GWAS. We evaluated nine previously reported genome-wide significant associations, regardless of disease severity, in a representative sample from the population of Latvia. Our cohort consisted of 475 SARS-CoV-2-positive cases, of whom 146 were hospitalized, and 2217 controls. We found three variants from the Neanderthal introgression event at the 3p21.31 region to be significantly associated with increased risk of SARS-CoV-2 infection and hospitalization status. The strongest association was displayed by rs71325088, with Bonferroni-adjusted P=0.007, OR=1.46 [95% CI 1.17-1.81]. We performed fine-mapping by exploring a 1 Mb region at the 3p21.31 locus and identified 9 SNPs with even lower p-values, with the strongest association estimated for rs2191031 (P=5e-05, OR=1.40 [95% CI 1.19-1.64]), located in the LZTFL1 gene. We show clear replication of the 3p21.31 locus in an independent cohort, which favors further functional investigation of the leading variants.
females was 64.5% in the case group and 62.0% in the controls. After genotype quality control and imputation using the TOPMed r2 imputation server, we performed an association study with sex and age as covariates. Association was performed with PLINK 1.9 (Chang et al., 2015) according to the parameters defined in SAIGE software (Zhou et al., 2020), adding the first 20 PCs to control for population stratification.
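The odds ratios, confidence intervals, and Bonferroni-adjusted p-values quoted above are linked by simple arithmetic, which the following Python sketch illustrates. It assumes a Wald-type 95% interval on the log-odds scale and a correction for the nine tested variants. The function names are hypothetical, and because the inputs are the rounded published values for rs71325088, the result only approximates the reported figures rather than reproducing the authors' computation.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def wald_from_or_ci(odds_ratio, ci_low, ci_high):
    """Back-calculate SE, z and a two-sided p-value from an OR and its 95% CI.

    Assumes the interval is a Wald interval, i.e. symmetric on the log scale.
    """
    beta = math.log(odds_ratio)  # log-odds effect size
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


def bonferroni(p, n_tests):
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    return min(1.0, p * n_tests)


# Rounded values reported for rs71325088; nine variants were tested.
se, z, p = wald_from_or_ci(1.46, 1.17, 1.81)
p_adj = bonferroni(p, 9)
print(f"SE={se:.4f}  z={z:.2f}  p={p:.2e}  Bonferroni p={p_adj:.4f}")
```

With rounded inputs, the adjusted value should land close to, but not exactly on, the published adjusted P.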
We selected a total of nine significantly associated SNPs reported in the GWAS catalog from two previously published studies analyzing the association between severe COVID-19 cases and population controls (Pairo-Castineira et al., 2021; The Severe Covid-19 GWAS Group, 2020). Table 1 summarizes the list of the selected SNPs. No significant deviation in allele frequencies (AF) was found between the control group and the global AFs from the 1000G Phase 3 reference set (The 1000 Genomes Project Consortium, Auton, et al., 2015). Out of the nine selected variants, we identified three significantly associated LZTFL1 gene polymorphisms rs71325088 (0. (Table 1). We observed a high degree of linkage disequilibrium (LD) between all three SNPs, reflected by almost identical frequencies in the case and control groups. These variants were also the most significant (Phet = 7.2e-25) in the worldwide meta-analysis of individuals with SARS-CoV-2 infection, hospitalization, and critical illness (The COVID-19 Host Genetics Initiative and Ganna, 2021). All of these variants lie in a region believed to be a remnant of Neanderthal gene pool introgression into the modern human population (Zeberg and Pääbo, 2020). The LZTFL1 gene has been implicated in ciliogenesis and intracellular trafficking of ciliary proteins, probably impacting airway epithelial cell function (Promchan et al., 2020; Shelton et al., 2021). Even though the primary analysis was performed on a case group that included all patients regardless of severity, there is an obvious bias toward the inclusion of symptomatic patients, as they are more likely than asymptomatic cases to be tested for SARS-CoV-2. We also tested the association of the selected SNPs with disease severity in our study group, using hospitalized patients as cases against the same group of controls (Table 1).
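The link drawn above between high LD and near-identical allele frequencies can be illustrated with a small Python sketch. It uses the squared Pearson correlation of unphased genotype counts (0, 1, or 2 copies of the risk allele), a common proxy for LD r²; this is not necessarily how the authors estimated LD. The genotype vectors and function names are hypothetical toy values, not study data.

```python
def allele_frequency(genotypes):
    """Frequency of the counted allele, given 0/1/2 genotype counts."""
    return sum(genotypes) / (2 * len(genotypes))


def genotype_r2(x, y):
    """Squared Pearson correlation between two genotype count vectors."""
    if len(x) != len(y) or not x:
        raise ValueError("genotype vectors must be non-empty and equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return float("nan")  # monomorphic variant: correlation is undefined
    return (sxy * sxy) / (sxx * syy)


# Hypothetical genotypes for eight individuals at two nearby variants;
# they differ in a single individual, so LD is high but not perfect.
snp_a = [0, 1, 2, 0, 1, 0, 0, 1]
snp_b = [0, 1, 2, 0, 1, 0, 0, 0]

r2_ab = genotype_r2(snp_a, snp_b)
print(f"AF(a)={allele_frequency(snp_a):.3f}  AF(b)={allele_frequency(snp_b):.3f}  r2={r2_ab:.2f}")
```

Variants in strong LD are inherited together, so their allele frequencies in any subgroup track each other closely, which is the pattern described for the three replicated SNPs.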
All three SNPs from the 3p21.31 region displayed a stronger association in this comparison (P=6e-05, OR=2.14 [95% CI 1.54-2.99] and 4.2 times higher homozygote prevalence for
All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
We also explored the 500 kb regions around rs11385942 to evaluate the association of other SNPs with the main trait in our study. In total, nine 3p21.31 locus polymorphisms (rs2191031, rs3774641, rs7289367, rs35896106, rs13071258, rs34668658, rs17763742, rs17763537 and rs73062389) displayed lower p-values in our cohort than the leading rs11385942, with the strongest association estimated for rs2191031 (P=5e-05, OR=1.40 [95% CI 1.19-1.64]), located in the LZTFL1 gene (Figure 1). rs2191031 is 2.3 times more prevalent in our population than the replicated polymorphisms. To date, no function or clinical relevance has been ascribed to this variant; however, it is equivalently associated with differential expression in the esophagus mucosa (1e-5, NES -0.22) (GTEx Consortium, 2018) and multiple other tissues, pointing to a similar etiology as the replicated variants (Shelton et al., 2021). It is important to note, however, that none of these nine variants had a lower P value than rs11385942 in a cohort of hospitalized COVID-19 patients.
Using retrospectively collected samples from a population-based biobank as controls has some limitations. We cannot assess exposure to SARS-CoV-2 for this group of people and match it with the case group, as we have no information on the possible rate of infection among them. However, such a design does not increase type I error, and we did not aim to find genetic factors facilitating protection against SARS-CoV-2 infection, an aim for which such a design would not be appropriate.
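The regional scan around rs11385942 described above amounts to a window filter over the association results: keep variants within 500 kb of the lead SNP whose p-value is lower than the lead's. The Python sketch below illustrates that step under stated assumptions. The record layout, function name, variant identifiers, positions, and p-values are all hypothetical placeholders, not values from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assoc:
    snp: str    # variant identifier (placeholder names below)
    pos: int    # base-pair position on the chromosome (hypothetical)
    p: float    # association p-value (hypothetical)


def stronger_than_lead(results, lead_id, flank_bp=500_000):
    """Return variants within +/- flank_bp of the lead SNP that have a
    lower p-value than the lead, sorted from most to least significant."""
    lead = next(r for r in results if r.snp == lead_id)
    hits = [
        r for r in results
        if r.snp != lead_id
        and abs(r.pos - lead.pos) <= flank_bp
        and r.p < lead.p
    ]
    return sorted(hits, key=lambda r: r.p)


# Toy association results; "lead" stands in for the reference variant.
results = [
    Assoc("lead", 1_000_000, 1e-3),
    Assoc("snp1", 1_200_000, 5e-5),   # inside window, stronger than lead
    Assoc("snp2", 1_600_000, 1e-6),   # stronger, but outside the window
    Assoc("snp3",   900_000, 2e-3),   # inside window, weaker than lead
    Assoc("snp4",   600_000, 4e-4),   # inside window, stronger than lead
]

hits = stronger_than_lead(results, "lead")
print([r.snp for r in hits])
```

A filter of this kind only ranks variants by statistical signal; as noted above, it does not by itself establish function or clinical relevance.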
Another shortcoming is that most patients included in this study were recruited in December 2020, and extensive phenotype data, including follow-ups, are not yet available for analysis. Such data would allow elaborating the interaction between different clinical data and genotype and including the development of post-COVID-19 complications as an essential phenotype for association study.
In conclusion, we demonstrate supportive evidence for the involvement of the human 3p21.31 locus in the pathophysiology of COVID-19 disease using an independent cohort of patients and controls from the Latvian population. It highlights the importance of this genomic region for genetic risk estimation in relation to SARS-CoV-2 infection and the robustness of proper genetic association studies for replication purposes. Notably, the results presented here provide a preliminary indication of variants with possible functional effects and call for further studies exploring the validation of these variants.
The rs11385942 variant is emphasized with a diamond shape and set as the reference for LD estimation with other SNPs in the region. Rs codes for the three SNPs selected for replication are depicted in dark blue, while rs codes for other SNPs displaying a lower p-value in our association study are in light blue.
COVID-19 related mortality from UK Biobank data
Mendelian randomization analysis identified genes pleiotropically associated with the risk and prognosis of COVID-19
Lifelines Cohort Study, 2020. Lack of Association Between Genetic Variants at ACE2 and TMPRSS2 Genes Involved in SARS-CoV-2 Infection and Human Quantitative Phenotypes
Genetic mechanisms of critical illness in COVID-19
The Clinical Manifestations and Chest Computed Tomography Findings of Coronavirus Disease 2019 (COVID-19) Patients in China: A Proportion Meta-Analysis
Leucine zipper transcription factor-like 1 binds adaptor protein complex-1 and 2 and participates in trafficking of transferrin receptor 1