key: cord-0327701-qauso93y
authors: Patil, S.; Pemmasani, S. K.; Chitturi, N.; Bhatnagar, I.; Acharya, A.; Subash, L. V.
title: COVID-19 and Indian population: a comparative genetic analysis
date: 2021-12-15
journal: nan
DOI: 10.1101/2021.12.15.21267816
sha: 83b8320332d6c66f4f0f68958b00394d272f030f
doc_id: 327701
cord_uid: qauso93y

Background: Major risk factors of COVID-19 include older age, male gender, and comorbidities. In addition, host genetic makeup is known to play a major role in COVID-19 susceptibility and severity. To assess the genetic predisposition of the Indian population to COVID-19, we compared the frequencies of polymorphisms directly or potentially associated with COVID-19 susceptibility, severity, immune response, and fatal outcomes between the Indian population and other major populations (European, African, East Asian, South Asian, and American).

Materials and methods: Polymorphisms directly or potentially associated with COVID-19 susceptibility, severity, immune response, and mortality were mined from genetic association studies, comparative genetic studies, and expression quantitative trait loci studies, among others. Genotype data for these polymorphisms were either sourced from the GenomegaDB database of Mapmygenome India Ltd. (sample size = 3054; Indian origin) or were imputed. Polymorphisms with minor allele frequency >= 0.05 that were in Hardy-Weinberg equilibrium in the Indian population were considered for allele frequency comparison between the Indian population and the 1000 Genomes population groups.

Results: Allele frequencies of 421 polymorphisms were found to be significantly different in the Indian population compared to European, African, East Asian, South Asian, and American populations. 128 polymorphisms were shortlisted based on linkage disequilibrium and were analyzed in detail.
Apart from well-studied genes such as ACE2, TMPRSS2, ADAM17, and FURIN, variants from AHSG, IFITM3, PTPN2, CD209, CCL5, HEATR9, SELENBP9, AGO1, HLA-G, MX1, ICAM3, MUC5B, CRP, C1GALT1, and other genes were also found to be significantly different in the Indian population. These variants might be implicated in COVID-19 susceptibility and progression.

Conclusion: Our comparative study identified multiple genetic variants whose allele frequencies were significantly different in the Indian population and that might have a role in COVID-19 susceptibility, severity, and fatal outcomes. This study can be useful for selecting candidate genes/variants for future COVID-19 related genetic association studies.

Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has strained the healthcare systems and economies of a majority of nations and has crippled the human population [1, 2, 3]. COVID-19 is characterized by a range of clinical features, which are categorized as most common, less common, and rare (severe). The most common symptoms are fever, dry cough, and fatigue, while less common symptoms are pneumonia without noticeable hypoxemia, sputum production, sore throat, headache, chest pain, and diarrhea [4, 5]. Acute respiratory distress syndrome, sepsis, acute cardiac injury, heart failure, acute kidney injury, and hypoxic encephalopathy are more common in severe cases [4, 5, 6]. Severe patients show elevated levels of pro-inflammatory cytokines, indicating the presence of a cytokine storm [4]. Self-reported loss of smell and taste was also observed in some cases [7].

Major risk factors for COVID-19 severity are older age, male gender, and comorbidities. A higher case fatality rate (CFR) has been observed in older adults than in the younger population [8, 9, 10, 11, 12]. Relatively fewer deaths are consistently observed among females than among males [10, 11, 12].
COVID-19 patients with comorbidities such as hypertension, obesity, diabetes, and others are more likely to develop severe complications than those without any underlying diseases [13, 14, 15, 16, 17, 18]. ABO blood groups also seem to contribute to COVID-19 related risk. While blood group A is associated with an increased risk of severe COVID-19 outcomes, blood group O seems to be protective [19, 20, 21].

In addition to the above risk factors, host genetics can modulate susceptibility to and severity of the disease by regulating viral entry and host immune response [22, 23]. For example, rs12329760 (p.Val197Met) affects the stability of the TMPRSS2 protein, which is implicated in SARS-CoV-2 entry into host cells. This polymorphism was found to be less frequent in severe COVID-19 cases than in others, suggesting a protective role against severe outcomes of COVID-19 [24]. The C allele of rs12252 in the IFITM3 gene, which is known to inhibit influenza virus entry into host cells, was found to be a genetic risk factor for COVID-19

genetics in COVID-19 susceptibility and severity in the Indian population, we analyzed the frequencies of the polymorphisms that are directly or indirectly implicated in COVID-19. We have also compared the Indian population with other populations to understand the global distribution of allele frequencies.

(Table 1), eighteen polymorphisms implicated in the immune response to SARS-CoV-2 infection (Table 2), fifty-eight polymorphisms directly or indirectly associated with COVID-19 severity (Table 3), nine polymorphisms associated with COVID-19 related mortality/fatal outcomes (Table 4), and twelve polymorphisms associated with comorbidities of COVID-19 (Table 5)

Supplementary Table 3).
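The variant filters named in the Materials and methods summary (minor allele frequency >= 0.05 and Hardy-Weinberg equilibrium) and the allele frequency comparison between two populations can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the source does not state which statistical tests or significance thresholds were used, so the Pearson chi-square tests, the `hwe_alpha` cutoff, all function names, and the genotype counts in the usage lines are assumptions.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped samples")
    freq_a = (2 * n_aa + n_ab) / (2 * n)
    return min(freq_a, 1.0 - freq_a)


def hwe_p_value(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test against Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped samples")
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return chi2_sf_1df(chi2)


def passes_filters(n_aa, n_ab, n_bb, maf_min=0.05, hwe_alpha=1e-6):
    """Keep a variant if MAF >= maf_min and HWE is not rejected.

    maf_min follows the source; hwe_alpha is an assumed cutoff.
    """
    maf_ok = minor_allele_frequency(n_aa, n_ab, n_bb) >= maf_min
    hwe_ok = hwe_p_value(n_aa, n_ab, n_bb) >= hwe_alpha
    return maf_ok and hwe_ok


def allele_frequency_p_value(alt_1, total_1, alt_2, total_2):
    """Pearson chi-square test on a 2x2 table of allele counts.

    alt_* are counts of the allele of interest; total_* are total allele
    counts (twice the number of individuals) in each population.
    """
    a, b = alt_1, total_1 - alt_1
    c, d = alt_2, total_2 - alt_2
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # one margin is empty, so no difference can be tested
    chi2 = n * (a * d - b * c) ** 2 / denom
    return chi2_sf_1df(chi2)


# Usage with made-up genotype and allele counts (not data from the study).
print(passes_filters(250, 500, 250))
print(allele_frequency_p_value(120, 1000, 60, 1000))
```

When many polymorphisms are compared across several populations, a multiple-testing correction would normally be applied to the resulting p-values; the sketch leaves that step out.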
In the 'COVID-19 susceptibility' category, the number of variants with the highest risk allele frequency was lower than the number of variants with the lowest risk allele frequency in the IND population. A similar trend was observed in EAS and AFR populations. However, the opposite trend was observed in EUR and AMR populations (Fig. 1A). In IND and AFR populations, COVID-19 severity-related variants with the highest risk allele frequencies outnumbered the lowest-frequency variants, while the opposite was found in EAS, EUR, and AMR populations (Fig. 1B). The number of variants associated with COVID-19 related mortality and having the highest risk allele frequency was lower than the number of variants with the lowest risk allele frequency in IND, EAS, and AFR populations, while in SAS and AMR only the highest-frequency variants were observed (Fig. 1C).

4 DISCUSSION

SARS-CoV-2 gains entry into human cells by binding its spike protein to ACE2 receptors. ACE2, a part of the renin-angiotensin-aldosterone system (RAAS), is crucial for regulating blood pressure, maintaining electrolyte balance in the body, and the physiology of the heart, kidneys, and lungs [44, 45]. Genetic variations in ACE2 are known to be associated with its increased expression levels in COVID-19 patients. These variants affect ACE2 transcription/translation and also alter binding affinity to the SARS-CoV-2 spike protein, thereby affecting the susceptibility to SARS-

medRxiv preprint doi: https://doi.org/10.1101/2021.12.15.21267816; this version posted December 15, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

0.078 (Table 4), which is significantly greater than only the European population (0.046).
Its frequency is highest in the African population (0.256).

Twelve polymorphisms associated with comorbidities of COVID-19 were also analyzed (Table 5)

Associations Between Genetically Predicted Protein Levels and COVID-
Vitamin D deficiency is associated with COVID-19 positivity and severity of the disease
Does vitamin D deficiency increase the severity of COVID-19?
Current State of Evidence: Influence of Nutritional and Nutrigenetic Factors on Immunity in the COVID-19 Pandemic Framework
SPEG interacts with myotubularin, and its deficiency causes centronuclear myopathy with dilated cardiomyopathy
Molecular insights into cardiomyopathies associated with desmin (DES) mutations
Genetic variants are identified to increase risk of COVID-19 related mortality from UK Biobank data
Impact of population density on Covid-19 infected and mortality rate in India
Implications of the second wave of COVID-19 in
Effects of host genetic variations on response to, susceptibility and severity of respiratory infections
Dectin-1 and DC-SIGN polymorphisms associated with invasive pulmonary Aspergillosis infection
Evaluación de variantes en los genes IL6R, TLR3 y DC-SIGN asociadas con dengue en una población colombiana muestreada
Variability in genes related to SARS-CoV-2 entry into host cells (ACE2, TMPRSS2, TMPRSS11A, ELANE, and CTSL) and its potential use in association studies
The Immune Response and Immunopathology of COVID-19
Dipeptidyl peptidase-4 (DPP4) inhibition in COVID-19
Genetic Associations With Plasma Angiotensin Converting Enzyme 2 Concentration: Potential Relevance to COVID-19 Risk
The transcription factor HNF1α induces expression of angiotensin-converting enzyme 2 (ACE2) in pancreatic islets from evolutionarily conserved promoter motifs
Infectivity and Progression of COVID-19 Based on Selected Host Candidate Gene Variants
IFITM proteins promote SARS-CoV-2 infection and are targets for virus inhibition in vitro
Association analysis framework of genetic and exposure risks for COVID-19 in middle-aged and elderly adults
Genome-wide association study identifies loci affecting blood copper, selenium and zinc
Identification of a Novel Susceptibility Marker for SARS-CoV-2 Infection in Human Subjects and Risk Mitigation with a Clinically Approved JAK Inhibitor in
First comprehensive computational analysis of functional consequences of TMPRSS2 SNPs in susceptibility to SARS-CoV-2 among different populations
A bioinformatic approach to investigating cytokine genes and their receptor variants in relation to COVID-19 progression
Heatr9 is an infection responsive gene that affects cytokine production in alveolar epithelial cells
Role of ICAM-3 in the initial interaction of T lymphocytes and APCs
Human gene polymorphisms and their possible impact on the clinical outcome of SARS-CoV-2 infection. Arch Virol
The African-American population with a low allele frequency of SNP rs1990760 (T allele) in IFIH1 predicts less IFN-beta expression and potential vulnerability to COVID-19 infection
The interferon gamma gene polymorphism +874 A/T is associated with severe acute respiratory syndrome
Studying the impact of nutritional immunology underlying the modulation of immune responses by nutritional compounds-a review
The 677C→T variant of MTHFR is the major genetic modifier of biomarkers of folate status in a young, healthy Irish population
Impact of Genetic Polymorphisms on Human
Angiopoietin-2 as a marker of endothelial activation is a good predictor factor for intensive care unit admission of COVID-19 patients
A dramatic rise in serum ACE2 activity in a critically ill COVID-19 patient
SARS-CoV-2 Receptor ACE2 Gene Is Associated with Hypertension and Severity of COVID 19: Interaction with Sex, Obesity, and Smoking
Polymorphisms in the ACE2 locus associate with severity of COVID-19 infection
Genomic atlas of the human plasma proteome. Nature
Elevated level of C-reactive protein may be an early marker to predict risk for severity of COVID-19
CRP-level-associated polymorphism rs1205 within the CRP gene is associated with 2-hour glucose level: The SAPPHIRe study
CRP/IL-6/IL-10 Single-Nucleotide Polymorphisms Correlate with the Susceptibility and Severity of Community-Acquired Pneumonia
Association of ICAM3 genetic variant with severe acute respiratory syndrome
IL1RN coding variant is associated with lower risk of acute respiratory distress syndrome and increased plasma IL-1 receptor antagonist
SARS-CoV-2 and COVID-19: Is interleukin-6 (IL-6) the 'culprit lesion' of ARDS onset? What is there besides Tocilizumab? SGP130Fc
Age-dependent impact of the major common genetic risk factor for COVID-19 on severity and mortality
Post-COVID-19 pneumonia lung fibrosis: a worrisome sequelae in surviving patients
A common MUC5B promoter polymorphism and pulmonary fibrosis
Common variants at 21q22.3 locus influence MX1 and TMPRSS2 gene expression and susceptibility to severe COVID-19. iScience
Genetic variants are identified to increase risk of COVID-19 related mortality from UK Biobank data