key: cord-0822571-qi6dmgm7 authors: Ravikanth, Vishnubhotla; Sasikala, Mitnala; Naveen, Vankadari; Latha, Sabbu Sai; Parsa, Kishore Venkata Laxmi; Vijayasarathy, Ketavarapu; Amanchy, Ramars; Avanthi, Steffie; Govardhan, Bale; Rakesh, Kalapala; Kumari, Daram Sarala; Srikaran, Bojja; Rao, Guduru Venkat; Reddy, D. Nageshwar title: A variant in TMPRSS2 is associated with decreased disease severity in COVID-19 date: 2021-05-28 journal: Meta Gene DOI: 10.1016/j.mgene.2021.100930 sha: 765858d461ea4cbbf8d2081d33c592d548ab6b9f doc_id: 822571 cord_uid: qi6dmgm7
BACKGROUND: Mortality due to COVID-19 caused by SARS-CoV-2 infection varies among populations. Functionally relevant genetic variations in angiotensin-converting enzyme 2 (ACE2) and transmembrane serine protease 2 (TMPRSS2), two crucial host factors for viral entry, might explain some of this variation.
METHODS: In this comparative study in Indian subjects, we recruited 510 COVID-19 patients and retrieved DNA from 520 controls from a repository. Associations of variants in ACE2 and TMPRSS2 with disease severity were identified by whole exome sequencing (WES, n = 20) and targeted genotyping (n = 1010). Molecular dynamics simulations (MDS) were performed to explore the functional relevance of the variants. Cleavage of spike glycoprotein by wild-type and variant TMPRSS2 was determined in HEK293T cells. Potential effects of confounders on the association between genotype and disease severity were tested (Mantel-Haenszel test).
RESULTS: WES identified a deleterious variant in TMPRSS2 (rs12329760, G > A, p.V160M). The minor allele frequency (MAF) was 0·27 in controls, 0·31 in asymptomatic, 0·21 in mild-to-moderately affected and 0·19 in severely affected COVID-19 patients. Risk of severity increased with decreasing MAF: asymptomatic, odds ratio 0·69 (95% CI 0·52–0·93; p = 0·01); mild-to-moderate, odds ratio 1·89 (95% CI 1·22–2·92; p = 0·004); and severe, odds ratio 1·79 (95% CI 1·11–2·88; p = 0·01).
No confounding effect of diabetes and hypertension was observed on the risk of developing severe COVID-19 with respect to genotype. MDS revealed decreased stability of TMPRSS2 with the 160M variant. Spike glycoprotein cleavage by TMPRSS2 was reduced ~2·4-fold in cells expressing the 160M variant.
CONCLUSION: We demonstrate an association of the TMPRSS2 variant rs12329760 with decreased disease severity in COVID-19 patients from India.
The United States and European countries documented a large number of infections associated with higher mortality in the initial months of the pandemic, compared to South Asian countries. Although rates of infection have increased in South Asian countries, mortality rates have remained low as the pandemic has evolved. 1 These variations could be due to differences in virulence of viral strains 2,3 and host factors 4 including genetic makeup. 5 Few studies are available regarding disease-modifying genetic variations in the host. 6 Based on the known importance of ACE2 and TMPRSS2 in SARS-CoV-2 infection 7,8, two previous studies of large, publicly available genome databases 9,10 identified deleterious variants that might predict differences in COVID-19 disease severity among populations. These studies emphasized the urgent need for these genetic associations to be tested in COVID-19 positive patients. Increased mortality rates in COVID-19 are associated with age of the population, male sex and comorbidities such as diabetes, hypertension and obesity 11,12. In India, COVID-19 related mortality rates are lower (1·6%) than those in the United States (3·01%), Brazil (3·06%) and the United Kingdom (12·0%). 13 Despite the high prevalence of diabetes 14 and metabolic syndrome 15 in India, the observed differences in mortality rates among countries raise an intriguing question: do functionally relevant variants in ACE2 and TMPRSS2 contribute to differences in infection rates and mortality?
The aim of our study was to i) identify functionally relevant variants in ACE2 and TMPRSS2 in COVID-19 positive patients compared to controls, ii) demonstrate the clinical relevance of the variant, and iii) explore the association of the variants with severity of COVID-19. To identify the variants, we sequenced the complete exonic regions of healthy individuals. We report a variant, rs12329760, in the TMPRSS2 gene that is associated with mild symptoms of COVID-19.
In this comparative study, 510 consecutive COVID-19 patients who tested positive for SARS-CoV-2 by qRT-PCR between 5th May 2020 and 30th August 2020 were recruited at AIG Hospitals, a tertiary care hospital in Hyderabad, India. All the participants provided written informed consent. Patients with flu-like symptoms who tested negative for SARS-CoV-2 by qRT-PCR and had a normal CT study were excluded. Whole blood (3 ml) was collected from all COVID-19 patients for genotyping. The diagnosis and classification of COVID-19 patients were established based on the diagnostic criteria of the Diagnostic and Therapeutic Program of Novel Coronavirus Pneumonia (6th Version for Trial Implementation). 16 Patients with a positive qRT-PCR for SARS-CoV-2 were classified into the following categories of COVID-19: i) asymptomatic: no symptoms; ii) mild: headache, dry cough, myalgia with or without ground glass opacities; iii) moderate: fever, breathlessness, and ground glass opacities on imaging; iv) severe: respiratory distress (respiratory rate ≥30 breaths/min), oxygen requirement of more than 6 L/min or requirement of non-invasive/invasive ventilator support, with lesions significantly progressing >50% within 24–48 h on pulmonary imaging.
17 In addition, DNA from healthy individuals (n=520), retrieved from AIG Hospital's DNA repository (collected during 2016-2018, before COVID-19, for other research purposes), was considered as the control group to generate allelic frequencies for the identified variants. The study participants were recruited after the protocol was approved by the Institutional Ethics Committee of AIG Hospitals. All the participants had provided informed consent to collect clinical data and blood samples. Confidentiality of all the samples was maintained.
Key genotyping methods are summarized here, with additional details provided as Supplementary Material. DNA was extracted from whole blood using a commercial kit (Bioserve Biotechnologies, India). Complete exonic regions (n=20 DNA samples considered as controls) were amplified employing the Ion Ampliseq Exome RDY kit and sequenced on a next-generation sequencer (NGS-Ion Proton; Life Technologies, USA). Generated sequences were aligned to the reference human genome and annotated using Ion Reporter. Functionally relevant variants were identified using PolyPhen and SIFT scores. Primers were designed using Primer-Z software 18 (Supplementary Table T2) to amplify the flanking regions of variants in ACE2 and TMPRSS2. All the amplicons were sequenced on the Beckman GeXP system. Genotypes were interpreted using Genome Lab GeXP software (v10·2). Minor allele frequency (MAF) for the Indian ethnicity was calculated based on the genotype data from this study. In silico analysis was performed using I-mutant 19 and PhyreRisk software. 20
The MAF for the rs12329760 variant in TMPRSS2 for various ethnicities was retrieved from the ENSEMBL genome browser, and the MAF generated in this study was used for the Indian ethnicity. Corresponding mortality rates for the ethnicities were retrieved from Worldometer (https://www.worldometers.info/coronavirus/). Pearson's correlation coefficient (r) was used to measure the strength of association between MAF and mortality per million.
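The two calculations described above, MAF from genotype counts and Pearson's r between MAF and mortality, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and all numbers in the example are hypothetical and are not the study's data, and the authors report using SPSS rather than code of this kind.

```python
from math import sqrt


def minor_allele_frequency(n_gg, n_ga, n_aa):
    """Frequency of the minor (A) allele from G/G, G/A and A/A genotype counts."""
    total_alleles = 2 * (n_gg + n_ga + n_aa)  # two alleles per individual
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    return (n_ga + 2 * n_aa) / total_alleles


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical genotype counts (NOT taken from the study)
print(minor_allele_frequency(n_gg=50, n_ga=40, n_aa=10))

# Hypothetical per-population MAF and deaths per million (NOT taken from the study)
maf_by_population = [0.15, 0.22, 0.27, 0.36]
deaths_per_million = [900.0, 650.0, 120.0, 60.0]
print(pearson_r(maf_by_population, deaths_per_million))
```

A negative r from such a calculation would indicate that populations with a higher MAF tend to report lower mortality, which is the direction of association the study examines.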
To assess the genotype-based differences in the mRNA and protein expression of TMPRSS2, paraffin embedded human normal lung tissue blocks were identified and retrieved (n=10) from AIG Hospital's pathology repository. RNA was isolated and converted to cDNA, and relative gene expression was evaluated (SYBR chemistry). Samples were analyzed in duplicate and data were normalized against human GAPDH. Relative gene quantification was performed using the 2^(-ΔΔCT) method 21 and expressed as Log2 fold change. Genotyping for the TMPRSS2 variant was performed following the genotyping protocol. Tissue sections (~0·5 µm) of paraffin embedded blocks were immunostained with rabbit anti-human TMPRSS2 antibodies (Invitrogen, USA), followed by anti-rabbit HRP-conjugated secondary antibodies, and stained with diaminobenzidine (DAB) after antigen retrieval and counterstaining. Images were captured using a light microscope (Olympus, Tokyo, Japan).
The role of the TMPRSS2 variant V160M in spike protein cleavage was studied in a cell culture system employing HEK 293T cells, which do not endogenously express TMPRSS2. HEK 293T cells were procured from ATCC and maintained routinely in DMEM supplemented with 10% fetal bovine serum at 37 °C with 5% CO2. HEK 293T cells were
The study was powered based on existing literature (0·22 MAF for GIH, retrieved from ENSEMBL). 23 A sample size of 313 is adequately powered, at a 95% confidence level, to generate the MAF in controls. Continuous variables were expressed as mean and standard deviation, and categorical variables as proportions. Patient characteristics were compared using ANOVA for continuous variables and the Chi-square or Fisher's exact test for categorical variables. To obtain odds ratios, we compared the genotypes between controls vs asymptomatic, asymptomatic vs mild-to-moderate, and asymptomatic vs severe categories under dominant and recessive genetic models. The "A" allele was considered protective.
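The dominant and recessive comparisons described above can be illustrated with a short Python sketch. It collapses G/G, G/A and A/A genotype counts into a 2x2 table and computes an odds ratio with a Wald (log-based) 95% confidence interval. The interval method, the orientation of the odds ratio, the function names and the example counts are all assumptions made for illustration; the source does not state how SPSS was configured, and the counts are not the study's data.

```python
from math import exp, log, sqrt


def collapse(genotypes, model):
    """Collapse (n_GG, n_GA, n_AA) into (A-carriers, others) under a genetic model."""
    n_gg, n_ga, n_aa = genotypes
    if model == "dominant":   # G/A + A/A versus G/G
        return n_ga + n_aa, n_gg
    if model == "recessive":  # A/A versus G/G + G/A
        return n_aa, n_gg + n_ga
    raise ValueError("model must be 'dominant' or 'recessive'")


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio (a*d)/(b*c) with a Wald confidence interval on the log scale.

    a, b: carriers and non-carriers in the first group
    c, d: carriers and non-carriers in the second group
    All four cells must be non-zero for this simple formula.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical genotype counts (n_GG, n_GA, n_AA); NOT the study's data
group_1 = (60, 35, 5)
group_2 = (45, 45, 10)

for model in ("dominant", "recessive"):
    a, b = collapse(group_1, model)
    c, d = collapse(group_2, model)
    print(model, odds_ratio_ci(a, b, c, d))
```

Which group is placed first determines whether a protective allele yields an odds ratio below or above 1, so the orientation should be stated alongside any reported value.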
Pearson's correlation coefficient (r) was used to measure the strength of association between MAF and mortality per million reported in various ethnicities. The Chi-square goodness-of-fit test was used to confirm the agreement of the observed genotype frequencies with those expected (Hardy-Weinberg equilibrium) for all the variants. Multivariate logistic regression was used to identify significant independent variables associated with disease severity. The confounding effect of the variables was explored using the Mantel-Haenszel test. The data were analysed using the Statistical Package for Social Sciences (SPSS Version 25). A two-tailed 'p' value ≤0·05 was considered statistically significant.
The mean age was 32·46±9·65 years for controls and 44·42±17·0 years for patients with COVID-19, with both groups comprising predominantly males of Indian ethnicity. The clinical characteristics of patients with COVID-19 are given in Table 1 and those of the control group in Supplementary Table S1. Baseline age (p=0·17), gender (p=0·41) and the proportions with diabetes (p=0·27), hypertension (p=0·77) and both (p=0·91) were similar for the patients with mild (n=83) vs moderate (n=36) severity; therefore, these groups were combined in the analysis.
Fig. 1.) . 24, 25 In addition, V160M is a residue overlap splice site variant and is predicted to be deleterious and damaging by SIFT and PolyPhen, respectively. MAF for rs12329760 in TMPRSS2 was 0·27 in controls, 0·31 in the asymptomatic, 0·21 in the mild-to-moderate and 0·19 in the severe COVID-19 patients (Fig. 2A). The risk of severity increased with decreasing MAF (Fig. 2B). There was a negative correlation between MAF and mortality rates (Pearson's correlation coefficient; r= -0·76; p=0·001; Fig. 2C). Genotype-based CT severity scores are given in Fig. 2D, 2E and 2F. MAF of the ACE2 variant
It is known that the variant rs12329760 in TMPRSS2 results in substitution of valine with methionine (V160M).
Structurally, V160 was found to be stable, with several polar residues creating hydrophobic pockets. Replacement with 160M produces a steric clash with the surrounding residues, and the site does not accommodate methionine owing to the topology and charge limit (Fig. 3). To understand the influence of V160M on the overall structural stability of TMPRSS2, we performed MDS studies. We observed that substitution with the longer methionine residue (V/M160) induces a significant increase in the stability factor of TMPRSS2, thereby decreasing the stability of the protein.
Genotyping of TMPRSS2 in normal lung tissues (n=10) revealed 6 tissues to be wild type and 4 to be heterozygous. A ~2·5-fold increase in mRNA (Fig. 4A) and protein levels (Fig. 4B) was observed in variant carriers. Consistent with this, ectopic expression of wild-type and variant TMPRSS2 also revealed a ~2·3-fold increase in HEK 293T cells (Fig. 4C). To address the importance of the Val160Met mutation in the cleavage of spike, HEK 293T cells were transfected with spike alone or along with wild-type or variant TMPRSS2 plasmids, and cellular lysates were analyzed by western blotting 24 h post transfection (Fig. 4D). Expression of spike protein alone in HEK 293T cells resulted in two bands: an unprocessed band at >124 kDa and a processed band at ~91 kDa corresponding to the S2 fragment of spike (Fig. 4D). Overexpression of TMPRSS2 resulted in marked disappearance of the S2 band, while a slightly lower migrating S2' band was evident at ~80 kDa (Fig. 4D)
23 We found a negative correlation (r= -0·76) between MAF and mortality in COVID-19. Despite the strong negative correlation, a few ethnicities with a higher MAF recorded higher mortality, as in South Africans, which could be due to risk variants identified in other loci. 6 Although a significant difference in the genotype frequencies was noted between controls and asymptomatic patients under the dominant model, there was no difference in the genotype frequencies under the recessive model.
This is probably because only SARS-CoV-2 positive patients were recruited, and it is highly likely that individuals with the homozygous mutant genotype are largely protected from infection and may test negative. This is further supported by the fact that the frequency of the homozygous mutant genotype (AA) decreases significantly in the patient group with increasing level of severity. The MAF of 0·31 in asymptomatic patients, 0·21 in mild-to-moderate patients and 0·19 in severe COVID-19 patients that we determined further supports the significant association of the variant with milder disease. Although there was a significantly higher mean age and BMI in the symptomatic (mild-to-moderate and severe) groups as compared to the asymptomatic group (Table 1)
In conclusion, our study demonstrates a significant association of the variant rs12329760 in TMPRSS2 with decreased disease severity in COVID-19 patients.
We declare no competing interests.
Apparent difference in fatalities between Central Europe and East Asia due to SARS-COV-2 and COVID-19: Four hypotheses for possible explanation
Genome-wide analysis of SARS-CoV-2 virus strains circulating worldwide implicates heterogeneity
Could the D614G substitution in the SARS-CoV-2 spike (S) protein be associated with higher COVID-19 mortality?
Risk factors for death in 1859 subjects with COVID-19
Deciphering the Role of Host Genetics in Susceptibility to Severe COVID-19
Genomewide Association Study of Severe Covid-19 with Respiratory Failure
A pneumonia outbreak associated with a new coronavirus of probable bat origin
SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor
ACE2 and TMPRSS2 variants and expression as candidates to sex and country differences in COVID-19 severity in Italy
New insights into genetic susceptibility of COVID-19: an ACE2 and TMPRSS2 polymorphism analysis
Population-level COVID-19 mortality risk for non-elderly individuals overall and for non-elderly individuals without underlying diseases in pandemic epicenters
Factors associated with COVID-19-related death using OpenSAFELY
India State-Level Disease Burden Initiative Diabetes Collaborators. The increasing burden of diabetes and variations among the states of India: the Global Burden of Disease Study
High prevalence of metabolic syndrome among urban subjects in India: a multisite study
National Health Commission of the People's Republic of China. The Notification of Printing and Distributing New Coronavirus Pneumonia Management (Trial Version 6)
Estimates of the severity of coronavirus disease 2019: a model-based analysis
PrimerZ: streamlined primer design for promoters, exons and human SNPs
0: predicting stability changes upon mutation from the protein sequence or structure
PhyreRisk: A Dynamic Web Application to Bridge Genomics, Proteomics and 3D Structural Data to Guide Interpretation of Human Genetic Variants
A new mathematical model for relative quantification in real-time RT-PCR
ERK1/2 activated PHLPP1 induces skeletal muscle ER stress through the inhibition of a novel substrate AMPK
Ensembl Genome Browser accessed at www.ensembl.org [Accessed 15th
Association of TMPRSS2-ERG gene fusion with clinical characteristics and outcomes: results from a population-based study of prostate cancer
ESEfinder: A web resource to identify exonic splicing enhancers
COVID-19: a novel zoonotic disease caused by a coronavirus from China: what we know and what we don't
Assessment of risk conferred by coding and regulatory variations of TMPRSS2 and CD26 in susceptibility to SARS-CoV-2 infection in human
Racial/Ethnic Variation in Nasal Gene Expression of Transmembrane Serine Protease 2 (TMPRSS2)
The authors acknowledge the funding received from Asian Healthcare Foundation. We thank the Monash University Software Platform for license and access to the concerned software.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.