key: cord-0704958-1kvtkq3c authors: Calcagnile, Matteo; Forgez, Patricia; Alifano, Marco; Alifano, Pietro title: The lethal triad: SARS-CoV-2 Spike, ACE2 and TMPRSS2. Mutations in host and pathogen may affect the course of pandemic date: 2021-01-14 journal: bioRxiv DOI: 10.1101/2021.01.12.426365 sha: 72fa322989c356ccc7b4f19b596385be46375a82 doc_id: 704958 cord_uid: 1kvtkq3c Variants of SARS-CoV-2 have been identified rapidly after the beginning of pandemic. One of them, involving the spike protein and called D614G, represents a substantial percentage of currently isolated strains. While research on this variant was ongoing worldwide, on December 20th 2020 the European Centre for Disease Prevention and Control reported a Threat Assessment Brief describing the emergence of a new variant of SARS-CoV-2, named B.1.1.7, harboring multiple mutations mostly affecting the Spike protein. This viral variant has been recently associated with a rapid increase in COVID-19 cases in South East England, with alarming implications for future virus transmission rates. Specifically, of the nine amino acid replacements that characterize the Spike in the emerging variant, four are found in the region between the Fusion Peptide and the RBD domain (namely the already known D614G, together with A570D, P681H, T716I), and one, N501Y, is found in the Spike Receptor Binding Domain – Receptor Binding Motif (RBD-RBM). In this study, by using in silico biology, we provide evidence that these amino acid replacements have dramatic effects on the interactions between SARS-CoV-2 Spike and the host ACE2 receptor or TMPRSS2, the protease that induces the fusogenic activity of Spike. Mostly, we show that these effects are strongly dependent on ACE2 and TMPRSS2 polymorphism, suggesting that dynamics of pandemics are strongly influenced not only by virus variation but also by host genetic background. Viruses, like all other species, obey evolutionary and biodiversity rules. According to these rules, 51 surviving viruses adapt to their own benefit. To prevent these adaptations, there are two human 52 interventions possible, either eradicate the virus, or attempt to understand the relationship between 53 the host and the virus, to mitigate the effects of the virus. 54 The severe acute respiratory syndrome corona virus -2 (SARS-CoV-2) spread incredibly 55 quickly between people, due to its newness and transmission route, but the kinetics of diffusion and 56 mortality remains variable from one country to another. A multitude of factors may concur to explain 57 the ethnic and geographical differences in pandemic progression and severity. Considerable 58 individual differences in susceptibility to onset of disease caused by SARS-CoV-2 may be involved, 59 including mainly sex, age and underlying conditions [1] . However, genomic predisposition, a major 60 concept in modern medicine, prompts to understand molecular bases of heterogeneity in diffusion 61 and severity of disease, to find prevention and treatment strategies. If host genomic predisposition 62 has been advocated since the beginning of pandemic, virus' genomic predisposition (i.e. the 63 occurrence of mutations) to spread less or more easily and eventually to cause more or less severe 64 disease has been initially unkempt, because of capacity of Coronaviruses to proofread, thus removing 65 mismatched nucleotides during genome replication and transcription. However, in the last few 66 months it begun evident that some specific mutants (see below) are progressively superseding the 67 virus that was identified as wild type, rising enormous questions in terms of comprehension of 68 mechanisms of pathogen-host interaction. 69 SARS-CoV-2 uses envelope Spike projections as a key to enter human airway cells [2] through a 70 specific receptor. Spike glycoproteins form homotrimers protruding from the viral surface, and 71 comprise two major functional domains: an N-terminal domain (S1) for binding to the host cell 72 receptor, and a C-terminal domain (S2) that is responsible for fusion of the viral and cellular 73 membranes [3] . Following the interaction with the host receptor, internalization of viral particles is 74 accomplished thanks to the activation of fusogenic activity of Spike, as a consequence of major 75 analyzed. The data show that most common variants of the SARS-CoV-2 Spike are located in a 163 protein region that spans the amino acids 541 and 788, while in SARS-CoV Spike the corresponding 164 region between amino acids 14 and 305 was primarily affected by amino acid variation, followed by 165 the region 541-788 (Fig. 1C) . The region 541-788 links the RBD and the fusion peptide and contains 166 the S1/S2 cleavage site for TMPRSS2, the trans-membrane protease that has been shown to carry out 167 the priming of the SARS-CoV-2 Spike by sequential cleavage at S1/S2 and S2' (Fig. 1C) [10, 28] . 168 The Dataset S1 was used to determine the geographical distribution and temporal spread of the 169 SARS-CoV-2 Spike D614G, the most diffused variant. The data show distinct patterns in the different 170 geographical regions. Specifically, in Eastern Asia the D614G variant was described in January 2020 171 at low frequency (3.8%) and then it was subject to a decremental trend over time (Fig. S1) the wild type (D614) SARS-CoV-2 Spike this region forms an N-terminal a-helix followed by a C-196 terminal b-sheet ( Fig. 2A) . D614G replacement was predicted to drastically change the peptide 197 secondary structure by replacing the C-terminal b-sheet with a a-helix ( Fig. 2A) . The effects of the 198 D614G substitution on SARS-CoV Spike structure were then analyzed. Unlike the SARS-CoV-2 199 Spike, the corresponding amino acid region (amino acids 587-613) in the wild type SARS-CoV Spike 200 is arranged in two anti-parallel a-helices (Fig. 2B) , similar to those found in D614G SARS-CoV-2 201 Spike variant, and the D614G substitution does not seem to change this structure (Fig. 2B) . 202 The analysis was then extended to other D614 variants. Results show that in SARS-CoV Spike the 203 anti-parallel a-helices are very stable, and do not appear to be affected by any substitution studied 204 (D614E, D614P, D614A) (Fig. 2B ). In contrast, in SARS-CoV-2 Spike D614E and D614P 205 substitution does not change the N-terminal b-sheet / C-terminal a-helix structure of the wild type 206 protein, while in the D614A variant the C-terminal b-sheet is lost ( Fig. 2A) . . 5A) , which is associated with an increased affinity with WT spike [23], suggesting that the 299 effect of N501Y substitution is dependent on ACE2 genetic background. In fact, the analysis of 300 molecular docking complexes by Chimera revealed that the number and the arrangement of contacts 301 and H-bonds were different in the different complexes (Table 3) , with a higher contact density in 302 N501Y Spike RBD / wild type ACE2 complex (Fig. 5B) , as compared to the other complexes. Most 303 of contacts involve the tyrosine 501, as visualized by Chimera (Fig. 5E ). Many contacts were lost in 304 the complex between N501Y Spike RBD and K26R ACE2 variant (Fig. 5D) . 305 306 In principle, any new infectious agent that challenges a totally susceptible population with little or no 335 immunity against it is able to totally infect the population causing pandemics. Pandemics rapidly 336 spread affecting a large part of people causing plenty of deaths with significant social disruption and 337 economic loss. However, if we look at the even worst pandemics in the human history we can realize 338 that ethnic and geographical differences in the susceptibility to disease actually exist, in spite of 339 transmission routes that are the same for all individuals [34] . Infectious sources are susceptible to 340 evolution, and selective pressure by host characteristics and measure to control the pandemic may 341 lead to emergence of more aggressive or indolent strains. women to a less severe and often asymptomatic form of disease. Furthermore, the ACE2 gene is 360 located on Xp22, in an area where genes are reported to escape from X-inactivation, further 361 explaining higher expression in females [35, 36] . 362 ACE2 plays an essential role in the renin-angiotensin-aldosterone system, and its loss of function predict that hamsters could be infected, which was experimentally confirmed -underlining the 381 reliability of in silico modeling-and could be subsequently at the origin of inter-animal transmission. 382 Of note, in the same study, Chan and colleagues [40] showed that the binding energy between ACE2 383 and Spike of SARS-CoV, responsible for the 2002 epidemic, was -39.49 REU as compared to -58.18 384 of human ACE2. It has been suggested that polymorphisms in the ACE2 gene could reduce or 385 enhance the wild type Spike affinity with ACE2: in silico models have predicted that two non-386 infrequent polymorphisms -the S19P (0.3% of African populations) and K26R (0.5% of Europeans)-387 are associated with decreased or increased affinity and their distribution among patients with COVID-388 In the present study, we confirmed our previous theoretical hypothesis that N501Y mutation of 390 Spike would be associated with an increased ACE2 affinity, and molecular docking clearly shows 391 that interaction with ACE2 is even better than that of the wild type Spike protein, per se already 392 remarkable. To date, almost all the cases with this mutation have been described in England and should consider that biological effects are probably much more important than mere physical 407 variation in global energy score. Of note when measuring the affinity of N501Y Spike for K26R 408 ACE2 variant (which has a significantly higher affinity for wild type Spike as compared to wild type 409 ACE2), we observed a decrease in affinity (from -54.79 Kcal/mol to -52.57), suggesting that the 410 effect of N501Y substitution is dependent on ACE2 genetic background. We provided theoretical 411 evidence through the analysis of molecular docking complexes by Chimera that the number and the 412 arrangement of contacts and H-bonds were different in the different complexes, with a higher contact 413 density in N501Y Spike RBD / wild type ACE2 explaining enhanced affinity. 414 In our theoretical approach, the effect of the other leading mutation of Spike, namely D614G, may 415 occur through interaction with TMPRSS2. This mutation involves a region far from the RBD and in 416 vitro studies showed that neutralization by monoclonal antibodies targeting the RBD has equal effects 417 when either D614 or D614G are challenged. Our models predict that D614G replacement drastically 418 change the peptide secondary structure by replacing the C-terminal b-sheet with a a-helix in the 419 region close to the mutation, and our CABSflex simulations show an increase in flexibility of the 420 region of interest. Thus, changes in whole protein conformation could results in variations in affinity 421 between ACE2 and D614G Spike. In the recent work by Yurkovetskiy et al. [42] , who also showed 422 D614G changes in the conformation of the S1 domain in the SARS-CoV-2 S by a cryo-electron 423 microscopy approach, the association rate between D614G and ACE2 was slower than that between 424 D614 and ACE2, and the dissociation rate of D614G was faster, resulting in a lower affinity. As the 425 mutation occurs in the region of the Spike interacting with TMPRSS2, in our approach, we focused starting with the wild type Spike and TMPRSS2 proteins. These models were re-docked by using 493 Gramm-X [32]. Using Chimera [49] , the obtained complexes were analyzed to select those that show 494 an interaction between TMPRSS2 and the Spike domain of cleavage sites. The final complexes were 495 analyzed by using FindHbond and FindClashes/Contacts tools of Chimera in order to identify 496 contacts, pseudobonds and H-bonds. FindClashes/Contacts is a tool that allows the identification of 497 interatomic contacts using van der Waals (VDW) radii. This method was used to localize all direct 498 (polar and nonpolar) interactions between two atoms, including both unfavorable (clashes) and 499 favorable interactions. 500 In order to compare the Global Energy Scores (GES) in the interaction between wild type or 501 N501Y Spike RBD with wild type or K26R ACE2 the HDOCK server was used [54] . This tool was 502 based on a hybrid method: template-base modeling and ab initio docking. Template based modeling 503 was choose as the best strategy to compute the affinity between RBD and ACE2 because many ACE2-504 RBD crystallographic models were reported in public database. On the other hand, Gramm-x [32], a 505 tool based on rigid-body docking (Lennard-Jones modified function), was used to identify novel 506 binding sites. Using the HDOCK server [54], receptors and ligands were submitted as primary amino 507 acid sequences, including wild type and variant sequences (K26R ACE2 and N501Y RBD). Docking 508 complexes obtained with the template-base modeling were selected, and the affinity energy was 509 calculated using FireDock [50, 51] . 510 The workflow used to perform docking simulations was reported in Fig. S5 Dataset S1. The Dataset reports the missense mutations identified in the SARS-CoV-2 Spike proteins described by Rahman et al. [24] . Table 1 . Counts and frequencies of identified missense mutations. Table 2 . Temporal distribution of missense mutations. Table 3 . Geographical distribution of missense mutations. Ref sequences of SARS-CoV-2 was used as query. Genomic modulators of the immune response Genomic characterization of the 2019 534 novel human-pathogenic coronavirus isolated from a patient with atypical pneumonia after 535 visiting Wuhan Mechanisms of coronavirus cell entry 539 mediated by the viral spike protein Structure, Function, and 542 Antigenicity of the SARS-CoV-2 Spike Glycoprotein Cleavage and activation of 546 the severe acute respiratory syndrome coronavirus spike protein by human airway trypsin-like 547 protease Frequent and focal FGFR1 amplification 598 associates with therapeutically tractable FGFR1 dependency in squamous cell lung cancer The combination of ACE I/D 606 and ACE2 G8790A polymorphisms revels susceptibility to hypertension: A genetic 607 association study in Brazilian patients Enzyme 2 gene polymorphism and enzymatic activity with essential hypertension in different 612 gender: A case-control study European Centre for Disease Prevention and Control. Rapid increase of a SARS-CoV-2 620 variant with multiple spike protein mutations observed in the United Kingdom Comprehensive 629 annotations of the mutational spectra of SARS-CoV-2 spike protein: a fast and accurate 630 pipeline. Transboundary and emerging diseases Chinese SARS Molecular Epidemiology Consortium. Molecular evolution of the SARS 640 coronavirus during the course of the SARS epidemic in China A unique protease cleavage site predicted in 644 the spike protein of the novel pneumonia coronavirus (2019-nCoV) potentially related to viral 645 transmissibility PEP-FOLD: an online resource for de novo peptide 648 structure prediction Improved PEP-FOLD approach for peptide 652 and miniprotein structure prediction Tovchigrechko A, Vakser IA. GRAMM-X public web server for protein-protein docking. 660 Nucleic acids research TMPRSS2 and ADAM17 interactions with ACE2 complexed with SARS-CoV-663 2 and B0AT1 putatively in intestine, cardiomyocytes, and kidney. BioRxiv Pandemics: waves of disease, waves of hate from the Plague of Athens to A.I X-inactivation profile reveals extensive variability in X-linked gene 670 expression in females Angiotensin converting 680 enzyme 2: SARS-CoV-2 receptor and regulator of the renin-angiotensin system A crucial role of angiotensin converting enzyme 684 2 (ACE2) in SARS coronavirus-induced lung injury Simulation of the Clinical and 688 Pathological Manifestations of Coronavirus Disease Hamster Model: Implications for Disease Pathogenesis and Transmissibility Adaptation of SARS-CoV-2 in BALB/c mice for 693 testing vaccine efficacy Structural and 697 Functional Analysis of the D614G SARS-CoV-2 Spike Protein Variant dk/services/tmhmm. 705 706 45. Madden, T. The BLAST sequence analysis tool Clustal Omega, accurate alignment of very large numbers of 710 sequences PAST-palaeontological statistics, ver. 1.89. Palaeontol. 714 Electron UCSF Chimera-a 717 visualization system for exploratory research and analysis FireDock: a 724 web server for fast interaction refinement in molecular docking FireDock: fast interaction refinement in Molecular 728 docking SSIPe: accurately estimating protein-protein binding 732 affinity change upon mutations using evolutionary profiles in combination with an optimized 733 physical energy function EvoEF2: accurate and fast energy function for computational 737 protein design The HDOCK server for integrated protein-protein docking. 740 Nature protocols B) Relative frequency of G8V and V197M TMPRSS2 variants in 7 human populations C) Molecular docking simulations of SARS-CoV-2 Spike /TMPRSS2 complexes. Wild type A570D, D614G, P681H, or T716I SARS-CoV-2 Spike was used as a receptor, and wild type, G8V or V197M TMPRSS2 as a ligand. Docking simulations were carried out by Gramm-X server, and FireDock was used to calculate the GES (Global Energy Scores) as detailed in the The authors declare no conflict of interest.