key: cord-0913782-dvtvy7yx authors: Malik, Ashish; Prahlad, Dwarakanath; Kulkarni, Naveen; Kayal, Abhijit title: Interfacial Water Molecules Make RBD of SPIKE Protein and Human ACE2 to Stick Together date: 2020-06-15 journal: bioRxiv DOI: 10.1101/2020.06.15.152892 sha: 36a251fff398d1c2cd705ad54a119cd13e89b820 doc_id: 913782 cord_uid: dvtvy7yx A novel coronavirus (SARS-CoV-2; COVID-19) that initially originates from Wuhan province in China has emerged as a global pandemic, an outbreak that started at the end of 2019 which claims 431,192 (Date: 15th June 2020 (https://covid19.who.in) life till now. Since then scientists all over the world are engaged in developing new vaccines, antibodies, or drug molecules to combat this new threat. Here in this work, we performed an in-silico analysis on the protein-protein interactions between the receptor-binding (RBD) domain of viral SPIKE protein and human angiotensin-converting enzyme 2 (hACE2) receptor to highlight the key alteration that happened from SARS-CoV to SARS-CoV-2. We analyzed and compared the molecular differences between these two viruses by using various computational approaches such as binding affinity calculations, computational alanine, and molecular dynamics simulations. The binding affinity calculations show SARS-CoV-2 binds little more firmly to the hACE2 receptor than that of SARS-CoV. Analysis of simulation trajectories reveals that enhanced hydrophobic contacts or the van der Waals interaction play a major role in stabilizing the protein-protein interface. The major finding obtained from molecular dynamics simulations is that the RBD-ACE2 interface is populated with water molecules and interacts strongly with both RBD and ACE2 interfacial residues during the simulation periods. We also emphasize that the interfacial water molecules play a critical role in binding and maintaining the stability of the RBD/hACE2 complex. The water-mediated hydrogen bond by the bridge water molecules is crucial for stabilizing the RBD and ACE2 domains. The structural and dynamical features presented here may serve as a guide for developing new drug molecules, vaccines, or antibodies to combat the COVID-19 pandemic. 3 The novel coronavirus (severe acute respiratory syndrome coronavirus 2, SARS-CoV-2) has emerged as a global pandemic by taking more than 431,192 (Date: 15 th June 2020 (https://covid19.who.in) life all over the world. The rapid spread and high mortality rate (3 to 5%) 1 CoV and SARS-CoV-2, use the homotrimer spike glycoprotein (comprising of S1 and S2 subunit) or SPIKE protein to bind the cellular receptors and induce the dissociation of S1 and S2 subunit. This also triggers a cascade of events such as the transition of S2 from a metastable prefusion state to a more stable post-fusion state, the fusion between a cell and viral membrane, etc [6] [7] [8] . S1 subunit contains the receptor-binding domain (RBD) that binds to the peptidase domain (PD) of human angiotensin-converting enzyme 2 (ACE2) 9 . The recognition of SPIKE protein with the ACE2 enzyme is the first step of the virus lifecycle and entry to the host cell. These are the reasons the CoV SPIKE protein is a key target for developing new vaccines, therapeutic antibodies, and new drug molecules. Recent in vitro studies also indicate that the RBD domain of the S1 subunit plays a key functional role for the binding of SARS-CoV by ACE2 10 . The crystal as well as cryo-EM structures of the complexes (SPIKE/hACE2) reveal key information about the interactions between the RBDs and the hACE2 receptor 6, 9, 11 . Superposition of RBD domains of SARS-CoV and SARS-CoV-2 shows very similar structural fold with a root mean square deviation (RMSD) of 0.68 Å of 139 pairs of Cα atoms 12 . Knowing that binding to the ACE2 receptor triggers the virial lifecycle, it is also important to understand how the binding affinity of SARS-CoV-2 differs from the SARS-CoV. Experimental results reveal that the binding affinity values for SARS-CoV and SARS-CoV-2 with hACE2 fall in a similar range 12 . Specifically, the equilibrium dissociation constant (KD) shows a slightly higher value (15.2 nm) of SARS-CoV-2 than that of SARS-CoV (15 nM) 13 . However, a recent publication shows a 20-4 fold increased binding affinity between trimeric SPIKE protein with ACE2 receptor of SARS-CoV-2 than that of SARS-CoV 6 . Nevertheless, it is now well established that SARS-CoV-2 has a slightly better binding affinity towards hACE2 than the SARS-CoV. The high binding affinity suggests that this novel coronavirus is evolved towards being a better binder to the same human ACE2 receptor. Thus, a comparison of both the RBD domains of SARS-CoV and SARS-CoV-2 is of utmost importance and it migh t give us a clue why SARS-CoV-2 is so deadlier than SARS-CoV. Similar to the experiments, several computational studies are performed on the binding affinity of the RBDs of SARS-CoV and SARS-CoV-2 with hACE2 14, 15 . Most of the earlier results are based on the homology modeled structure of the SARS-CoV-2, where the global fold of the SPIKE protein was correct but the atomic level scoring function for predicting the binding affinity was not accurate. Thus the earlier results predicted lower binding affinity of SARS-CoV-2 14, 15 . This also implies that the side-chain packing of the modeled structure was not accurate enough. Recent binding free energy calculations performed on the crystal structure of RBD/hACE2 of SARS-CoV-2 predicts high binding affinity than SARS-CoV 16 . Another study uses Rosetta Interface Analyzer 17 and concludes that both these viruses have a similar binding affinity 18 . Thus, from in-silico perspective, it is still not fully understood why SARS-CoV-2 is better binder than SARS-CoV-2 and the results are inconclusive. Hence, atomistic-level comparison is needed between these two viruses. A recent study found that the hydrogen bonding network and the hydrophobic interactions are major dominating forces that are responsible for enhanced binding in SARS-CoV-2 19 . Till now most of the studies focus on the search for new vaccines and drug molecules against COVID-19. There is very little attention paid to the atomistic label description and dynamics of the interfacial domain of SPIKE protein with the hACE2 receptor. The dynamic interactions between the RBD and the hACE2 may reveal some crucial points that are not accessible from only the crystal structures. In this work, we first assess the atomistic level description of RBDs of SPIKE protein with hACE2 receptor from the crystal structures. We identify the key contact residues that are responsible for the binding of the RBD domain and hACE2. We also critically analyze the differences in the interfacial residues between the SARS-CoV and SARS-CoV-2. Further the binding affinity of these two domains of SARS-CoV and SARS-CoV-2. Next, all-atom molecular dynamics simulations of RBD domains of SPIKE protein complexed with the human ACE2 5 receptor of SARS-CoV and SARS-CoV-2 are performed to extract the dynamics of these complexes. We focus our study at the interfacial region of RBDs and hACE2 and provide an atomistic picture of the interactions. The importance of contact residues is also highlighted and a thorough comparison was made between the SARS-CoV and SARS-CoV-2. Till now, most of the in-silico work on SPIKE-ACE2 complex is done to design new antibodies or drug molecules, whereas little attention is paid to the atomistic level description and the dynamics of the SPIKE-ACE2 complex. Hence, our study focuses on the interfacial region of RBD and ACE2 domains. Our simulation results reveal that the interfacial water molecules play an important role regarding the binding of RBDs and ACE2 domain, particularly, the bridge water connecting the two domains of SPIKE protein and the hACE2. The crystal structure of SARS-CoV-2 -RBD/ACE2 (PDB ID: 6VW1) and SARS-CoV-RBD/ACE (PDB ID: 2AJF) are taken from the protein database and subjected to MD simulations. The two protein-protein complexes contain some notable structural elements for which special care was taken while preparing the system, like N-glycosylation of various asparagine residues, disulfide bridges between various pairs of cysteine, presence of zinc finger and other amino acids which require proper protonation states. The systems are prepared using the CHARMM-GUI web 6 server 20 and parameter files and equilibration files are taken from the server. All the simulations were performed using the software GROMACS version 2019.4 21, 22 using the CHARMM-36 forcefield 23 . Each system is solvated in TIP3P 24 water and KCl salt were added to neutralize the systems to a salt concentration of 150 mM. Each system is solvated in a rectangular box of dimensions L x =138 Å, L y =138 Å, and L z =138 Å, the distance is appropriate to avoid all finite-size effects due to long-range electrostatic interactions among neighboring simulation boxes. Simulation for both the complexes is carried out at 310K and 1 atm pressure. To treat long-range electrostatic interactions, Particle Mesh Ewald (PME) 25 method is used with a cut-off of 12 Å, and to constrain hydrogen bond, the LINCS algorithm 26 was applied. Each system is minimized with 5000 steps of the steepest descent algorithm to eliminate the bad contacts in the system. After minimization equilibration simulation is performed for in NVT ensemble at 310 K. After NVT simulation, NPT equilibration is performed for 5 ns at 1 atm pressure, and the temperature is maintained at 310 K. During the equilibration period, the position restrained is applied to the heavy atoms of the protein. Velocity rescale thermostat and Parinnello-Rahman Barostat 27 are applied to maintain the temperature and the pressure respectively. For the protein-protein binding affinity calculations, FoldX software 28 is used. Different visualizing and trajectory analysis software packages UCSF Chimera 29 and VMD (Visual Molecular Dynamics) 30 were used and QZyme Workbench™ 31 an in-house enzyme engineering platform was used for all calculations performed and discussed in results. Earlier published results highlighted the differences between the RBD domains of two SPIKE proteins. The SPIKE protein of SARS-CoV and SARS-CoV-2 share 75% sequence similarity and their structural fold are almost identical. Whereas the RBD domains have a sequence similarity of 73.7% 4 . Figure 1 shows the superposition of SARS-CoV and SARS-CoV-2 and in the inset, we highlight the key residues that have changed from SARS-CoV to SARS-CoV-2. In the crystal structure of SARS-CoV and SARS-CoV-2, there are significant protein-protein contacts that can be seen and we identified some of the key residues that are important for the 7 stabilization of SPIKE protein and hACE2 receptor. In Figure S1 of Supporting Information, we displayed the residues that are in the interfacial region of the RBD domain bound to the ACE2 domain. With a distance cut-off 4 Å there are 16 residues of RBD that are in contact with 20 residues of ACE2 domain. A similar analysis of SARS-CoV-2 shows 18 residues of RBD that are in contact with 21 residues of the ACE2 receptor. Among the 21 ACE2 residues that are interacting with two RBDs, 17 are found to be common residues. Most of the interacting residues of the ACE2 receptor are located near the N-terminal helix. The above analysis manifests that the surface area of the interfacial domain of ACE2 and RBDS is almost the same. This also corroborates earlier findings where the surface area is 1,687 Å 2 and 1,699 Å 2 for SARS-CoV-2 and SARS-CoV 32 . Within this 4 Å cut-off there are 4 hydrophobic residues (Leu455, Phe456, Ala475, and Phe486) and 14 hydrophilic (i.e. not hydrophobic) residues (Arg408, Thr415, Tyr449, Tyr453, Gly476, Asn487, Tyr489, Gln493, Gly496, Gln498, Thr500, Asn501, Gly502, and Tyr505) are there near the interface of SPIKE protein of SARS-CoV-2. Whereas there are 3 hydrophobic residues (Leu443, Leu472, and Ile489) and 13 hydrophilic residues (Arg426, Tyr436, Tyr440, Tyr442, Asn473, Tyr475, Asn479, Gly482, Tyr484, Thr486, Thr487, Gly488, Tyr491) are present near the interface of SPIKE protein of SARS-CoV. It is interesting to note that the interface of RBD domain is enriched with tyrosine and glycine residues for SARS-CoV-2 (4 tyrosine and 3 glycine out of 14 residues) as well as SARS-CoV (6 tyrosine and 2 glycine out of 16 residues). Further, we counted the residues with a cut-off value of 8 Å to get an estimation of the buried residues of both the RBD domains. We found that 53 and 51 residues are coming in contact with the ACE2 domain for SARS-CoV and SARS-CoV-2 respectively (Table S1, Supporting Information). To see the effect of the interfacial residues on the overall binding affinity with hACE2, we performed computational alanine scanning on these interfacial residues to see the extent of the impact on the binding affinity due to alanine substitution. Further, the crystal structure of SARS-CoV-2 has 13 hydrogen bonds between the RBD domain and the ACE2 domain. However, SARS-CoV has a lesser number of hydrogen bonds between the RBD and the ACE2 domain for SARS-CoV. It has only 5 hydrogen bonds between the two domains. As there are fewer hydrogen bonds in SARS-CoV, we expect that the RBD/ACE2 complex of SARS-CoV-2 is more stable than SARS-CoV. Further, we calculate the binding free energy of the SPIKE/hACE2 complex from the PRODIGY web server 33 as well as from FoldX software 28 . The binding free energy is calculated on the crystal structures of the dimer and the 8 values are -10.8 and -11.7 kcal/mol for SARS-CoV and SARS-CoV-2 respectively. While the FoldX calculations show binding energy -4.49 and -9.87 kcal/mol for SARS-CoV and SARS-CoV-2 respectively. Though the values obtained from PRODIGY and FoldX are not in the same range, the trend suggests that SARS-CoV-2 binds more effectively to hACE2 and evolved as a better binder to the human receptor. As we discussed in the previous section about the interfacial residues of the RBD domain is critical for stabilizing the protein-protein complex, we performed computational alanine scanning on the contact residues of the RBD domain to highlight the importance of those residues. We mutated the interfacial residues to alanine and then calculate the binding free energy with the ACE2 domain. Alanine scanning is performed to identify the residues that make a significant contribution to the binding affinity with hACE2. The result reveals that (Figure 3) most of the residues lower the binding affinity when mutated with alanine except N501 where a slight increase in binding affinity is observed for SARS-CoV-2. A decrease in binding affinity suggests that both SARS-CoV-2 and SARS-CoV have optimized its RBD domain to bind with the human ACE2 receptor. From the alanine scanning results, we observe that Y449, L455, F486, Y498, G502, and V503 are found to be critical for stabilizing the overall complex. Compared to SARS-CoV-2, the alanine mutation doesn't lower the binding affinity for all the residues of SARS-CoV. This result points to the fact that SARS-CoV-2 optimized its interfacial residues in such a manner that binding to hACE2 is more enhanced. In this context, we would like to discuss a few changes that happened between the SARS-CoV to SARS-CoV-2. Few mutations like Tyr484→Gln498, Thr487→Asn501, Tyr442→Leu455, Leu443→Phe456, Leu472→Phe486 Asn479→Gln493 show significant changes in binding affinity when the particular position is mutated with alanine. It is to be noted that Gln474 and Gly485 inserted near the interfacial domain of RBD of SARS-CoV-2. Though these two residues are not directly interacting with the ACE2, they form intramolecular hydrogen bonds two give the stability of the loop. Like Gln474 interacts with Gly476 and Gly484 interacts with Cys488. The alanine mutation of these two residues does not change the binding affinity significantly. Next section we discuss the results obtained from molecular dynamics simulations. During 100 ns molecular dynamics simulations, both the adducts (SPIKE/hACE2) of SARS-CoV and SARS-CoV-2 are found to be stable. In Figure S2 of From the molecular dynamics simulation trajectories, we extracted different components of interaction energy i.e. electrostatics and van-der-Waals (vdW) interaction terms between the RBD of SPIKE protein and the hACE2 receptor. In Figure 4 , the probability of different RBD and the hACE2 has significance on the overall binding affinity of the protein-protein interaction. 12 A key observation that was made from the simulation trajectories is that the minimum distance between the Cα atoms of RBD and the ACE2 domain varies significantly for SARS-CoV and SARS-CoV2 (Figure 5 (a) ). The minimum Cα distance gives an idea of the relative separation between the two domains. It is observed that the average Cα distance for SARS-CoV It is well known that water-mediated interactions are one of the major driving forces of protein folding and protein-drug recognition processes [34] [35] [36] [37] . In the previous section we observed the there is almost a 4-5 Å gap in between the SPIK/hACE2 protein-protein complex. There is a possibility that this interfacial domain or the gap region is hydrated with water molecules and those interfacial water molecules might play a significant role in stabilizing SPIKE protein and the hACE2 receptor. Indeed, from our simulations, we observed that water molecules gradually populate the interfacial domain of SPIKE protein and hACE2 receptors. During the 100 ns simulation period, we observed that the interfacial domain forms a rich hydrogen-bonded network and stabilize the overall protein-protein complex. From our simulations, we analyze the water molecules that come within the 3.5 Å near to the SPIKE protein and hACE2 domain. The fluctuation in the number of water molecules in the interfacial region of the ACE2 domain shows high water content for SARS-CoV than SARS-CoV-2 (Figure 7(a) ). The average water molecules near the ACE2 domain are 214.2±8.68 and 242.7±10.04 for SARS-CoV-2 and SARS-CoV respectively Though the ACE2 interface is almost the same for SARS-CoV and SARS-CoV-2, there is a higher number of water molecules near the ACE2 interfaces. The probability plot ( Figure 7 (b) ) shows a clear difference in the number of water molecules between SARS-CoV and SARS-CoV-2. A similar result can be observed for the RBD domain of SPIKE protein. A greater number of water molecules can be seen near the SARS-CoV SPIKE domain than that of the SARS-CoV-2 SPIKE domain (Figure 8(a) and (b) ). The high occupancy near the interfacial domain of SARS-CoV can be attributed to the fluctuation of the RBD and the hACE2 domain and larger gap in between these two domains. Finally, we calculated the bridge-water molecules that are hydrogen-bonded with both ACE2 and RBD domains at the same time. We calculate the number of water molecules that are simultaneously forming hydrogen bonds with RBD and the ACE2 domain. In Figure 9 (a) we display the number of bridge water molecules during 100 ns molecular dynamics simulations and in Figure 9 (b) we show one bridge water molecule that is hydrogen-bonded to RBD (Gln493) as well as ACE2(Glu35) domain at the same time. From the simulation trajectories, several multiwater bridge water molecules can be observed in RBD and hACE2 domains. The bridge water molecules It can be seen that there is a slightly higher number of bridge water molecules (~2) for SARS-CoV-2 than that of SARS-CoV. These bridging water molecules play a significant role in stabilizing the two domains of the complex. We expect that these bridge water molecules play a significant role in stabilizing SPIKE/ACE2 domain. Here we want to highlight one key point that in the crystal structure (SARS-CoV-2) the Gln493 is engaged in a hydrogen bond with the Glu35, whereas from MD simulation we observed that water molecules enter into the interfacial region and connected these two residues. To have an idea about how the RBD and ACE2 domains are interacting with water molecules, we calculate the water-mediated hydrogen bonds with RBD and hACE2 domains. In the case of hACE2, the interfacial residues are making almost the same number of hydrogen bonds with the water molecules for SARS-CoV and SARS-CoV-2. This is expected as the ACE2 receptor is the same in both the viruses and small changes can be attributed to the fluctuation of the hACE2 domain. In Figure 10 Finally, we calculated the interaction energy (electrostatic+ van der Waals) between the interfacial RBD and ACE2 domain with water molecules (Figure 11(a),(b) ). We observe that the interaction (a) (b) Figure 11 Total interaction energy (coulombic + van der Waals) between the water molecules and the hACE2 and the RBD domains. energy between the water molecules and the hACE2 domain of both the viruses has almost the same that means the surface topology of hACE2 are behaves similarly during the MD simulations for both these viruses. The difference can be seen between the RBD of SARS-CoV and SARS-CoV-2. Interaction with water molecules is more preferred with SARS-CoV than SARS-CoV-2 as the water molecules make a greater number of hydrogen bonds with the RBD domain. Enhanced water interaction with the RBD of SARS-CoV reduces the direct interaction with the hACE2 receptor. In summary, we elucidate the similarities and differences between the SARS-CoV and SARS-CoV-2 by using molecular dynamics simulation approaches. From the structural point of view, the RBD of SPIKE protein shares a significant similarity in terms of the three-dimensional structure and the fold. However, significant differences are observed in how these two viruses bind to the hACE2 receptor. We found that improvement in hydrophobic contacts that leads to enhanced van der Waals interactions between the RBD and the hACE2 receptor affects the high binding affinity for the SARS-CoV-2. The most important information we found from the study is that involvement of water molecules at the interfacial domain of the RBD and ACE2 receptor. We observed that for both SARS-CoV and SARS-CoV-2 the interfacial domain is spontaneously hydrated and a significant number of water molecules populate the gap in between RBD and ACE2 domain. It is found that bridge water molecules play a significant role in stabilizing the protein-protein binary complex. This kind of water-mediated interaction for SARS-CoV and SARS-CoV-2 did not observe in the previous study. The present study unravels the many structural and dynamical features of this protein-protein dimer complexes that help us to understand how this virus interacts with the human cell. Clinical Features of Patients Infected with 2019 Novel Coronavirus in Clinical Characteristics of Novel Coronavirus Cases in Tertiary Hospitals in Hubei Province Clinical Characteristics of 138 Hospitalized Patients With 2019 Novel Coronavirus-Infected Pneumonia in Wuhan, China A Pneumonia Outbreak Associated with a New Coronavirus of Probable Bat Origin A Novel Coronavirus from Patients with Pneumonia in China Cryo-EM Structure of the 2019-NCoV Spike in the Prefusion Conformation. Science Are Resistant to Conformational Changes Induced by Receptor Recognition or Proteolysis Cryo-EM Structures of MERS-CoV and SARS-CoV Spike Glycoproteins Reveal the Dynamic Receptor Binding Domains Structure of SARS Coronavirus Spike Receptor-Binding Domain Complexed with Receptor. Science (80-. ) Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein Cryo-EM Structure of the SARS Coronavirus Spike Glycoprotein in Complex with Its Host Cell Receptor ACE2 Structural Basis for the Recognition of SARS-CoV-2 by Full-Length Human ACE2. Science (80-. ) Potent Binding of 2019 Novel Coronavirus Spike Protein by a SARS Coronavirus-Specific Human Monoclonal Antibody Genomic and Protein Structure Modelling Analysis Depicts the Origin and Infectivity of 2019-NCoV, a New Coronavirus Which Caused a Pneumonia Outbreak in Wuhan, China Fast Assessment of Human Receptor-Binding Capability of 2019 Novel Coronavirus (2019-NCoV) Computational Design of Peptides to Block Binding of the SARS-CoV-2 Spike Protein to Human ACE2 A Comparison of Successful and Failed Protein Interface Designs Highlights the Challenges of Designing Buried Hydrogen Bonds SARS-CoV-2 and SARS-CoV Spike-RBD Structure and Receptor Binding Comparison and Potential Implications on Neutralizing Antibody and Vaccine Development Enhanced Receptor Binding of SARS-CoV-2 through Networks of Hydrogen-Bonding and Hydrophobic Interactions CHARMM-GUI Membrane Builder Toward Realistic Biological Membrane Simulations GROMACS: Fast, Flexible, and Free High Performance Molecular Simulations through Multi-Level Parallelism from Laptops to Supercomputers Optimization of the Additive CHARMM All-Atom Protein Force Field Targeting Improved Sampling of the Backbone ϕ, ψ and Side-Chain Χ1 and Χ2 Dihedral Angles Molecular Dynamics Simulation of the Graphene-Water Interface: Comparing Water Models Particle Mesh Ewald: An N⋅log(N) Method for Ewald Sums in Large Systems LINCS: A Linear Constraint Solver for Molecular Simulations Crystal Structure and Pair Potentials: A Molecular-Dynamics Study The FoldX Web Server: An Online Force Field UCSF Chimera-A Visualization System for Exploratory Research and Analysis VMD: Visual Molecular Dynamics Silico Based Engineering Approach to Improve Transaminases for the Conversion of Bulky Substrates Structure of the SARS-CoV-2 Spike Receptor-Binding Domain Bound to the ACE2 Receptor PRODIGY: A Web Server for Predicting the Binding Affinity of Protein-Protein Complexes Protein-Protein Recognition: Crystal Structural Analysis of a Barnase-Barstar Complex at 2.0-.ANG. Resolution Role of Water Mediated Interactions in Protein−Protein Recognition Landscapes Anomalous Dynamics of Water Confined in Protein -Protein and Protein -DNA Interfaces Dynamics of Hydration Water Plays a Key Role in Determining the Binding Thermodynamics of Protein Complexes