key: cord-0713224-p9esp1tn authors: Han, Yanqiang; Wang, Zhilong; Wei, Zhiyun; Schapiro, Igor; Li, Jinjin title: Binding Affinity and Mechanisms of SARS-CoV-2 Variants date: 2021-07-26 journal: Comput Struct Biotechnol J DOI: 10.1016/j.csbj.2021.07.026 sha: f5cc2c6b1977828a2422216ed549829da80d7b8e doc_id: 713224 cord_uid: p9esp1tn During the rapid worldwide spread of SARS-CoV-2, the viral genome has been undergoing numerous mutations, especially in the spike (S) glycoprotein gene that encode a type-I fusion protein, which plays an important role in the infectivity and transmissibility of the virus into the host cell. In this work, we studied the effect of S glycoprotein residue mutations on the binding affinity and mechanisms of SARS-CoV-2 using molecular dynamics simulations and sequence analysis. We quantitatively determined the degrees of binding affinity caused by different S glycoprotein mutations, and the result indicated that the 501Y.V1 variant yielded the highest enhancements in binding affinity (increased by 36.8%), followed by the N439K variant (increased by 29.5%) and the 501Y.V2 variant (increased by 19.6%). We further studied the structures, chemical bonds, binding free energies (enthalpy and entropy), and residue contribution decompositions of these variants to provide physical explanations for the changes in SARS-CoV-2 binding affinity caused by these residue mutations. This research identified the binding affinity differences of the SARS-CoV-2 variants and provides a basis for further surveillance, diagnosis, and evaluation of mutated viruses. The novel coronavirus disease , caused by severe acute respiratory syndrome-coronavirus 2 (SARS-CoV-2) 1 , has emerged as a severe epidemic and a serious risk to the global public health and economy since its first outbreak in December 2019 [2] [3] [4] . Many studies have indicated that human angiotensin converting enzyme 2 (hACE2) is the entry receptor of SARS-CoV-2 [5] [6] [7] [8] , which is similar to that of SARS-CoV 9,10 . SARS-CoV-2 has been reported to use the homotrimeric spike (S) glycoprotein on the virion surface to implement receptor recognition and membrane fusion 2 . The S protein includes two main subunits (S1 and S2) that form the receptorbinding domain and the fusion peptide domain, in which S1 identifies and binds to the host cell and S2 implements membrane fusion. It has been reported that the receptor binding domain (RBD) within S1 subunit is a key functional component for binding to hACE2, and this binding is also a critical initial step in viral infection 1, 6, 8, 11 . Moreover, neutralizing monoclonal antibodies of the immune system have been reported to carry out their neutralization activity by binding to the RBD within the S protein and disrupting virus binding [12] [13] [14] [15] [16] . Therefore, an in-depth investigation of RBD binding to hACE2 is essential for drug design and antibody development. However, residue mutations in S glycoprotein can significantly affect virus infectivity and transmission efficiency, thus influencing the efficacy of neutralizing antibodies. During the global spread of SARS-CoV-2, a large number of reports have indicated that S glycoprotein mutations can enhance the infection of the virus, but the mechanism has not been explained [17] [18] [19] [20] . For example, the N439K variant, in which the ASN439 residue in the RBD is mutated to LYS439, first emerged in Scotland and spread widely in many European countries. N439K was reported to significantly enhance the binding affinity of S glycoprotein to the hACE2 receptor and influence the activity of neutralizing antibodies 21, 22 . Another widespread SARS-CoV-2 variant is 501Y.V1, which is considered to be the most severe mutation based on experimental evaluation of the structures and functions of the virus [23] [24] [25] , with increased binding affinity to the hACE2 receptor. The third SARS-CoV-2 variant found to have severe effects was reported at the end of 2020, when the 501Y.V2 variant was first discovered in South Africa and then found to be widely spread to a series of countries, including the UK, Switzerland, Finland, Japan and Australia, which has been observed to have an increased receptor-binding ability 25, 26 . The 501Y.V2 variant has three residue mutations in the S glycoprotein (K417N, E484K and N501Y mutations), which allow the virus to more easily bind to human cells 25, 27 . K417N refers to the residue mutation from LYS417 to ASN417, E484K represents the residue mutation from GLU484 to LYS484, and N501Y denotes the residue mutation from ASN501 to TYR501 28 . It is very rare for three mutations to appear in one variant at the same time, especially during large outbreaks. Therefore, it is very important to study the binding affinity of receptors to the 501Y.V2 variant, which is difficult to observe in the laboratory and can serve as an identification indicator of virus infection after mutation. In this work, we studied the structures, chemical bond changes, binding free energies (enthalpy and entropy), and residue contribution decompositions of the three most widespread and important S glycoprotein variants (501Y.V1, 501Y.V2, N439K) of SARS-CoV-2. As shown in Fig. 1(a) , the 501Y.V1 variant contains N501Y mutation in the RBD, the 501Y.V2 variant contains the mutations of K417N, E484K, and N501Y in the RBD, and the N439K variant contains the mutation of N439K in the RBD. Based on structural analysis and molecular dynamics simulations, we studied the effects of the three variants on hACE2 binding affinity and mechanisms found the following discoveries. (1) Structural and sequence analysis showed that the molecular weight and stability changed significantly, but the aliphatic index and half-life did not change after the residue mutations, as shown in Table S1 of the supplementary information (SI). The 501Y.V1 and 501Y.V2 variants exhibited fewer hydrophobic properties and more lyotropic properties than wild-type SARS-CoV-2. (2) The root-mean-square deviations (RMSDs) and root-mean-square fluctuations (RMSFs) of the 501Y.V1, 501Y.V2, and N439K variants were all greater than those of wild-type SARS-CoV-2, indicating the increased flexibility of the SARS-CoV-2 residues after mutations and the enhanced binding affinities between the three variants and the hACE2 receptor. (3) Structural and sequence-based analysis Thr333-Gly526 refers to the sequence of the RBD region. The SARS-CoV-2 S glycoprotein, a large homotrimeric protein, plays a major role in viral when entry into host cells. Recent studies on the S glycoprotein and its interaction with the cell receptor (hACE2) have shown that their binding is a critical initial step for SARS-CoV-2 to enter host cells, in which the binding of the RBD domain induces the dissociation of the S1 subunit from hACE2 and promotes the transformation of the S2 subunit to a stable post-fusion state for membrane fusion [6] [7] [8] 11, 29 . The S protein is comprised of a S1 subunit and a S2 subunit. As shown in Fig. 1 (a) , the S1 subunit of the S protein is composed of an N-terminal domain (NTD), receptor binding domain (RBD), subdomain 1 (SD1) and subdomain 2 (SD2), while the S2 subunit is comprised of a fusion peptide (FP), heptad repeat 1 (HR1), heptad repeat 2 (HR2), transmembrane region (TM) and intracellular domain (IC). The mutations in the RBD of the three studied variants (501Y.V1, 501Y.V2 and N439K) are also shown in Fig. 1 (a) . Both the 501Y.V1 and N439K variants have single mutated residues in the RBD, while the 501Y.V2 variant is comprised of three mutated residues (K417N, E484K, and N501Y). The secondary structure, along with the sequence of the RBD, is shown in Fig. 1 where the RBD is highlighted in red. The sequences of the wild-type SARS-CoV-2 RBD and the three mutated variants were analyzed by the ProtParam tool 30 . Table S1 in SI shows the physical and chemical parameters of the wild-type SARS-CoV-2 RBD and the three variants, including formula, molecular weight, theoretical pI, instability index, aliphatic index, grand average of hydropathicity (GRAVY), and estimated half-life (mammalian reticulocytes, in vitro). As can be observed in Table S1 , the molecular weights of the mutated structures changed obviously, but the aliphatic indexes and half-lives did not. The instability indexes of the four structures in Table S1 were all below 40, indicating that the structures were stable with or without the studied mutations. Furthermore, the 501Y.V2 and 501Y.V2 variants exhibited the same GRAVY (-0.205), which was less negative than that of the wild-type RBD (-0.216), indicating that they were less hydrophobic and more lyotropic. In this study, we investigated the binding affinity of the SARS-CoV-2 RBD and hACE2 complex (PDB ID: 6M0J) 29 . The crystal structure of the RBD-hACE2 complex is shown in Fig. 2 After energy minimization, heating and density equilibration, the RBD-hACE2 complex solution was equilibrated with a 1.2 ns MD simulation in an NPT ensemble with a temperature of 300 K and a pressure of 1 atm. Then, a 40 ns production run was performed while keeping the system stable with the convergence of temperature, density and total energy. Since the RMSDs of the four complexes reached convergence at 10 ns, the MD trajectory of 10-40 ns was used for the binding energy calculation, residue contribution decomposition and interaction analysis. Table 1 From the calculated total enthalpy ( ) in Table 1 , the SARS-CoV-2 RBD-∆ hACE2 complex of the 501Y.V1 variant exhibited the strongest (the lowest) binding energy of -53.073 kcal/mol, while the wild-type RBD-hACE2 complex showed the weakest (the highest) binding energy of -46.633 kcal/mol. However, the total enthalpy did not take into account the entropy contribution, which depends on the alternation of motional freedom induced by RBD binding to hACE2 and contributes significantly to the total binding free energy. After considering the entropy contribution, the total binding free energies ( ) in Table 1 indicated that the three variants all showed ∆ stronger binding free energies than the wild-type RBD-hACE2 complex (-12.965 kcal/mol), which is consistent with previous experimental investigations [23] [24] [25] 27, [34] [35] [36] . However, compared with the single N501Y mutation, the additional K417N and E484K mutations resulted in a decreased affinity between 501Y.V2 and hACE2 (as shown in Table 1 ). In addition, the N439K variant showed only slightly lower binding free energy than the wild-type SARS-CoV-2. From Table 1 (Table S3 and S4). As Tables S3-S4 shown, the 501Y.V1 variant is commonly considered to significantly increase the binding affinity of RBD to ACE2, while 501Y.V2 and N439K show small increasement in binding affinity. Overall, our results show that the mutations (N501Y, K417N, E484K, and N439K) in the three variants were all capable of enhancing the binding affinity and thus probably increasing the infectivity of the virus, and this result agrees well with previously published experimental results 40 . Fig. 3. From Fig. 3 (a)-(d) , GLN493 in the SARS-CoV-2 RBD broke the salt bridge between LYS31 and GLU35 in hACE2 and formed hydrogen bonds with these two residues. During the binding process, PHE486 in the SARS-CoV-2 RBD enhanceed the hACE2 binding affinity by creating a hydrophobic pocket involving MET82 and TYR83 in hACE2 (Fig. 3 (a)-(d) ). For wild-type SARS-CoV-2, the TYR505 residue contributed the most to the binding of RBD with hACE2, with a large electrostatic energy and van der Waals energy. However, the mutated TYR501 became the residue with the greatest contribution for binding to hACE2 in the 501Y.V1 and 501Y.V2 variants (-5.11 kcal/mol for 501Y.V1 and -6.102 kcal/mol for 501Y.V2, both of which are highlighted in orange in Table 2 ), compared with ASN501 (-2.785 kcal/mol), which was the residue with the fourth greatest contribution for the wild-type SARS-CoV-2 RBD (as shown in Table 2 ). These studied mutations mainly enhanced the binding affinity of the SARS-CoV-2 RBD by forming hydrogen bonds and salt bridges, thus changing the electrostatic and van der Waals energies and promoting hydrophilic interactions. Fig. 3 (b)-(d) show the interfaces and residues between the SARS-CoV-2 RBD and hACE2 for the wild-type 501Y.V1 variant, 501Y.V2 variant and N439K variant, respectively. In the 501Y.V1 and 501Y.V2 variants, the mutation from ASN501 to TYR501 formed strong hydrogen bonds and salt bridges between residues TYR501-TYR41 and THR500-ASP355, as shown in Fig. 3 (b) and (c). In addition to the newly formed hydrogen bond between TYR501 and TYR41, a weaker hydrogen bond between residues GLU484 in RBD and LYS31 in hACE2 was also generated in the 501Y.V1 variant. However, such weaker hydrogen bonds were destroyed in the 501Y.V2 variant. In addition to breaking the weak hydrogen bond between GLU484 and LYS31, the hydrogen bond between LYS417 and ASP30 in wild-type SARS-CoV-2 was also broken in the 501Y.V2 variant. Therefore, compared with the 501Y.V1 variant, the 501Y.V2 variant had a weaker binding affinity for the cellular target in the host. Fig. 3 (d) shows the composite structure of the N439K variant, which formed new but weak hydrogen bonds between residues LYS439 and GLU329, thus slightly enhancing the binding affinity of the SARS-CoV-2 RBD (Fig. 3 (d) ). From the perspective of chemical bond generation and breaking, Fig. 3 explains why 501Y.V1 had the highest binding affinity to humans among the three variants, followed by N439K and 501Y.V2. residues in the SARS-CoV-2 RBD, which were decomposed into van der Waals energy (vdW), electrostatic energy (Ele), polar solvation energy (Polar) and non-polar solvation energy (Non-polar). As shown in Fig. 4 and Tables S5-S8, the enhancement of binding affinity for variants was mainly attributed to the energetic changes in electrostatic and van der Waals interactions, which is consistent with the formation of salt bridges and hydrogen bonds (Fig. 3) . In summary, the mutations in the variants mainly enhanced the binding affinity by forming hydrogen bonds and salt bridges, thus changing the electrostatic, van der Waals and polar energies. In conclusion, we investigated the binding affinity and mechanisms of the three most widespread SARS-CoV-2 variants (501Y.V1, 501Y.V2, N439K) based on molecular dynamics simulations and sequence analysis. Compared with wild-type SARS-CoV-2, we demonstrated that the 501Y.V1 and 501Y.V2 variants showed less hydrophobicity and more solubility behaviors and that the molecular weight and stability of the three variants were significantly changed. According to the calculations of the binding free energies, the binding affinities of 501Y.V1, N439K, and 501Y.V2 variants were significantly increased by 36.8%, 29.5%, and 19.6%, respectively. We further analyzed the interaction and residue contribution and demonstrated that the mutated residues enhance the binding affinity mainly by forming hydrogen bonds and salt bridges, which changed the electrostatic and van der Waals energies and enhance the hydrophilic effect. For example, both the 501Y.V1 and N439K variants were able to form new hydrogen bonds, thus enhancing their binding affinity with hACE2. The method and results presented in this paper will provide a theoretical basis for further surveillance, diagnosis, and evaluation of the binding affinity of variant viruses. In this study, the crystal structure of the wild-type SARS- In this study, the sequences and structures of the wild-type SARS-CoV-2 RBD and three mutated variants were analyzed based on previous experimental studies 29, 40 . In addition, the physical and chemical parameters for the wild SARS-CoV-2 RBD and the three variants, including the formula, molecular weight, theoretical pI, instability index, aliphatic index, grand average of hydropathicity (GRAVY), and estimated half-life (mammalian reticulocytes, in vitro), were obtained by ProtParam 30 , a tool for protein sequence analysis. In this study, parallel molecular dynamics (MD) simulations and binding free energy calculations of the RBD-hACE2 complexes were performed using Amber16 45 software. The ff14SB force field 46 was used to generate the parameters of the RBD-hACE2 complexes. An explicit solvent model was used during the MD simulations by solvating the RBD-hACE2 complexes in a rectangular TIP3P 47 water box with a 12 buffer. Å Counterions (Na + and Cl -) were added to neutralize the whole system. A 30,000-step energy minimization was first performed on the system. Then the solution system was gradually heated from 0 to 300 K with an NVT ensemble (N, V and T denote the number of particles, volume, and temperature, respectively) during a 100 ps simulation, followed by a 100 ps simulation for density equilibration with weak restraints (10 kcal mol -1 -2 ) on the RBD-hACE2 complex. Subsequently, the system was equilibrated for Å 1 ns in an NPT ensemble with a temperature of 300 K and pressure of 1 atm. Finally, a 40 ns production run without any restriction was performed with the NPT ensemble for each solution system, with all hydrogen atoms constrained using the SHAKE algorithm. The cutoff distance for long-range nonbonded interactions was set to 8 Å and the time step was set to 2 fs. The energy minimization and MD simulation were all performed with the sander program in Amber16. The last 30 ns trajectory of the production run was used to calculate the binding free energy. The RMSD, RMSF, SASA and R g were analyzed with the cpptraj program in Amber16. The binding free energy ( ) of the RBD-hACE2 complex system were obtained ∆ by the following equation: where , , and were the free energies of the complex, RBD and ∆ ∆ ∆ ℎ 2 hACE2, respectively. For each subsystem, the free energy was calculated by the following equation: ∆  Sequence and structural analysis and theoretical simulations were used to investigate the binding affinity and mechanisms of the SARS-CoV-2 variants.  The 501Y.V1, 501Y.V2 and N439K variants increased binding affinities significantly, and the 501Y.V1 variant yielded the highest binding enhancement (increased by 36.8%).  The mutated residues enhanced the binding affinity mainly by forming hydrogen bonds and salt bridges. Research reported in this publication was supported by the SJTU Global Strategic Partnership Fund (2020 SJTU-HUJI) and the National Natural Science Foundation of China (No.21901157). The data that support the findings of this study are available from the corresponding author upon reasonable request. Supporting information provides the physical and chemical parameters for the wildtype SARS-CoV-2 RBD and the three variants (Table S1 ); the solvent accessible surface area (SASA) and the average radius of gyration for the four RBD-hACE2 complexes (Table S2) ; experimental relative equilibrium dissociation constant (KD) of the four RBD-hACE2 complexes compared with the wild-type complex ( Table S3) ; comparison of binding free energies changes of the three variants between theoretical calculations and experimental results (Table S4) Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding A pneumonia outbreak associated with a new coronavirus of probable bat origin A new coronavirus associated with human respiratory disease in China A Novel Coronavirus from Patients with Pneumonia in China Structural basis for the recognition of SARS-CoV-2 by full-length human ACE2 Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein Functional assessment of cell entry and receptor usage for SARS-CoV-2 and other lineage B betacoronaviruses SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor Proteolytic activation of the SARS-coronavirus spike protein: Cutting enzymes at the cutting edge of antiviral research Activation of the SARS coronavirus spike protein via sequential proteolytic cleavage at two distinct sites Potent binding of 2019 novel coronavirus spike protein by a SARS coronavirusspecific human monoclonal antibody Human neutralizing antibodies elicited by SARS-CoV-2 infection Potent neutralizing antibodies against multiple epitopes on SARS-CoV-2 spike Potent Neutralizing Antibodies against SARS-CoV-2 Identified by High-Throughput Single-Cell Sequencing of Convalescent Patients' B Cells A human neutralizing antibody targets the receptor-binding site of SARS-CoV-2 Studies in humanized mice and convalescent humans yield a SARS-CoV-2 antibody cocktail Tracking Changes in SARS-CoV-2 Spike: Evidence that D614G Increases Infectivity of the COVID-19 Virus A virus that has gone viral: amino acid mutation in S protein of Indian isolate of Coronavirus COVID-19 might impact receptor binding, and thus, infectivity Emerging genetic diversity among clinical isolates of SARS-CoV-2: Lessons for today Emergence of genomic diversity and recurrent mutations in SARS-CoV-2 COG-UK update on SARS-CoV-2 Spike mutations of special interest Circulating SARS-CoV-2 spike N439K variants maintain fitness while evading antibody-mediated immunity Deep Mutational Scanning of SARS-CoV-2 Receptor Binding Domain Reveals Constraints on Folding and ACE2 Binding Adaptation of SARS-CoV-2 in BALB/c mice for testing vaccine efficacy Emerging SARS-CoV-2 variants reduce neutralization sensitivity to convalescent sera and monoclonal antibodies Detection of a SARS-CoV-2 variant of concern in South Africa Sixteen novel lineages of SARS-CoV-2 in South Africa GISAID -Novel variant combination in spike receptor binding site Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor Protein Identification and Analysis Tools on the ExPASy Server Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation Interaction Entropy: A New Paradigm for Highly Efficient and Reliable Computation of Protein-Ligand Binding Free Energy An accurate free energy estimator: based on MM/PBSA combined with interaction entropy for protein-ligand binding affinity Early transmissibility assessment of the N501Y mutant strains of SARS-CoV-2 in the United Kingdom Estimated transmissibility and impact of SARS-CoV-2 lineage B.1.1.7 in England Updated rapid risk assessment from ECDC on the risk related to the spread of new SARS-CoV-2 variants of concern in the EU/EEA -first update The basis of a more contagious 501Y.V1 variant of SARS-CoV-2 Mutation N501Y in RBD of Spike Protein Strengthens the Interaction between COVID-19 and its Receptor Experimental Evidence for Enhanced Receptor Binding by Rapidly Spreading SARS-CoV-2 Variants Key residues of the receptor binding motif in the spike protein of SARS-CoV-2 that interact with ACE2 and neutralizing antibodies Structural basis of receptor recognition by SARS-CoV-2 The Protein Data Bank SWISS-MODEL: homology modelling of protein structures and complexes QMEANDisCo-distance constraints applied on model quality estimation ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SB A modified TIP3P water potential for simulation with Ewald summation MMPBSA.py: An Efficient Program for End-State Free Energy Calculations Performance comparison of generalized born and Poisson methods in the calculation of electrostatic solvation energies for protein structures Writing-Original draft preparation, Visualization Zhilong Wang: Data curation, Software, Visualization, Validation Zhiyun Wei: Conceptualization, Methodology, Supervision, Resources, Writing-Reviewing and Editing Igor Schapiro: Software, Data curation, Validation Jinjin Li: Conceptualization, Formal analysis, Funding acquisition The authors declare no conflict of interest.