key: cord-0968452-bt28ibgk authors: Khan., Abbas; Wei, Dong‐Qing; Kousar, Kafila; Abubaker, Jehad; Ahmad, Sajjad; Ali, Javaid; Al‐Mulla, Fahd; Ali, Syed Shujait; Nizam‐Uddin, N.; Mohammad Sayaf, Abrar; Mohammad, Anwar title: Preliminary Structural Data Revealed That the SARS‐CoV‐2 B.1.617 Variant's RBD Binds to ACE2 Receptor Stronger Than the Wild Type to Enhance the Infectivity date: 2021-07-05 journal: Chembiochem DOI: 10.1002/cbic.202100191 sha: 799070d47270849d9e7235cf9e4191c0a127b575 doc_id: 968452 cord_uid: bt28ibgk The evolution of new SARS‐CoV‐2 variants around the globe has made the COVID‐19 pandemic more worrisome, further pressuring the health care system and immunity. Novel variations that are unique to the receptor‐binding motif (RBM) of the receptor‐binding domain (RBD) spike glycoprotein, i. e. L452R‐E484Q, may play a different role in the B.1.617 (also known as G/452R.V3) variant's pathogenicity and better survival compared to the wild type. Therefore, a thorough analysis is needed to understand the impact of these mutations on binding with host receptor (RBD) and to guide new therapeutics development. In this study, we used structural and biomolecular simulation techniques to explore the impact of specific mutations (L452R‐E484Q) in the B.1.617 variant on the binding of RBD to the host receptor ACE2. Our analysis revealed that the B.1.617 variant possesses different dynamic behaviours by altering dynamic‐stability, residual flexibility and structural compactness. Moreover, the new variant had altered the bonding network and structural‐dynamics properties significantly. MM/GBSA technique was used, which further established the binding differences between the wild type and B.1.617 variant. In conclusion, this study provides a strong impetus to develop novel drugs against the new SARS‐CoV‐2 variants. Over the last two decades, Asia has been the epicentre of three highly pathogenic human coronavirus epidemics caused by severe acute respiratory syndrome coronavirus (SARS-CoV), Middle East respiratory syndrome (MERS-CoV), and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), respectively. [1] SARS-CoV-2 was first identified in December 2019, in Wuhan, China, and acknowledged as a previously unknown βcoronavirus. [2] SARS-CoV-2 has been demonstrated to transmit effectively among human populations because of its striking potential for rapid transmission through aerosol and fomites, making the coronavirus disease-19 (COVID-19) a pandemic. [3] SARS-CoV-2 is a single-stranded and positive-sense RNA genome whose size is approximately equal to~30 kb. [4] The genome codes for accessory, non-structural, and structural proteins (spike [S] , envelope [E] , membrane [M] , and nucleocapsid [N] ). [3a,5] SARS-CoV-2 infection begins with the virus binding to a host's cellular angiotensin-converting enzyme (ACE2) receptors. [6] This binding event involves virus surface S protein. The protein comprises S1 and S2 subunits; the S1 subunits contain a receptor-binding domain (RBD) and facilitates virus binding to ACE2, while the S2 subunit allows membrane fusion as a sequel to the viral genome's integration into the host's genome, initiating infection. [7] Reports from England demonstrated that mutations in the S protein led to a new contagious strain named "B.1.1.7". [11] These mutations in this strain may enhance virus transmissibility and infectivity, including the deletion of residues 69-70 and 144 and the substitution of A570D, D614G, T716I, S982A, D1118H, P681H, K417N, K417T, E484 K and N501Y. [8] From a vaccination perspective, these mutations may support the virus's immune evasion capability, rendering vaccines ineffective against the B.1.1.7 strain. [9] More worrisome is the continuous spread of new SARS-CoV-2 strains in South Africa (B.1.351) and the United Kingdom (B.1.1.7), carrying mutations of N501Y and E484 K within the RBD domain. In the United Kingdom, the number of 501Y mutationbased cases has been observed to increase from 0.1 % to 49.7 % in just one month. This mutation has also been reported in cooccurrence with other mutations in orf8, orf1a, and N proteins. [10] Both the B.1.351 and B.1.1.7 variants are more contagious than the initially reported strain. However, the B.1.1.7strain is unlikely to evade the protective immunity developed through the currently developed vaccine, whereas the B.1.351variant may overcome this immunity. [9c,11] Very recently, a novel B.1.617 variant with two mutations L452R and E484Q on the RBD domain was identified in India. Around 10,787 samples were collected from 18 Indian states, uncovering 771 cases of known variants 34 cases of the B.1.351 variant, 736 cases of the B.1.1.7 variant, and one case of the P.1 variant. According to government officials, an increasing number of variants with mutations, specially E484Q and L452R, have been seen in samples collected from the western state of Maharashtra. [12] Such double mutations may confer increased infectivity and facilitate evasion of the immune system. In time, the new SARS-CoV-2 B.1.617 variant may be more pathogenic, enhancing transmissibility; therefore, close analysis of such variants and predictions of dangerous mutations is crucial in controlling devastating effects of the COVID-19 in the future. Thus, understanding SARS-CoV-2 mechanisms is essential to determine how the mutations help the virus to survive by protecting themselves from hosts' immune defence. Computational approaches in predicting the impact of these mutations on the S protein structure, function and binding to ACE2 offer great promise for devising therapeutic strategies. [13] Accordingly, in this present study, our objective is to employ several computational techniques to decipher the impact of the newly emerged B.1.617 carrying L452R-E484Q mutations on the overall S protein structure and function, assessing the effect on binding with ACE2 and the overall impact on interactions network. These findings may aid in opening new avenues of curtailing the SARS-CoV-2. Our study began with a retrieval of the SARS-CoV-2 spike protein sequence (accession number: P0DTC2) from the UniProt database [14] to map mutations exact location. Homology modelling of the mutated sequence was performed using the Modeller tool. The wild-type spike protein sequence (PDB ID: 6M0J) was then accessed from the protein data bank (PDB) [15] and, subsequently, used in the UCSF Chimera v.1.15 interactive visualization program [16] to generate the desired E484Q-L452R mutant. The 6XDG (E) and 6M0J (E) structures reported experimentally were used as modelling templates. Then, molecular docking of the mutants with an ACE2 receptor molecule was conducted through a high ambiguity-driven protein-protein docking (HADDOCK) server, [17] following the protocol used by Abbas et al. in 2021. [11] The dissociation constant (K D ) for the complexes was determined using the PROtein binDIng enerGY prediction (PRODIGY) server. [18] Dynamics of the spike wild type and B. We examined the structural dynamics of the B.1.617 mutant-ACE2 complex vis-à-vis the wild spike-ACE2 complex through a repeated molecular dynamics simulation run of 500 ns. This was accomplished through AMBER20 simulation package using FF19SB to describe receptors-ligand backbone and sidechain parameters. [19] Complete details of systems preparation and running MD simulation production run were followed from Abbas et al., 2021. [11] Briefly, systems were solvated in a TIP3P solvation box and afterwards neutralized by adding an appropriate number of counter ions. Systems minimization was achieved via 6000 and 3000 steps of steepest descent and conjugate gradient algorithms. Heating of systems was done to 300 K, followed by the equilibration phase. The equilibrium for each system was attained at almost 50 ns and then the final stage production run for each system was run at time scale of 500 ns. Simulation trajectories were generated that were analysed through CPPTRAJ module of AMBER. [20] Estimation of binding free energy The binding free energies of both the mutant and the wild-type complexes were estimated through the AMBER MMGBSA.py module. [21] The systems were subjected to the MM/GBSA method, and energy terms such as electrostatic energy, van der Waals energy (vdW), and polar and non-polar solvation energies were determined and is widely used by various studies. [22] The energy terms were also mathematically processed to estimate the system's net binding free energy, using the equation below: DG net binding energy ¼ DG complex binding energy À ½ DG receptor binding energy þ DG ligand binding energy � Each of the above components of net binding energy can be split as follows: þG polar solvation energy þ G nonÀ polar solvation energy The entropic calculation was not conducted since it is computationally expensive and time consuming process and also highly susceptible to significant inaccuracies. [23] The recent SARS-CoV-2 variants that have emerged in different countries of the world have developed a distressing situation. The appearance of this novel strain demonstrates that the wildtype strain has been exposed to increased genetic pressure and has developed genetic variants. Variations produced by such genetic changes may have resulted in significant changes in propagation, virulence, and clinical implications. Recently Abbas et al., employed protein-protein docking methods in a biophysical investigation to examine the effect of mutations reported in the United Kingdom, South Africa, Brazil, and other countries on the structure and binding of ACE2 and revealed that the B.1.351, P.1 variants are more dangerous and may seriously threaten the efficacy of the already developed vaccine. Recently, a new variant known as B.1.617 of SARS-CoV-2 was reported in India, and it might have contributed to increased infectivity and immune evasion. These mutations are acquired by the RBD of the spike protein. The spike protein is a multidomain protein that uses the RBD domain to bind to the human ACE2 protein. Because of the essential role of Spike protein in binding and pathogenesis, it has been deemed as a potential drug target to treat COVID-19 patients. These mutations may be related to functional heterogeneity and may particularly use a different approach to infection. The RBDspecific mutations (L452R-E484 K) reported in India requires careful examination in the search for new therapeutic alternatives. Therefore, in the current study, we performed comparative binding and biophysical analysis of the wild type and B.1.617 mutant upon the interaction with ACE2. The primary sequence of the wild-type spike-RBD was obtained from UniProt, and the reported mutations were generated. Using Modeller, the structure of the B.1.617 variant was modelled. The multi-domain representation of the spike protein, the modelled RBD structure of the B.1.617 variant, and sequence alignment are given in Figure 1 . Using HADDOCK, protein-protein docking of ACE2 with the wild spike-RBD and B.1.617 variant was performed to explore the structural mechanism of the SARS-CoV-2 variants. Previously Abbas et al., reported that with 1 salt-bridge (between Glu30 and Lys417), 11 hydrogen bonds and 139 non-bonded interactions, the docking score for the wild type was reported to be À 122.6 + /À 0.7 kcal/mol. Using a similar approach, docking of The resulting docking score was À 123.0 + /À 3.5 kcal/mol. Interaction analysis showed that the new variant altered the bonding network and formed distinct interactions. The mutated residues formed significant interactions with ACE2. Only one salt bridge (between Glu30 and the mutated Arg452), seven hydrogen bonds and 129 non-bonded contacts were reported. Similarly, only 7 hydrogen bonds were formed among Glu30-Arg452, Glu30-Arg452, Tyr34-Gln484, Glu35-Tyr453, Glu35-Gln493, Gln37-Tyr489 and Glu75-Asn501 residues. It can be seen that the bonding pattern is significantly altered by these mutations. Likewise, the new substituted residues formed three hydrogen bonds: two between Glu30 and Arg452 and another one between Tyr34 and Gln484. This shows that the binding pattern is significantly affected by the mutation. Interestingly the electrostatic energy contribution (À 205.3 + /À 11.2 kcal/mol) endured as the primary contributing influence in the B.1.617 variant. This notion of more electrostatic energy is supported by previous studies [24] that state that evaluation of the binding of SARS-CoV-2 and SARS-CoV RBDs with ACE2 have revealed more hydrogen bonds and electrostatic interactions in SARS-CoV-2. [24, 25] However, the Van der Waals energy of the wild type and B.1.617 variant is comparable. These results obtained from docking only are further validated and explained in Hydrogen bonding analysis after MD simulation section. Decisively, this rationalizes that the binding of the mutant complex is principally due to the electrostatic contribution and further suggests that the notable differences in the binding pattern are noteworthy for differential infectivity. We further determined the K D (dissociation constant) of the wild and mutant complexes to provide convincing insights into the binding distinctions. It is used to estimate and order the strengths of bio-macromolecular association. [26] The K D kinetics are frequently used to foresee the affinity of antigen-antibody associations, protein-ligand interactions, and significant biological macromolecule interactions. The lower the K D value, the stronger the interaction. [27] The K D values of diverse macromolecules association and their interactions have been calculated previously. [28] We used PRODIGY (PROtein binDIng enerGY prediction), an online server that computes the binding affinity, K D , for various biological complexes, to gain a deeper understanding of the binding of the wild type and mutant complex. Previous study reported a K D of 5.2 × 10 À 10 for the wild type, while here in this study, the predicted K D value for the B.1.617 variant was 2.0 × 10 À 08 , which shows a relatively tighter binding of the wild type than the B.1.617 variant for the docking conformation. Next, we extracted MD equilibrated average conformation to evaluate the binding strength. The K D value for the wild-type MD equilibrated (Average structure) 4.0 × 10 À 09 while for the B.1.617 variant MD equilibrated structure the K D value was predicted to be 1.4 × 10 À 09 . This shows that the B.1.617 variant equilibrated structure demonstrate stronger binding properties than the wild-type equilibrated structure. The lower K D values obtained in this analysis are consistent with the results obtained in in-vitro binding assays of SARS-CoV-2 and SARS-CoV, which reported a tighter binding of the SARS-CoV-2 RBD to the host receptor ACE2 than the SARS-CoV. [29] The docking scores along with the interaction energies and K D values for the wild type and B.1.617 variant are given in Table 1 . Next, we determined the consequence of the fixed aminoacid replacements reported within the RBD of the SARS-CoV-2 spike protein by calculating its stability as root mean square deviation (RMSD). RMSD can be used to identify variations between proteins' initial and final structural conformations. These conformational changes can be estimated with an RMSD during simulation, determining a biological molecule's stability. A smaller deviation during such simulations indicates a highly stable structure. RMSD values were calculated for each proteinprotein complex from the Cα backbone, using 500 ns repeated trajectories. As Figure 2A shows, the wild system attained equilibration at and stable at RMSD of 2.0Å. Almost all the systems attained the equilibrium position at 50 ns. The system remained stable during the simulation; however, a deviation in the RMSD was observed between 50-60 ns when it increased to 4.0 Å. This later decreased back to 2.0 Å and remained uniform for the wild-type complex until 500 ns with the exception of two very minor deviations at 225 ns and 390 ns. This pattern was also observed in a previous study, which reported the more stable behaviour of the wild-type system. The dynamic differences between the wild type and the mutated systems determine the binding differences and behaviour. The average RMSD for replicate 1 during the first 75 ns remained 4.0 Å; however, the RMSD abruptly increased and experienced significant convergence between 76-125 ns. The average RMSD between 76-125 ns was reported to be 6.0Å. Afterwards, the RMSD value decreased and remained stable until 200 ns. ) 1965.3 + /À 120.6 1840.6 + /À 86.3 Z-score À 1.9 À 1.3 K D (dissociation constant) docking conformation 5.2 × 10 À 10 2.0 × 10 À 08 K D (dissociation constant) MD equilibrated conformation 4.00 × 10 À 09 1.4 × 10 À 09 ChemBioChem Full Papers doi.org/10.1002/cbic.202100191 Afterwards from 220 ns to 300 ns, another period of the increased-decreased curve was noticed where maximum RMSD reaches~8 Å at 260 ns. From 310 ns onward, the RMSD again starting to surge till 420 ns, followed by another small up-down surge and then later in the simulation gain stability. Similarly, Replicate 2 also reported a higher RMSD value compared to the wild type. The average RMSD for Replicate 2 was 4.0Å. Replica 2, in comparison to Replica 1 is more stable in term of RMSD and considerably consistent. Previous studies have reported that fluctuations in RMSD values for these complexes are linked with the opening or closing motion of the claw-like structure in ACE2. [30] The higher RMSD values obtained at different time intervals throughout a simulation were probably associated with the binding and unbinding of some destabilized interactions in the ACE2-spike interface. Consequently, the unusual instability of the RBD was associated with the Darwinian selection-driven epistasis in protein evolution. [22g] The radius of gyration (Rg) was calculated to evaluate the compactness during the simulation, as given in Figure 2B . The Rg values increased and decreased during the simulation, and the average Rg value for the wild type was 31.2Å. In contrast, the average Rg values for the two replicates of the B.1.617 variant were 32.4Å and 33.0Å, respectively. Thus, the wild-type system seemed to remain more compact during the first 100 ns; however, this compactness was lost between 95 and 130 ns, and then it remained uniform again until 500 ns where it was seen in a very low Rg pattern. On the other hand, the two replicates remained less compact during the MD simulation. The structure of Replicate 1 remained substantially open. The initial Rg value continuously increased until 125 ns; however, the Rg then remained uniform until 200 ns. Similarly, the Replica 2 Rg values remained consistent with the RMSD results. No significant convergence was observed; however, the average Rg values remained higher than the wild type's. The fluctuation in Rg values during the simulations of all the studied systems was associated with the binding and unbinding of one or another end of the spike RBD domain. An increase and decrease in the Rg pattern can also be pointed to 500 ns though these variations between the wild type and replica 2 are comparatively less than the replica 1. Additionally, the bonding rearrangement during the MD simulation holds and releases the two receptors, which causes a perturbation in the structural compactness. The wild type exhibited minimal root mean square fluctuation (RMSF) compared to the reported mutant (replicates). In the B.1.617 variant, the residues 1-200 exhibited greater fluctuations than the wild type. Next, we found that the RMSF pattern for the wild type and Replicate 2 remained comparable ( Figure 3) ; however, for replicate 1 the residual fluctuation remained higher during the simulation than the wild type. For the mutant (both replicas), higher fluctuations were observed between regions 100-200 compared to the wild type. These fluctuations were linked with the presence of three important loops involved in binding with ACE2, thus highlighting the functional importance for binding, which is possible due to essential conformational changes during Darwinian evolution and amino acids fixation. This higher fluctuation is resulted from the terminal loop residues Ala522, Thr523, Val523 and Cys525 which exhibit unusual flexibility. These overall results suggest that possible evolutionary changes in the mutants have led to variations in their dynamic function. Furthermore, an intermolecular bonding network was explored for the wild type and B.1.617 variant by using an equilibrated average structure from MD trajectories to evaluate the differences in binding pattern accurately. As given in Figure 4A , the wild-type interaction pattern has been changed during the MD simulation while the B.1.617 variant formed extra interactions during the MD time period. A total of 11 hydrogen bonds and 1 salt bridge was detected in the wildtype complex, while 15 hydrogen bonds and two salt bridges were reported in the B.1.617 variant complex (Table 2 ). In the wild type, the two strong bonds between Tyr83-Asn487 and Glu35-Gln493 remained conserved and has also been reported in the previous study as sustained interactions. Similarly, the interaction Glu35-Gln493 plays an important role in locking the correct orientation of Gln493 for interaction is essentially retained during the 500 ns simulation, which is also reported by previous studies. [11,24b, 31] Interesting, a cluster of interactions by Lys353 was observed during the MD simulation. This residue, Lys353, formed a stronger network of hydrogen bonds with Gly496, Gln498 and Gly502. Interestingly, a previous study also reported the Lys353 interaction as a critical mediator in the attachment to the receptor. [11,24b, 31] Moreover, a single salt bridge between Glu30-Lys417 was observed, which is an essential element of the interaction network. The interaction pattern of the wild type RBD and ACE2 is shown in Figure 4A . Differences in the bonding network of the wild type and the B.1.617 variant was found significantly. In the docking conformations, only 7 hydrogen bonds were reported, while in the equilibrated MD structure, 15 hydrogen bonds and 2 salt bridges were established. Interestingly among the key interactions established between the ACE2 and B.1.617 variant, RBD formed shared contacts with the wild type. Among the 15 hydrogen bonds, the interactions formed by Tyr83, Glu35, Gln42 and Lys353 established by B.1.617 variant complex are conserved between the wild type and the variant structures. In the case of B.1.617 variant Ser19-Ala475, Glu30-Lys417, Tyr34- Ser494, His34-Gly496, Tyr41-Asn501 established essential new hydrogen bonds while Glu30 maintained two salt bridges with Lys417. The three hydrogen bond pairs Tyr41-Asn501, Lys353-Gln498 and Lys353-Gly502 reported here are also previously observed as the key interactions that were limited in the SARS-CoV RBD-ACE2 complex. [11,24b,31] Interestingly the interaction Glu38-Tyr449 established here was also reported in a previous study and concluded that this interaction might also play a significant role in the enhanced binding. Additionally, the two salt bridges may justify the increased electrostatic energy of the B.1.617 variant complex. Conclusively this shows that during the simulation, both the structures are altered significantly and established several new contacts, which are essential for the binding differences. The interaction pattern of B.1.617 variant complex is shown in Figure 4B , while the details of the interactions of both complexes along with the bonding distances are given in Table 2 . Furthermore, we also performed the total hydrogen bonding analysis (both intra and intermolecular) as a function of the total number of frames in order to understand the variations in the bonding network induced by the substitutions in the RBD of the Spike protein. From each trajectory, 25000 frames were considered, and the total number of hydrogen bonds during the simulation was calculated and presented in Figure 5 . In the wild type, the average hydrogen bonds during the 500 ns simulations were reported to be 388, while in replicate 1 the average hydrogen bonds were 394, and in replicate 2 the average hydrogen bonds were 396. These results are consistent with our docking results, which showed that the wild type has 11 hydrogen bonds while the mutant complex has only seven hydrogen bonds. This finding shows that the mutations (B.1.617 variant) have altered their hydrogen-bonding network and may use a different strategy than the B.1.1.7, B.1.351, and P.1 variants. Moreover, to justify our findings, we also calculated binding free energy using 500 ns simulation trajectories. The MM/GBSA energies of both the wild type and the B.1.617 variant was compared and contrasted to extract ideas about the mutations induced conformation changes in the RBD region and their influence on ACE2 binding. Detail atomic-level energies at different time intervals demonstrated that and presented in Supporting Table S1 while the binding free energies for the last 200 ns and their averages were calculated and given in Table 3 . The rationale behind the presentation of the 200 ns MM/GBSA result is due to uniform equilibration state attained by all the complexes including the wild type and mutant replicas. In a particular case of Replica 2 the equilibration was also attained in the earlier time intervals therefore, the free energy results for the 500 ns simulation time intervals are presented in the Supporting Table S1. The data show that the wild type (van der Waals energy: À 90.08 � 0.09 kcal/mol), the B.1.617 variant had a higher van der Waals energy contribution (À 96.90 � 0.12 kcal/mol in Replica 1 and À 98.43 � 0.13 kcal/ mol in Replica 2) upon ACE2 binding. However, the electrostatic energy is more favourable and seems to contribute heavily to the stability of the variant complex with ACE2. In both replicas, the B.1.617 variant showed double the electrostatic energy contribution of the wild type. This finding implies that the mutations in the B.1.617 variant may allow for a fitter conformation with ACE than the wild type. The electrostatic contribution to solvation energy for the wild type was (À 693.33 � 0.74 kcal/mol) and while for the B.1.617 variant (À 1037.22 � 0.70 kcal/mol in Replica 1 and À 1178.08 � 0.42 kcal/mol in Replica 2). The non-polar solvation energy estimated for the complexes favoured good stability for both the wild type and the variant. For wild type, the non-polar solvation energy was À 11.99 � 0.01 kcal/mol, versus À 12.00 � 0.01 kcal/mol and À 13.56 � 0.01 kcal/mol for Replica 1 and Replica 2, respectively. Overall, net binding energy revealed that the B.1.617 variant was somewhat stable in terms of complex formation with ACE2 (À 70.42 � 0.09 kcal/mol in Replica 1 and À 86.95 � 0.15 kcal/mol in Replica 2) compared to the wild type (À 63.97 � 0.09 kcal/mol). Unlike the results obtained through docking studies based on single structure prediction, the current results using 25000 structural frames concluded that the B.1.617 variant shows a stronger binding affinity with ACE2 than the wild type. Conclusively, these findings align with the above docking and simulation data are an indication that mutations' induced structural dynamics to support the RBD's increased binding to ACE2. This study was conducted to explore the binding differences of the spike RBD of the wild and B.1.617 variant (L452R-E484Q) with the host ACE2 by using combined structural modelling and biophysical approaches. This investigation revealed that the B.1.617 variant possess a stronger binding affinity for the host ACE2 in both replicates and the bonding network is substantially different from the wild type. We concluded that L452R-E484Q mutations and other factors such as genetic variability in the other proteins might help the virus enforce infection and its severity. This study provides a strong impetus to develop novel drugs the new variants. All the data are available on RCSB, UniProt and any simulation data would be provided on reasonable demand. The accession numbers to access this data are given in the manuscript. 6595-6628; b) A. Kawana, Nihon Rinsho Proc. Natl. Acad. Sci. USA 2020 Science 2021, eabg6105; b) SSRN 2021 The COVID-19 Genomics UK Revised manuscript received Accepted manuscript online Version of record online The authors declare no conflict of interest.Keywords: ACE2-spike docking · B.1.617 variant · biophysical simulation · dissociation constant · SARS-CoV-2