key: cord-0835261-liq62vko authors: Dokainish, Hisham M.; Sugita, Yuji title: Structural Ramifications of Spike Protein D614G Mutation in SARS-CoV-2 date: 2022-01-25 journal: bioRxiv DOI: 10.1101/2022.01.24.477651 sha: 1c60bcd23e9b291fad7b7987e6d7f3f0f1b23f78 doc_id: 835261 cord_uid: liq62vko A single mutation from aspartate to glycine at position 614 has dominated all circulating variants of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). D614G mutation induces structural changes in the Spike (S) protein that strengthen the virus infectivity. Here, we use molecular dynamics simulations to dissect the effects of mutation and 630-loop rigidification on wild-type structure. The introduction of mutation with ordered 630-loop induces structural changes toward S-G614 Cryo-EM structure. An ordered 630-loop weakens the stabilizing interactions of the anionic D614, suggesting its disorder in wild-type. The mutation allosterically alters the receptor binding domain (RBD) forming an asymmetric and mobile Down conformation, which facilitate Up transition. The loss of D614_K854 salt-bridge upon mutation, generally stabilize S-protein protomer, including the fusion peptide proximal region that mediates membrane fusion. Understanding of the molecular basis of D614G is crucial as it dominates in all variants of concern including Delta and Omicron. The emergence of new variants of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) continue to threaten the ongoing efforts to stop the pandemic 1-3 . Variants of concern (VOCs) usually show an enhanced virus fitness, either by altering transmissibility and/or severity rate as well as their potential evade from the naturally/vaccine acquired immunity, reducing the efficacy of the currently deployed vaccines 1, 4 . As of December 2021, several VOCs have been reported (e. g. and N501Y in Alpha, Beta and Gama, while L452R and T478K in different classes of Delta 1, 6, 8 . On contrast, the D614G mutation is more universal and has dominated all circulating strains worldwide within a month 10-12 . Mutations in Spike (S) protein are at the heart of the SARS-CoV-2 variants' evolution [13] [14] . S-protein is an octadomain homotrimeric glycoprotein that decorates the virus surface, playing a central role in its cycle [15] [16] [17] . It is formed of two subunits connected by a furin cleavage site 15, 18 . This includes the N-terminal subunit (S1) which mediates the virus binding to the host cell Angiotensin converting enzyme-2 (ACE2), and the C-terminal subunit (S2) which mediates membrane fusion, upon cleavage from S1 [19] [20] [21] . The S1 subunit consist of four domains, the N-terminal domain (NTD), the receptor binding domain (RBD) and two subdomains (SD1 and SD2). RBD undergo an essential conformational change from the ACE2 inaccessible (Down) to ACE2 accessible (UP), initiating cell entry [20] [21] . Mutations in S-protein were reported in both subunits, however the majority of variations mainly occur in S1 1, 4, [13] [14] . This includes: 1) RBD mutations such as N501Y, E484K and L452R, with enhanced ACE2 and/or reduced antibodies binding affinities 4, [6] [7] 22 ; 2) NTD mutations/deletions (e.g. H69-, H70-and R158G) with direct effect on NTD antibodies binding 5, 23 ; and 3) distal mutations in SD1 and SD2 subunits with allosteric effect on RBD structure and motion, such as the D614G and A570D mutations that shift RBD Down/Up populations 12, [24] [25] . D614G mutation is one of the early spotted dominant mutations in S1, as of March 2020 10 . It occurs as a result of a single nucleotide mutation (A-to-G) at the 23,403 position in the original Wuhan strain 26 . G614 has become globally dominant and currently occurs in all variants of concern and interest including, Alpha, Beta, Gama, Delta, lambda, Mu and even Omicron, suggesting its convergent evolution and central role 27 . Several studies investigated the effect of this mutation on the virus infectivity, severity, neutralization, and S-protein structure 2, 11-12, 24, 28-32 . Notably, D614G shows an enhanced transmissibility and infectivity rate not only in different cell lines but also in different species, which has been correlated to higher viral load 12 . The mutation was neither associated with higher mortality rate nor reduced neutralization 27 . In general, analysis of G614 Sprotein show that the effect of mutation is mainly exerted at the membrane fusion step where G164 S-protein (S-G614) conformational equilibrium is shifted toward the ACE2 accessible (UP) conformation, as reported in several studies 24, [28] [29] [32] [33] . However, contrast reports exist on how D614G mutation affect ACE2 binding with weakened or stronger binding 11, 30 . The mutation was also found to increase S-protein stability, where it prevents premature cleavage of the two subunits before ACE2 binding, while enhances cleavage upon binding 11, 29 . S-G614 Cryo-electron microscopy (Cryo-EM) studies have revealed several structural differences with respect to wild-type (S-D614). This includes an outward rotation in S1 (away from S2), more flexible RBD, flexible Down conformation and slightly open conformations even in Down 24, 29, 32 . Consequently, several mechanisms were proposed to explain the structural basis of abovementioned allosteric effects. Originally, such structural effects were attributed to break of a hydrogen bond with the adjacent S2 protomer Thr859, based on wild type Cryo-EM structure 10 . Although, several studies advocated such hypothesis 28, 33 , this proposal was later challenged suggesting the formation of salt bridge with Lys854 as shown in wild-type Cryo-EM structures ( Figure 1 ) 24, 32 . Note that D614 is located in SD2 at the interface with the adjacent protomer fusion peptide proximal region (FPPR); FPPR mediates membrane fusion after proteolysis at the TMPRSS2 cleavage site 29, 32 . Likewise, D614 is neighbored to a generally unresolved loop region in the majority of S-protein Cryo-EM structures, known as 630-loop ( Figure 1 ). Notably, Zhange et al. 32 shed the light on G614 induced structural changes as they show the rigidification of this loop in S-G614 and its insertion in a gap between NTD and SD1 (Figure 1 ), forming multiple hydrophobic contacts. The authors also suggested that such insertion is hindered in S-D614 due to smaller gap, inducing a disordered 630-loop. In contrast, such rigidification was also observed in the wild type S-D614 at low pH (D614 neutral), where it forms helical structure (Figure 1 Although D61G has been extensively studied, the link between mutation and structural changes are still elusive. Here, we use classical molecular dynamics simulations (cMD) to scrutinize the atomistic basis of D614G mutation, ordered vs disordered 630-loop, and their structural ramification on S-protein conformation, stability and subsequently infectivity. Starting from wild-type structure (PDB: 6ZGE (2.6 Å)) 35 , three S-D614 and one S-G614 systems (Table 1 and Figure S1 ) were investigated including S-D614 with disordered 630-loop (D614loop), S-D614 with ordered loop (D614SS), neutral S-D614 with ordered loop (DH614SS) and S-G614 with ordered loop (G614SS). Taking advantage of the trimeric nature of S-protein, we analyzed six protomers per system from two independent simulations for 1 each, summing up to 6 of simulation data per system (24 in total). To assign the induced conformational changes by mutation, we utilize the vast number of S-protein Cryo-EM structures guided by the work of Henderson et al. 36 and our previous study 37 . Wherein, we constructed an 11 beads per protomer coarse-grained model ( Figure 2a and Table S1 ) of the 52 S-protein three RBD Down structures (156 promoters). Next, we performed principal component analysis (PCA) of the monomeric (11 beads) as well as trimeric (33 beads) models. Figure 1a shows that the first and second PCs, both represent an outward motion of S1 with respect to S2, with a total contribution of over 84% of the observed motion. PC1 and PC2 differs only in the direction of outward motion toward relative down or up, respectively. A similar PCs were also obtained for the 33 beads trimeric model ( Figure S2a ). The projection of the 156 protomers on PC1-PC2 map in Figure The predicted outward motion in S1 was further analyzed by calculating the angles formed between S1 domains. Figure S4 shows that the angle formed by adjacent domains, SD1, SD2 and the base of NTD (NTD(b)) center of mass, is increased in all simulations with ordered 630-loop. D614loop have a probability distribution of 67°-85°, while D614SS and G614SS shows a distribution between 74° and 90°. Note that in G614SS, B-2 also shows the largest shift in agreement with monomeric PCA analysis. We also calculated the angle between non-adjacent domains RBD, SD2 and core of NTD, which also shows an increase in all systems with ordered 630-loop. Although 630-loop rigidification drive an insertion and increased inter-domain angle in S1 ( Figure S4 ), regardless of the nature of 614 residue, such motion doesn't necessary shift S-protein conformation to mutant structure ( Figure S3 ). Furthermore, monomeric and trimeric PCA analysis suggest that both ordered 630-loop and D614G mutation are required to reproduce S-G614 like structure. To lesser extent, 630-loop rigidification in S-D614 could drive changes in one protomer (Figure 2b ), raising a question if loop rigidification could happens in S-D614! Previous study suggest that ordered loop doesn't occur in D614 due to limited gap between NTD and SD1 32 . In contrast, our data indicate the formation of this gap as a consequence of the rigidification. To date, several computational studies were performed to study S-protein but not much consideration have been giving to the modeling of the missing 630-loop conformation and its effect on simulations' result, one might model it as helical or disordered [37] [38] [39] . Ironically, the modeling of such small region (less than 30 residues) has a great effect on the S-protein motion. This suggests careful modeling of Spike variants is needed especially if the mutation is located near an experimentally unresolved region. The correlation between D614G mutation and 630-loop rigidification as well as the possibility of secondary structure formation in the 630-loop in wild type S-D614 are uncertain. it is also unclear how the presence of anionic residue (D614) affect loop conformations. To answer these questions, we compare our simulations of S-D614 with ordered and disordered 630-loop. Structural and electrostatic potential analysis show that the D614 is located at the interface of a hydrophobic pocket ( Figure 3a ) with one charged residue in its vicinity (Lys854). D614 was proposed previously to be stabilized by H-bonding with Thr859 based on Cryo-EM structure (PDB:6VSB) 10, 28 , however investigation of the pdb structure show that the orientation of both residues doesn't reflect direct interaction. Figure 3a also shows that D614 forms two possible hydrogen bonds with K854 and main chain of G594, in PDB:6ZGE. Accordingly, we first analyzed the total number of hydrogen bonds that could stabilize the side chain of anionic D614. Although the unfolding of 630-loop is beyond the scope of this study, we compared the stability of helical structure adjacent to D614G mutation site. Figure 3c illustrate that the stability of helical structure between residue 620 and 623 is reduced in Comparison of H-bonding, hydrophobic interactions and secondary structure stability suggest that structural changes mainly originate from the breaking of salt bridge with K854 upon mutation. Wherein wild type S-D614 would preferer the formation of a flexible 630-loop to stabilize the anionic charge of D614. The formation of this salt bridge as well as D614_G594 H-bond forms a kink around residue 613, hinder the formation of -sheet and subsequently allow for different pattern of hydrophobic interactions including V615_V635. In contrast, D614G mutation free the loop region from this interaction, extending thesheet to include Q613 which increase the distance between V615 and V635 leading to formation of different forms of hydrophobic contacts. In addition, the loss of D614_K854 interaction free the loop region as indicated by their C distance that permit insertion between SD1 and NTD. Our results explain the possibility of observing such ordered loop in S-D614 at low pH 34 , due to the breaking of the salt bridge. This hypothesis can be confirmed experimentally upon mutating the K854 in wild type protein or by reducing the hydrophobicity at the D614 S2 interface. Two main mechanisms have been proposed to explain D614G superior transmission rate, which includes a regulated shedding mechanism of S1/S2 depending on the absence or presence of ACE2, and the shift in conformational equilibrium toward the RBD Up state 29, 32 . To understand the allosteric effect of the mutation, we compare our simulation of wild type with disordered loop D614loop with G614SS. Figure 4a shows that a change in one RBD due to mutation is expected to alter neighboring RBD organization. In figure 4b, we calculated the center of mass distances between all three RBDs. A general increase in RBD_RBD distance were observed in S-G614 (Figure 4b) , wherein RBD (B-2) shows the largest distances from the neighboring RBDs. Furthermore, G614SS RBD_SD1 hinge angle reflects the break of symmetry in Down conformation in comparison to wild-type (D614loop). These results suggest the formation of S-G614 like structure leads to more spacing between asymmetric RBDs, which forms a slightly more open conformation, in agreement with Cryo-EM structure 32 . Figure 7a and b, show that a similar effect was observed upon the rigidification of 630-loop in S-D614 (D614SS) but with lesser extent, supporting our finding that S-D614 would prefer flexible loop. The effect of RBD rearrangement on the exposure of receptor binding motif (RBM (residue 410-510)) and glycans cover in Down conformation was examined. The RBM average solvent-accessible surface area (SASA) value was generally reduced from 16352 Å 2 in D614loop to 1530.9 Å 2 in G614SS. This difference was scrutinized by calculating the per residue SASA value that indicate an overall reduction in all residues (Figure 4c ). Plotting the difference in accessibility on RBM surface reflect a shift of residue exposure due to RBD rearrangement, wherein K444 has become relatively more solvent exposed in S-G614 while V445 has higher accessibility in S-D614. Furthermore, an important point of mutation in Spike variants (E484) was found to be less exposed in G614SS. Comparison of SASA values in the absence of glycans ( Figure S8 ) clearly show large difference in accessibly between G614SS and D614SS, especially between 470-503, suggesting that RBD rearrangement alter RBM accessibility in Down conformation. The effect of D614G mutation on S-protein stability was elucidated by calculating the root mean square fluctuation (RMSF) in all protomers. G614SS shows an overall stabilizing effect upon mutation in all domains (Figure 4d ), in comparison to D614loop. Remarkably, the fusion peptide proximal region (FPPR) has better stability in the mutant S-G614. Despite the ordered loop in D614ss, FPPR has larger fluctuation that suggest the general destabilizing effect in S-D614 regardless of loop conformation. Protonating D614 (DH614SS), increase FPPR stability, confirming the role of D614_K854 salt bridge to the observed instability. The furin cleavage site was also found to be relatively stable in G614ss and DH614SS while the presence of anionic D614 show higher fluctuation. Figure 4d , also show that the rearrangement of RBD loosen its hook region, leading to higher fluctuation and disorder in this region. Such disorder was recently observed in other spike variants of concern that includes the D614G mutation 1 . Allosteric effect of D614G mutation alters S-protein conformation at different level that goes beyond the Cryo-EM observed S1 rotation. First, the mutation induces a rearrangement in RBD organization breaking the symmetry. Note that the break of symmetry and the formation of flexible Down is a prerequisite for N343 glycan contact changes that initiate the up conformational transition, as suggested previously 37 . In addition, the high disorder in RBD hook region might also help loosening RBD_RBD interaction and Up transition, which align with previous proposal for E484K mutant 1 . These results might explain the origin of the observed higher population in Up conformation in comparison to wild type. Likewise, Gobeil et al 1 . previously suggested the increase RBD mobility in Down reduces the barrier for Up transition in B.1.1.17. Second, the rearrangement of RBD was also found to alter RBD interface and residue solvent exposure as indicated by SASA values. Such reduction in accessibility might partially compensate for the larger exposure to neutralization due to the conformational shift to Up state. Indeed, S-G614 was found to be moderately more sensitive to neutralization despite large shift in Up population 30 . Third, our simulations also show the general stabilizing effect of the mutation, especially on FPPR and furin cleavage site. It also shows similar increase in stability upon the protonation of D614. Note that Cryo-EM structure also suggest a change from disorder to order upon mutation. Gobeil et al. 1 Cryo-EM study of different variant of concern has suggested the regulatory role of FPPR region and 630 loop order/disorder on S-protein stability and structural rearrangement. In this study, we performed classical molecular dynamic simulations to study the effect of D614G mutation and 630-loop It also points out the importance of careful modeling of Spike structures in the upcoming emerging variants. Notably, understanding the molecular basis and consequence of mutation is crucial for vaccine and antiviral drug development. The wild-type S-D614 Cryo-EM structure (PDB:6ZGE) 35 was used as the initial model for all simulations. As it has a 2.6 Å resolution and only lacks small number of residues per protomer. The missing residues, namely 71-75, 677-688 and 941-943 was modelled using modeller9.19 program. In addition, the 630 loop was partly resolved with missing residues between 618 and 632 only. Consequently, this missing region Protonation of D614 and mutation to G614 was performed using CHARMM-GUI 41 . Similar to our previous study 18 N-glycans and 1 O-glycan were added per protomer, based on previous experimental and computational study [37] [38] 42 . The full list of added glycans can be found in figure supplement 2 of our previous study 37 . In addition, 14 bisulfide bonds were modelled in each protomer. CHARMM-GUI was also used to make All simulations were performed using GENESIS 2.0 Beta MD software on Fugaku supercomputer [43] [44] , with an overall average performance of 55 ns/day using 128 nodes. Two independent simulations for 1 each was performed for each system with a total simulation time of 8 . Protein and glycans were parametrized using CHARMM 36M force field, while CHARMM TIP3P was used for water molecules 45 . All simulations were first minimized for 5000 steps, while applying positional restraints of 1 kcal mol -1 Å -2 on protein backbone and weak restraint of 0.1 kcal mol -1 Å -2 on all heavy atoms. Then all systems were gradually heated to 310 K using Velocity Verlet integrator and stochastic velocity rescaling thermostat [46] [47] [48] . Subsequently, all simulations were equilibrated in a step manner: 1) in NVT ensemble using same integrator and thermostat, 2) in NPT ensemble with stochastic velocity rescaling thermostat and MTK barostat 47 ; 3) in NVT upon after removing all restraints and, 4) with multiple time-step integrator (MTS) with time step of 2.5 fs and 5 fs for the fast and slow motion, respectively [49] [50] . Finally, a production run was performed in NVT using MTS and stochastic velocity rescaling thermostat. The first 200 ns of the production run was generally excluded from analysis and considered part of the equilibration process. smooth particle mesh Ewald (SPME) 51 was used to compute Electrostatic interactions with 128 ´ 128 ´ 128 grids and the 6 th -order B-spline function. The group-based approach was used to evaluate temperature where the thermostat was applied every 10 steps 49 . Bonds involving hydrogens and water molecule were constrained with SHAKE/RATTLE algorithm 52 . Analysis of simulation trajectories was performed using GENSIS 1.6 analysis tools. PCA analysis were performed using all available Cryo-EM structures of Spike Down conformation in the Protein Data Bank that was release by end of September 2021.Structures that include other molecules such as neutralizing antibodies were excluded from the selection, to avoid any induced conformational changes due to binding. Also structures that lack resolution at region involved in coarse grained model preparation were also excluded. A total of 52 Spike structure were selected for the analysis including 156 protomer. Similar to our previous work, we represent all selected structures using a coarse-grained beads model, while we used 11 beads instead of 9 per protomer to include better represent S2. As previous, our model is consisted of two beads for RBD (core and top), 3 beads for NTD (core, base and top), one bead for SD1 and SD2 each and finally 4 beads for S2. We performed PCA analysis of both monomeric 11 beads and trimeric 33 beads models using the selected 156 and 52 structures, respectively. S1 root mean square analysis (RMSD) were performed using only Cα atoms, upon fitting S2 (residue 689-827 and 854-1134). Likewise, only rigid regions in RBD (328-444, 462-468, 489-501, 503-533) and NTD (14-69, 80-143, 165-172, 186-245, 263-306) were used for the analysis. Stride in VMD was used to assign secondary structure in the 630-loop region. Average residue_residue contact was also calculated with iTRAj plugin in VMD 40 . Electrostatic potential was performed with ABPS in PYMOL 53 . Solvent accessible surface area calculations were also calculated using PYMOL with a probe radius of 7.2 Å. RBD_RBD distance, hinge angle and SASA analysis were performed using the last 200 ns of the simulation to characterize changes after Spike conformational shift. All structural figures were made using PYMOL 53 . While VMD program was used for trajectories visualization 40 . Contact analysis is defined as any atom distance less than 2.5 Å, where the selected pairs are based on a 30% occurrence threshold in any protomer. The first 200 ns of the simulations were excluded from hydrogen bonding and contact analysis. Effect of natural mutations of SARS-CoV-2 on spike structure, conformation, and antigenicity The Spike D614G mutation increases SARS-CoV-2 infection of multiple human cell types Escape from neutralizing antibodies by SARS SARS-CoV-2 variants, spike mutations and immune escape Membrane fusion and immune evasion by the spike protein of SARS-CoV-2 Delta variant Antibody resistance of SARS-CoV-2 variants B.1.351 and B.1.1.7 SARS-CoV-2 Omicron strain exhibits potent capabilities for immune evasion and viral entrance Tracking Changes in SARS-CoV-2 Spike: Evidence that D614G Increases Infectivity of the COVID-19 Virus SARS-CoV-2 spike-protein D614G mutation increases virion spike density and infectivity Spike mutation D614G alters SARS-CoV-2 fitness The Impact of Mutations in SARS-CoV-2 Spike on Viral Infectivity and Antigenicity Complete Mapping of Mutations to the SARS-CoV-2 Spike Receptor-Binding Domain that Escape Antibody Recognition Structure of SARS-CoV-2 spike protein Coronavirus biology and replication: implications for SARS-CoV-2 Distinct conformational states of SARS-CoV-2 spike protein Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation Cell entry mechanisms of SARS-CoV-2 Structural basis for the recognition of SARS-CoV-2 by full-length human ACE2 Structural and Functional Basis of SARS-CoV-2 Entry by Using Human ACE2 Structural and functional ramifications of antigenic drift in recent SARS-CoV-2 variants Molecular basis of immune evasion by the Delta and Kappa SARS-CoV-2 variants The effect of the D614G substitution on the structure of the spike glycoprotein of SARS-CoV-2 Distant residues modulate conformational opening in SARS-CoV-2 spike protein Molecular Aspects Concerning the Use of the SARS-CoV-2 Receptor Binding Domain as a Target for Preventive Vaccines The virological impacts of SARS-CoV-2 D614G mutation Structural and Functional Analysis of the D614G SARS-CoV-2 Spike Protein Variant SARS-CoV-2 D614G spike mutation increases entry efficiency with enhanced ACE2-binding affinity Structural impact on SARS-CoV-2 spike protein by D614G substitution Bimodular effects of D614G mutation on the spike glycoprotein of SARS-CoV-2 enhance protein processing, membrane fusion, and viral infectivity A pH-dependent switch mediates conformational masking of SARS-CoV-2 spike SARS-CoV-2 and bat RaTG13 spike glycoprotein structures inform on virus evolution and furin-cleavage effects Controlling the SARS-CoV-2 spike glycoprotein conformation Developing a Fully Glycosylated Full-Length SARS-CoV-2 Spike Protein Model in a Viral Membrane VMD: Visual molecular dynamics Simulations Using the CHARMM36 Additive Force Field Site-specific glycan analysis of the SARS-CoV-2 spike New parallel computing algorithm of molecular dynamics for extremely huge scale biological systems GENESIS 1.1: A hybrid-parallel molecular dynamics simulator with enhanced sampling algorithms on multiple computational platforms CHARMM36m: an improved force field for folded and intrinsically disordered proteins Canonical sampling through velocity rescaling Isothermal-isobaric molecular dynamics using stochastic velocity rescaling Kinetic energy definition in velocity Verlet integration for accurate pressure evaluation Group-based evaluation of temperature and pressure for molecular dynamics simulation with a large time step Reversible multiple time scale molecular dynamics A smooth particle mesh Ewald method Settle: An analytical version of the SHAKE and RATTLE algorithm for rigid water models The PyMOL Molecular Graphics System