key: cord-0947788-wucv77x1 authors: Mori, Takaharu; Jung, Jaewoon; Kobayashi, Chigusa; Dokainish, Hisham M.; Re, Suyong; Sugita, Yuji title: Elucidation of Interactions Regulating Conformational Stability and Dynamics of SARS-CoV-2 S-Protein date: 2021-01-21 journal: Biophys J DOI: 10.1016/j.bpj.2021.01.012 sha: b67ecd035baecade4b37dd277adf554d868b67ad doc_id: 947788 cord_uid: wucv77x1 Ongoing COVID-19 pandemic caused by new coronavirus, SARS-CoV-2, calls for urgent developments of vaccines and antiviral drugs. The spike protein of SARS-CoV-2 (S-protein), which consists of trimeric polypeptide chains with glycosylated residues on the surface, triggers the virus entry into a host cell. Extensive structural and functional studies on this protein have rapidly advanced our understanding of the S-protein structure at atomic resolutions, while most of structural studies overlook the effect of glycans attached to S-protein on the conformational stability and functional motions between the inactive Down and the active Up forms. Here, we performed all-atom molecular dynamics (MD) simulations of both Down and Up forms of a fully glycosylated S-protein in solution as well as targeted MD (TMD) simulations between them to elucidate key inter-domain interactions for stabilizing each form and inducing the large-scale conformational transitions. The residue-level interaction analysis of the simulation trajectories detects distinct amino-acid residues and N-glycans as determinants on conformational stability of each form. During the conformational transitions between them, inter-domain interactions mediated by glycosylated residues are switched to play key roles on the stabilization of another form. Electrostatic interactions as well as hydrogen bonds between the three receptor binding domains work as driving forces to initiate the conformational transitions toward the active form. This study sheds light on the mechanisms underlying conformational stability and functional motions of S-protein, which are relevant for vaccines and antiviral drugs developments. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused coronavirus disease 2019 (COVID-19) (1) (2) (3) . The rapid spreading of this virus infection since December of 2019 has driven researchers in both academic institutes and pharmaceutical companies to develop vaccines and antiviral drugs in the earliest demand (4, 5) . SARS-CoV-2 is an enveloped RNA virus belonging to the family Coronaviridae. The spike protein (S-protein) protruding from the envelope mediates the virus entry into a host cell and therefore it is one of the primary targets for vaccines and antiviral drugs developments. S-protein is a trimeric protein and each monomer consists of S1 and S2 subunits responsible for host-cell receptor binding and membrane fusion, respectively (6) (7) (8) . Sequence similarity between SARS-CoV and SARS-CoV-2 S-proteins (sharing 76% amino acid sequence) (3, 9, 10) helps to prompt screening of already approved drugs by repositioning and to design an effective vaccine, while no prominent success has been reported yet. Structural information of a protein targeted in vaccines and antiviral drugs developments is generally essential. Until now, a large number of high-resolution X-ray and cryo-EM structures of SARS-CoV-2 proteins have been accumulated rapidly in RCSB Protein Data Bank (PDB) (11). 369 structures are deposited to the "COVID-19/SARS-CoV-2 Resources" in PDB as of Sep 4. They include S-proteins both at the prefusion and postfusion states (9, [12] [13] [14] [15] [16] , and S-protein bound to the peptidase domain of angiotensin-converting enzyme 2 (ACE2) receptor (17) (18) (19) (20) (21) . Cryo-EM structures show that S-protein at prefusion state takes, at least, two possible conformations: the Down form with three receptor binding domains (RBDs) buried at the interface and the Up form with one of RBDs protruding from the interface (9). The atomic structures of S-protein bound to human ACE2 receptor indicate that Up form plays a prominent role in the binding (18, 22) . Furthermore, cryo-EM structure of a human antibody bound to SARS-CoV-2 S-protein suggests that more than one RBD can take Up forms to bind a variable domain of antibody, individually (17) . A determinations show a possibility to alter the conformational property of S-protein by modifying distinct N-glycans (27, 28) . Intrinsic dynamic nature of SARS-CoV-2 as well as structural/functional roles of the glycosylation can be overlooked just by structural studies. Molecular dynamics (MD) simulations starting from the atomic structures have opportunities to give complementary information to experimental studies. So far, there exist several reports about modeling of a fully glycosylated full-length SARS-CoV-2 S-protein with palmitoylation at Cys1236 and Cys1241 anchored on a biological membrane (28) (29) (30) (31) . Although a trimeric structure of SARS-CoV-2 S-protein consisting of S1 and S2 subunits is a large biomolecular complex, MD-specific supercomputer, Anton2, achieved multiple 10-s simulations starting from the active and inactive forms (32, 33). These simulations provide atomic pictures of a glycan shield hiding the S-protein surface, and further predict a passive role of distinct N-glycans to stabilize the active form (30) . Thorough analysis of N-glycan heterogeneity at the S-protein surface suggests possible roles of the glycans in modulating the immune response to SARS-CoV-2 virus (28, 31) . Large-scale simulations of four fully glycosylated full-length S-proteins anchored on a lipid bilayer were carried out to map SARS-CoV-2 epitopes not shielded by N-glycans (15, 34) . MD simulations also provide structural and energetic details of the interaction between S-protein and human ACE2 receptor, suggesting that balanced hydrogen bonding and hydrophobic contacts make SARS-CoV-2 a stronger binder to human ACE2 than SARS-CoV (35) (36) (37) (38) (39) . In this study, we also performed all-atom MD simulations starting from the inactive Down and the active Up forms of a fully glycosylated SARS-CoV-2 S-protein in solution as well as targeted MD (TMD) simulations linking two forms. Our focus is to elucidate key inter-domain interactions that regulate conformational stability of each forms by taking into accounts of their dynamic structures. We also ask how the key interactions changes in the conformational transitions between the two forms. Although similar questions were already raised by the previous studies, no systematic analysis has been made using a fully glycosylated SARS-CoV-2 S-protein, to our best knowledge. To answer these questions, atomic interactions involving N-glycans are essential so that we carefully analyze atomic-level inter-domain interactions in our simulation trajectories, including the effect of glycosylation. This study will add further mechanistic insight into the highly dynamic nature of S-protein and can help designing vaccines and antiviral drugs on COVID-19. (upstream helices (UH), fusion peptide (FP), heptad repeats (HR1, HR2), central helix (CH), and transmembrane domain (TM)) reside in S2 subunit. RBD binds to ACE2 receptor to initiate the virus attachment to a host cell surface, which is followed by the virus-host cell membrane fusion mediated by S1/S2 (upstream of FP) cleavage and the large conformational changes in S2 subunit. Cryo-EM studies reported the two possible conformations of RBDs, namely, Down (PDBID: 6VXX) (9) and Up (PDBID: 6VSB) (16) . Our simulations started from the solvated models of either Down or Up form, which are available in CHARMM-GUI COVID-19 Archive (http://www.charmm-gui.org/docs/archive/covid19) ( Figure 1B ) (29) . The detailed modeling procedure can be found in the original paper by Woo and co-workers. (29) In brief, the models include N-terminus to the residue right before HR-linker in S2 subunit (residues 1-1146 (40, 41) The long N-terminal region (residues and the rest of missing parts were remodeled based on the electron density map by ISOLDE software package. (42) The models introduce 15 disulfide bonds (10 and 5 in S1 and S2 subunit, respectively) involving Cys480-Cys488 and Cys840-Cys851 in the missing loop region whose existence was suggested during the fit to the density map. All histidine residues were modeled as singly protonated on ND1 (HSD). Other titratable residues were modeled as a standard state. These We also performed targeted MD (TMD) simulations (55) , which give one of the available transition pathways between the starting and target structures by applying external forces. The external force is given to hold the root mean square deviation (RMSD) from the target structure to a certain value at each time step in the MD simulation. The constraint is calculated using the mass-weighted RMSD of all the heavy atoms of three protein chains except for N-glycans. Therefore, this TMD is useful to examine whether the glycan-protein interactions observed in the conventional MD simulations depend on the initial computational models of N-glycans predicted by In the four MD simulations, overall structures of SARS-CoV-2 S-protein were stable. No transitions between Down and Up forms were observed in the short time scales. Three RBDs were generally stable as rigid domains in the Down-form MD simulations (MD1_Down ( Figure S2A ) and MD2_Down ( Figure S3A) ). In MD1_Down, RBD A moved slightly toward the Up form ( Figure S2A ). Larger RMSDs of both RBDs are observed in MD1_Up ( Figure S2B ) and MD2_Up ( Figure S3B ). In contrast, NTDs in Down and Up forms are almost equally rigid ( Figures S2 and S3 ). In MD1_Up, rigid domain movements of NTD B and NTD C were observed around 400 ns and 650 ns, respectively (RMSD ~ 10 Å). Therefore, we mainly focus on the last 500 ns of MD1_Up and MD1_Down, which are considered to be fully equilibrated. In Figure The rest of interactions provide hydrophobic contacts to support the Up conformation. Interestingly, Y489 and Q493 switched the interaction pairs before and after the conformational transition ( Figure 4A , bottom). In Figure 3B , we observed many inter-domain contacts and hydrogen bonds between residue pairs and residue-glycan pairs between RBD and NTD. In particular, E465-N234 and E516-Y200 pairs include side-chain contacts as well as hydrogen bonds. Some of N-glycans interact with multiple sidechains, likely adding their hydrophobic interactions to the stabilization. In particular, a high-mannose-type N-glycan at N234 strongly interact with S459 and K462, while the glycan at N165 seems to loosely contact with various residues ( Figure 4B , top). In Up form, most of contacts and hydrogen bonds in RBD A -NTD interface are lost. Loose contacts between N165 and RBD A and also hydrogen bonds between H519 and N234 play an important role ( Figure 4B , bottom). The high-mannose-type N-glycan at N234 intrudes a cavity between NTD B and RBD B and interacts with D985 and R983 in S2 B , and also T415 in RBD C ( Figure 3C ). Interestingly, most of the contacts and hydrogen bonds between RBD B and NTD C as well as those between RBD C -NTD A are lost in Up form. This suggests the increased flexibility of RBD B , RBD C , and three NTDs after taking Up form of RBD A , which is consistent with our observations of RMSFs ( Figure 2A ). Movement of the N-glycans will be further discussed in the analysis of TMD. The Down form contains many contacts among three NTDs and S2 subunit as shown in Figure 3D . The contacts mainly consist of residue-residue interactions. In particular, hydrogen bonds J o u r n a l P r e -p r o o f between S383 and R983 or between S383 and D985 seem to be dominant interactions ( Figure 4C , top). Compared to the interactions between RBDs ( Figure 3A ) or between RBDs and NTDs ( Figure 3B ), electrostatic interactions involving Glu, Asp, and Arg seem to be important in the domain interfaces between RBDs and S2 subunit. All the contacts and hydrogen bonds between RBD A and S2 disappear in Up form ( Figure 4C , bottom), while some remain between RBD B /RBD C and S2 subunit. Alternatively, R983 and D985 in S2 interact with the glycan at N234 (Figure 4B, bottom) . As we see in Figure 3B , RBD A -NTD B interactions are mainly stabilized through the glycans at N165 and N234. In addition, the intruded N234 mediates the interaction between RBD A and S2 subunit. These interactions likely contribute to the stabilization of Up form. Detailed interactions between amino-acid residues and glycosylated amino-acid residues N165, N234, and N343 in MD1_Down/Up are illustrated in Figure S5 . We could see that most interaction pairs shown in Figure 3 are stable over the last 500 ns, demonstrating a good convergence of our simulations. Here we examine TMD simulation trajectories to answer how the detected inter-domain interactions changes from Down to Up or from Up to Down transitions. The analysis is worth to understand the interactions related to the N-glycans, since the previous analysis depends on the starting structures of N-glycans in MD simulations. In TMD simulations starting from Down to Up or vice versa, we added restraints only to the protein heavy atoms, allowing the N-glycans to move freely ( Figure S9 ). Figure 5 shows the minimum distances of the inter-domain amino-acid residue and residue-glycan pairs in TMD3_ToUp (see Figure S10 for all trajectories). The two amino-acid The analysis of inter-domain contacts and hydrogen bonds suggests the importance of charged amino-acid residues and their interactions. Figure 7 shows the surface charge distributions of RBDs at 500 ns of MD1_Down simulation using the APBS tool in PyMOL software (56, 57) . In the top view of RBDs ( Figure 7A ), we found that the surface of the central cavity formed by three RBDs is positively charged, which may present repulsive forces. The positive charges mainly result from sidechain of K417, R408, and K378 as we see in the side view ( Figure 7B ). There exists negatively charged regions near K378, while they result from the exposed backbone carbonyl groups in F374 as well as S374 and do not interact with the other RBD surfaces. D405 is close to R408, while the sidechain of D405 is not exposed to the RBD interface. In Up form, K378 in RDB can interact with Q493 in another nearby RBD (Figures 3 and 4 ). The loop structure including Q493 is flexible (Figure 2 ). This loop also includes the negatively Finally, we discuss about the conformational changes of S-protein between Down and Up forms. We performed six TMD simulations between the two states and observed the structural changes between them. In TMD simulations, we add the RMSD restraint forces to protein heavy atoms and force the protein toward the target structure. As shown in Figure S9 and Up form will be discussed in elsewhere. Based on the fully glycosylated SARS-CoV-2 spike protein structural models, we performed -K378 D405-S375 D405-T376 T415-S383 Y421-T385 L455-S383 F456-S383 Y473-T385 S477-K529 S477-T385 Y489-T385 Q493-C379 Q493-K378 G502- A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster A new coronavirus associated with human respiratory disease in China A pneumonia outbreak associated with a new coronavirus of probable bat origin Research and Development on Therapeutic Agents and Vaccines for COVID-19 and Related Human Coronavirus Diseases Identification of Antiviral Drug Candidates against SARS-CoV-2 from FDA-Approved Drugs Molecular structure analyses suggest strategies to therapeutically target SARS-CoV-2 Cell entry mechanisms Thanki Structures and distributions of SARS-CoV-2 spike proteins on intact virions Controlling the SARS-CoV-2 spike glycoprotein conformation Distinct conformational states of SARS-CoV-2 spike protein In situ structural analysis of SARS-CoV-2 spike reveals flexibility mediated by three hinges Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation Structures of Human Antibodies Bound to SARS-CoV-2 Spike Reveal Common Epitopes and Recurrent Features of Antibodies Structural Basis for Potent Neutralization of Betacoronaviruses by Single-Domain Camelid Antibodies Exploitation of glycosylation in enveloped virus pathobiology Vulnerabilities in coronavirus glycan shields despite extensive glycosylation Site-specific glycan analysis of the SARS-CoV-2 spike Identification of Protective Lassa Virus Epitopes That Are Restricted by HLA-A2 Glycans on the SARS-CoV-2 Spike Control the Receptor Binding Domain Conformation Virus-Receptor Interactions of Glycosylated SARS-CoV-2 Spike and Human ACE2 Receptor Spike Protein Model in a Viral Membrane Anton 2: Raising the Bar for Performance and Programmability in a Special-Purpose Molecular Dynamics Supercomputer Map of SARS-CoV-2 spike epitopes not shielded by glycans Enhanced receptor binding of SARS-CoV-2 through networks of hydrogen-bonding and hydrophobic interactions The SARS-CoV-2 Exerts a Distinctive Strategy for Interacting with the ACE2 Human Receptor On the interactions of the receptor-binding domain of SARS-CoV-1 and SARS-CoV-2 spike proteins with monoclonal antibodies and the receptor ACE2 Effect of mutation on structure, function and dynamics of receptor binding domain of human SARS-CoV-2 with host cell receptor ACE2: a molecular dynamics simulations study Computational Prediction of Mutational Effects on SARS-CoV-2 Binding by Relative Free Energy Calculations The FALC-Loop web server for protein loop modeling GalaxyTBM: template-based modeling by building a reliable core and refining unreliable local regions ISOLDE : a physically realistic environment for model building into low-resolution electron-density maps Update of the CHARMM All-Atom Additive Force Field for Lipids: Validation on Six Lipid Types Optimization of the Additive CHARMM All-Atom Protein Force Field Targeting Improved Sampling of the Backbone ϕ, ψ and Side-Chain χ 1 and χ 2 Dihedral Angles CHARMM36m: an improved force field for folded and intrinsically disordered proteins A smooth particle mesh Ewald method Reversible multiple time scale molecular dynamics New spherical-cutoff methods for long-range forces in macromolecular simulation GENESIS 1.1: A hybrid-parallel molecular dynamics simulator with enhanced sampling algorithms on multiple computational platforms GENESIS: A hybrid-parallel and multi-scale molecular dynamics simulator with enhanced sampling algorithms for biomolecular and cellular simulations Targeted Molecular Dynamics Simulation of Conformational Change-Application to the T ↔ R Transition in Insulin Gaussian Accelerated Molecular Dynamics: Unconstrained Enhanced Sampling and Free Energy Calculation Gaussian Accelerated Molecular Dynamics in NAMD Replica-exchange molecular dynamics method for protein folding This research used computer resources of Fugaku in RIKEN Center for Computational Science (G9330003). The computer resources of Oakforest-PACS were also provided through HPCI System