key: cord-0805340-4vz155ku authors: Roy, Urmi title: Comparative Structural Analyses of Selected Spike Protein-RBD Mutations in SARS-CoV-2 Lineages date: 2021-06-23 journal: bioRxiv DOI: 10.1101/2021.06.23.449639 sha: 03b549609f3f76da1a14cbb4263a832e9a62d500 doc_id: 805340 cord_uid: 4vz155ku
The severity of COVID-19 has been observed throughout the world as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spread globally, claiming more than 2 million lives and leaving a devastating impact on people's lives. Recently, several virulent mutant strains of this virus, such as the B.1.1.7, B.1.351, and P1 lineages, have emerged. These strains are predominantly observed in the UK, South Africa, and Brazil. Another extremely pathogenic lineage, B.1.617, and its sub-lineages, first detected in India, are now affecting some countries at notably higher spread rates. This paper computationally examines the time-based structures of the B.1.1.7, B.1.351, and P1 lineages with selected spike protein mutations. Additionally, the mutations in the more recently found B.1.617 lineage and some of its sub-lineages are explored, and the implications of multiple point mutations in the spike protein's receptor-binding domain (RBD) are described.
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has claimed more than 2 million lives and left a devastating global impact since March 2020 [1]. In recent months, several variants of this life-threatening virus have emerged with greater spread rates, adaptability, and fitness. Among these virulent lineages, B.1.1.7, B.1.351, and P1 were initially detected in the UK, South Africa, and Brazil, respectively [2]. This nomenclature is based on the Pango lineage system [3].
As summarized in Table 1, these "Variants of Concern" have several notable mutations, some of which are common among the three lineages. The table also includes the highly pathogenic B.1.617 lineage and some of its sub-lineages, which were first found in India more recently and are already included in the CDC's "Variants of Interest" [4]. The delta variant (also known as B.1.617.2) was listed under the CDC's "Variant of Concern" on June 15th, 2021. The present work focuses on a set of comparative structural analyses of these new SARS-CoV-2 variants. Computational structural biology is rapidly becoming an integral part of applied immunology, as this field continues to aid the understanding of the structural basis of proteins and thus plays a key role in the development of preventive drug designs [5, 6]. Since the beginning of the COVID-19 pandemic, the availability of experimental results about the structure/function, epidemiological distribution, and mutational fitness of this novel pathogen has been very limited in the commonly available literature. As a result, scientists have relied heavily on simulation-based tools and strategies to investigate this virus. In this regard, computational tools of immunoinformatics can be particularly useful to investigate such evolving infectious pathogens and host-pathogen interactions [7-9]. Our present effort is guided by these considerations. In our previous papers, we have analyzed several biologically relevant protein structures and mutant models of the angiotensin peptide coordinated to the Zn-bound ACE2 receptor; more recently, we have reported a model structure of the SARS-CoV-2 N501Y variant [10-15]. In this current investigation of SARS-CoV-2 lineages, we examine the implications of multiple point mutations in their spike RBD.
In particular, we measure the structural and conformational variations of these mutant variants as functions of time and demonstrate how structural change corresponds to their functions.
Table 1
Lineage | Spike mutations | Selected RBD mutations
B.1.1.7 [2] | 69/70 deletion, 144Y deletion, E484K*, S494P*, N501Y, A570D, D614G, P681H | E484K, S494P, N501Y
B.1.351 [2] | K417N, E484K, N501Y, D614G | K417N, E484K, N501Y
P1 [2] | K417T**, E484K, N501Y, D614G | K417T, E484K, N501Y
B.1.617 [4] | L452R, E484Q, D614G | L452R, E484Q
B.1.617.1 [4] | T95I*, G142D, E154K, L452R, E484Q, D614G, P681R, Q1071H |
B.1.617.3 [4] | T19R, G142D, L452R, E484Q, D614G, P681R, D950N |
Variant with possible mutations [16] | L452R, Y453F | L452R, Y453F
* Found in some sequences.
** Initially observed as K417N/T; later this mutation was identified as K417T in the P1 lineage.
The SARS-CoV-2 genome encodes non-structural proteins (NSPs) as well as structural proteins. Structural illustrations of SARS-CoV-2 and its genome structure are schematically presented in Fig. 1a-b. Though it has some structural similarities with SARS and MERS, as observed in their sequence similarities, the fast transmissibility and adaptability of the highly pathogenic SARS-CoV-2 are rather unique. The SARS-CoV-2 spike (S) protein is ~1273 amino acids (AAs) long and contains the S1 (~14-685 AAs) and S2 (~686-1273 AAs) subunits. At the start of this structure, a small N-terminal signal peptide (~1-13 AAs) is present. Next comes the S1 subunit, which mediates receptor binding and comprises two domains, the N-terminal domain (NTD; ~14-305 AAs) and the receptor-binding domain (RBD; ~319-541 AAs) [17]. The mutations discussed in this report are centered on the recently found variants of the S1 RBD. We used the spike protein RBD for the first set of simulations, selecting chain E of PDB entry 6M0J (6M0J:E) as the wild-type (wt) SARS-CoV-2 S1 RBD [18]. Starting from this native structure (6M0J:E), the mutant variants were generated using the Mutator GUI of Visual Molecular Dynamics (VMD) [19].
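The relationship between the full spike mutation lists and the RBD subsets in Table 1 can be illustrated with a short Python sketch. It parses substitution labels such as N501Y and keeps those whose residue number falls inside the approximate RBD range (~319-541 AAs) quoted above. The function names (`parse_mutation`, `rbd_subset`) and the choice to skip deletions are assumptions made for illustration; this is not the procedure used to build the mutant models, which were generated in VMD.

```python
import re
from typing import List, NamedTuple

# Approximate RBD boundaries quoted in the text (inclusive residue numbers).
RBD_RANGE = (319, 541)

# One-letter wild-type residue, residue number, one-letter mutant residue.
_SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")


class PointMutation(NamedTuple):
    wild_type: str
    position: int
    mutant: str


def parse_mutation(label: str) -> PointMutation:
    """Parse a label such as 'N501Y'; trailing footnote asterisks are ignored."""
    cleaned = label.strip().rstrip("*")
    match = _SUBSTITUTION.fullmatch(cleaned)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {label!r}")
    return PointMutation(match.group(1), int(match.group(2)), match.group(3))


def rbd_subset(labels: List[str]) -> List[str]:
    """Return the substitutions whose position lies inside RBD_RANGE.

    Labels that are not simple substitutions (for example deletions)
    are skipped rather than treated as errors.
    """
    selected = []
    for label in labels:
        try:
            mutation = parse_mutation(label)
        except ValueError:
            continue
        if RBD_RANGE[0] <= mutation.position <= RBD_RANGE[1]:
            selected.append(f"{mutation.wild_type}{mutation.position}{mutation.mutant}")
    return selected


b117 = ["69/70 deletion", "144Y deletion", "E484K*", "S494P*",
        "N501Y", "A570D", "D614G", "P681H"]
print(rbd_subset(b117))
```

For the B.1.1.7 list, only E484K, S494P, and N501Y fall within the quoted range, which agrees with the RBD entry for that lineage in Table 1; A570D, D614G, and P681H lie in S1 but outside the RBD.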
In total, we ran simulations for the mutations listed in Table 1: one wt S1 RBD and five S1-RBD variants with selected mutations. These simulations used the Nanoscale Molecular Dynamics (NAMD), QwikMD, and Visual Molecular Dynamics (VMD) software programs [19-21]. After completing the initial protocols (minimization/annealing and equilibration), the MD simulation was continued for 30 ns. The integration time step was 2 fs for all procedures. All protocols used the Generalized Born solvent-accessible surface area implicit solvation model [22]. For annealing and equilibration, the backbone atoms were restrained, but no atoms were restrained during the final simulation. The temperature was maintained at 300 K during the final simulation using Langevin dynamics. The details of the simulation protocols are described elsewhere [15]. The proteins' 3D models were set up using Biovia's Discovery Studio Visualizer [23].
Fig. 1a shows a generalized schematic of the wt SARS-CoV-2 structure, where the S, E, M, and N proteins and the viral RNA, as well as the two subunits S1 and S2 of the S protein, are identified. The mutations considered in this report are found in the RBD, within the S1 subunit of the S protein. The RMSD and RMSF plots for the wt RBD structure were reported previously [15]; they are included in Figure 3a and c for comparison. Figure 3d shows the number of hydrogen bonds over the simulation time; in none of the cases considered does this number vary significantly. The stable mutation at residue 452 may form a stronger complex with ACE2. The Y453F mutation (Figure 5e') also exhibits stronger stability with time. The wt E484 residue has been recognized as a "repulsive residue" in the RBD-ACE2 complex [24].
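As an illustration of the two quantities plotted in Figure 3, the following is a minimal Python sketch of RMSD (deviation of one frame from a reference) and RMSF (per-atom fluctuation about the time-averaged position), together with the step count implied by a 30 ns run at a 2 fs time step. It assumes the trajectory frames have already been superimposed on the reference, and the function names and toy coordinates are hypothetical; the analyses in this work were carried out with the simulation software cited above, not with this code.

```python
import math
from typing import List, Sequence, Tuple

Coordinate = Tuple[float, float, float]
Frame = Sequence[Coordinate]


def rmsd(frame: Frame, reference: Frame) -> float:
    """Root-mean-square deviation between two pre-aligned frames."""
    if len(frame) != len(reference):
        raise ValueError("frames must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(frame))


def rmsf(frames: Sequence[Frame]) -> List[float]:
    """Per-atom root-mean-square fluctuation about the mean position."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    fluctuations = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared displacement of atom i from that average.
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        fluctuations.append(math.sqrt(msd))
    return fluctuations


def number_of_steps(total_ns: float, timestep_fs: float) -> int:
    """Integration steps needed to cover total_ns (1 ns = 1,000,000 fs)."""
    return round(total_ns * 1_000_000 / timestep_fs)


# Toy trajectory: atom 0 is fixed, atom 1 oscillates along x.
toy_frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
]
print(rmsf(toy_frames))
print(rmsd(toy_frames[1], toy_frames[0]))
print(number_of_steps(30, 2))
```

In this picture, a residue described as stable is one whose atoms show a low RMSF over the trajectory, while a flat RMSD trace indicates that the overall structure stays close to its starting conformation.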
Since the mutations in the E484K/Q residue are particularly stable in the P1 and B.1.617 strains (Figure 4c' and 4d
According to the results presented here, the mutant RBD variants of B.1.617 (as well as some of its sub-lineages), P1, and the potential variant with two possible mutations are the most stable forms. Among the mutations studied in this work, L452R, Y453F, E484Q, and S494P are fairly stable. N501Y does not show significant variations during the simulation timescale. The E484K mutation within the P1 strain is also rather stable. Since these newly found lineages are more transmissible than their predecessors, variants carrying some of these mutations may escape antibody neutralization and cellular immunity. In fact, some of the variants with the mutations K417N/T, E484K, L452R, and Y453F are recognized as antibody-neutralization escape mutants [16, 25]. The stable mutations identified here within the highly infective lineages may help to further understand the associated antibody cross-reactivity and may also facilitate the task of designing effective inhibitors. The enhanced stabilities of some of the mutant residues, as found here for these newer variants, may have implications for future vaccine development to combat other impending strains and pathogenic variants of SARS-CoV-2.
SARS-CoV-2 Variant Classifications and Definitions
A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology
SARS-CoV-2 Variant Classifications and Definitions
Molecular Interactions Between Innate and Adaptive Immune Cells in Chronic Lymphocytic Leukemia and Their Therapeutic Implications
Crystal structure of the complex between actin and human vitamin D-binding protein at 2.5 A resolution
High affinity targets of protein kinase inhibitors have similar residues at the positions energetically important for binding
Interaction of local anesthetics with the K(+) channel pore domain: KcsA as a model for drug-dependent tetramer stability
Exploring the role of amino acid-18 of the leucine binding proteins of E. coli
Structural Characterizations of the Fas Receptor and the Fas-Associated Protein with Death Domain Interactions
Structural modeling of tumor necrosis factor: A protein of immunological importance
3D Modeling of Tumor Necrosis Factor Receptor and Tumor Necrosis Factor-bound Receptor Systems
Structural and molecular analyses of functional epitopes and escape mutants in Japanese encephalitis virus envelope protein domain III
Modeling Substrate Coordination to Zn-Bound Angiotensin Converting Enzyme 2
L452R and Y453F SARS-CoV-2 mutations increase transmission and evade immunity
Structural and functional properties of SARS-CoV-2 spike protein: potential antivirus drug development for COVID-19
Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor
VMD: Visual molecular dynamics
Scalable molecular dynamics with NAMD
QwikMD - Integrative Molecular Dynamics Toolkit for GPU/CPU
Algorithm for Generalized Born/Solvent-Accessible Surface Area Implicit Solvent Calculations
Discovery Studio Modeling Environment
Key residues of the receptor binding domain in the spike protein of SARS-CoV-2 mediating the interactions with ACE2: a molecular dynamics study
Complete map of SARS-CoV-2 RBD mutations that escape the monoclonal antibody LY-CoV555 and its cocktail with LY-CoV016
The author acknowledges utilization of the following simulation and visualization software
The author declares no conflict of interest.