key: cord-0967996-cvtajxrk authors: Sharma, Shilpa; Deep, Shashank title: pH Effect on the Dynamics of SARS-CoV-2 Main Protease (Mpro) date: 2020-11-30 journal: bioRxiv DOI: 10.1101/2020.11.30.404384 sha: 0ac3b15c604366c02f2cc0f2b6f0b1989070a3c5 doc_id: 967996 cord_uid: cvtajxrk The SARS-CoV-2 main protease (Mpro) is a crucial enzyme responsible for the maturation of novel coronavirus, thus it serves as an excellent target for drug discovery. SARS-CoV-2 is found to have similarity with SARS-CoV, which showed conformational changes upon varying pH. There is no study till date on how pH change affect the conformtional flexibilty of SARS-CoV-2 Mpro, therefore, we attempt to find the effect of pH variation through constant pH molecular dynamics simulation studies. Protein is found to be most stable at neutral pH and as pH turns basic protein structure becomes most destabilized. Acidic pH also tends to change the structural properties of Mpro. Our study provides evidence that the flexibility of Mpro is pH dependent like SARS-CoV Mpro. COVID-19 is a severe respiratory illness caused due to novel coronavirus, SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2). 1 Coronaviruses belong to Nidovirales family, these are enveloped, positive-sense RNA viruses, having very long genome around 30000 bases. 2 10 kilobases are used for structure and accessory functions and 20 kilobases encode for non-structural proteins. 2 The replicase gene encodes for the two overlapping polyproteins pp1a and pp1ab, which are critical for replication and transcription. These polyproteins are processed extensively by virus proteinases and the functional proteins are released for carrying out the virus replication. 3 M pro also referred as 3CL pro , is a 33 kDa cysteine protease responsible for cleaving and processing of viral polyproteins. The structure of SARS-CoV-2 M pro free enzyme is recently solved by Zhou et al 4 at pH 7. M pro is a homodimeric protein, shown in Fig. 1 . Structure of each monomer is comprised of three do-mains. Residues 10-99, 100-182 and 198-303 are involved in the formation of domains 1,2 and 3, respectively. Dimerization of M pro is regulated by domain 3. Catalytic dyad of the enzyme is formed by residues His 41 and Cys145. Dimerization is necessary for the catalytic activity of M pro . 5,6 Due to its role in viral transcription and replication, M pro serves as an attractive target for drug designing. Many studies have been carried out to identify the potential inhibitors of M pro . 7, 8 Function of a protein is closely related to its structure. As structure perturbs from native conformation, its activity changes and under drastic conditions protein may become functionally inactive. Various extraneous conditions like heat, concentration of like and unlike molecules around and pH of surrounding etc affect the protein structure, therefore, their effects are important to be examined on the protein of interest. Proteins are composed of various amino acids having acidic, basic, aliphatic, polar and non-polar side chains. As pH of the surroundings is changed, the protonation states of these side chains also modify which in turn alters several interactions and give protein a distinct structure. Effect of pH was seen on the structure of SARS-CoV M pro by Jinzhi et al., 9 their study showed that the conformational flexibility of M pro is pH dependent. Henedersen et al. 10 utilized constant pH MD simulations to assess the proton coupled dynamics of various PLpros including SARS-CoV-2. In our knowledge, there is no pH dependant study of SARS-CoV-2 M pro structure has been conducted, therefore, in our study, we have seen the effect of pH alteration, ranging from acidic to basic pH, on the structure of SARS-CoV-2 main protease (M pro ) through constant pH molecular dynamics simulations (CpHMD). All atom constant pH molecular dynamics (CpHMD) simulations 11,12 were performed using AMBER18 13 and amberff10 14 force field was used for deriving the bonded and nonbonded parameters. Initial structure of protein was obtained from RCSB (PDB: 6y2e 4 ) and protein was solvated using TIP3P 15 water molecules inside trucated octahedron box of 18 nm. Charges on the protein were neutralized by adding counter ions. Energy minimization was carried out using 1000 steps of steepest descent followed by 4000 steps of conjugate gradient method, keeping backbone atoms restrained. Temperature of the system was slowly increased from 10 K to 300 K over 400 ps under NVT ensemble, followed by equilibration at constant pressure for 4 ns. Berendsen barostat 16 was used for maintaining constant pressure condition. The study utilized periodic boundary conditions and PME summation 17 for electrostatic calculations.The shake methodology 18 was applied to restrict covalently bonded hydrogen atoms. Time step of 2 femtosecond with 8 A 0 cut off for non-bonded interactions were applied. Convergence of energy, temperature and density were monitored. Parallel to the molecular dynamics simulation, the constant pH method applied in pmemd through Monte Carlo sampling of the Boltzmann distribution of individual protonation states. Distribution of protonation states is influenced by solvent pH, which is set as an external parameter. A total of five CpHMD simulations were run at pH 4, 5, 6, 7 and 8, by varying the solvent pH. All the acidic aminoacids (ASP and GLU) are assumed to be deprotonated and basic aminoacids (ARG, LYS and CYS) to be protonated in this pH range. Protonation states of catalytic HIS 41 and CYS 145 were unchanged, all other twelve histidines present in the protein were set as titrable residues and their protonation states are subjected to change by changing the partial charges on the atoms of titrable residues. Production run was carried out for 100 ns, during production after every 100 steps protonation state change of every titrable residue was attempted. GB salt concentration was set to 0.1 M. Structural changes were seen through root mean square deviation (RMSD) of backbone atoms, which was calculated with respect to the first frame structure. Residue wise fluctuations were seen through root mean square fluctuation (RMSF) which is time average RMSD. Surface area of protein was calculated using LCPO algorithm by Weiser et al. 19 For calculating the native contacts, 7 A 0 cut off was set, active contacts were found to be present if the distance between two C-alpha atoms were within 7 A 0 at any given time frame. Clustering analysis was done using average linkage method, by keeping epsilon equal to 1. The most populated cluster was selected for comparing substrate binding pockets at different pH. Distance between two residues was calculated using distance tool of cpptraj. 20 The change in protonation states bring about local fluctuations in protein which then contribute to overall dynamical changes in protein, therefore to see these changes, root mean square deviation (RMSD) was monitored throghout the simulation, as shown in Fig. 2(a) . Initially, RMSD values increased slowly at all the pH values but a sudden increase in RMSD was observed at pH 8 around 50 ns (1 ns=500 frames), which kept on increasing till 80 ns and became constant after that. For a better understanding, probability distribution curve of each RMSD value was constructed, (Fig. 2(b) ), two peaks were observed around 1.8 A 0 , 2.3 A 0 at pH 4 and around 2.4 A 0 , 2.9 A 0 at pH 8. For pH 5, 6 and 7 RMSD values having maximum probablities were observed at 2.2 A 0 , 1.9 A 0 and 2 A 0 , respectively. From RMSD analysis, it can be seen that at pH 4 and pH 8 protein sampled large number of conformations and fluctuations were maximum at pH 8, minimum at pH 7. Contribution of each residue towards the total structural deviation was seen through root mean square fluctuation (RMSF) analysis, Fig. 3 Solvent accessible surface area (SASA) and number of native contacts can be used to monitor the proper folding of protein, therefore, these were calculated at all the pH (Fig. 4) . At pH 4, 5 and 7, only a small change in SASA was observed with time, most probable values of SASA were equal to 267.37 nm 2 , 269.31 nm 2 and 261.98 2 , respectively. At pH 6, SASA started increasing around 20 ns and kept on increasing till 100 ns, with two peaks at 264.27 and 275 nm 2 in probability plot. At pH 8, value of SASA first decreased and then started increasing again after 20 ns, a wide probability distribution of SASA was obtained with 270.78 nm 2 having maximum probability. In the beginning, native contacts were around 2240 at all pH, after few nanoseconds native contacts started decreasing and at pH 8 maximum decrease in contacts was observed, Fig. 4(c) . At pH 7, only slight changes in native contacts were observed. From probability distribution curve, native contacts were found to be highest, 2220, at pH 7. At pH 4, 2170 native contacts were present. At pH 5, 6 and 8, a broad distribution of native contacts was seen, with increase in contacts from pH 5 to 6 and least number of native contacts were there at pH 8, Fig. 4(d) . Surface area was found to be minimum and native contacts were found to be maximum at pH7, implying that the structure of M pro is most stable at neutral pH. Dimerization of main protease is essential for its activity (because the N-finger of each of the two protomers interacts with Glu166 of the other protomer and thereby helps shape the S1 pocket of substrate binding site), its monomeric state is not active. Substrate binding site of enzyme is shaped through the interaction of N-finger of one monomer and Glu166 of another monomer, therefore, distance between residues Ser1 and Glu166 was monitored to ensure intact dimer formation throughout the simulation at all pH values, Fig. 5(a) . At pH 7, distance between (a) (b) Figure 5 : Plots for (a) distance between Nfinger of chain A and Glu166 of chain B, (b) probability distribution of distance at every pH value. Ser1 and Glu166 was remained constant at all time frames. At pH 4, 5 and 6, this distance increased with time and value fluctuated between 8 A 0 and 12 A 0 . At pH 8, distance started increasing after 20 ns and a sudden increase was observed around 45 ns followed by a constant fluctuation around 16 A 0 at all frames. Similarly, in probability histogram multiple peaks are observed at pH 4, 5, 6 and 8, however, a single peak was observed at pH 7, Fig. 5(b) , which suggests that dimerization process is weak at acidic and basic pH. From clustering analysis, representative structures of the most populated clusters were obtained at all pH values. Proper orientation of active site residues is necessary for protein to carry out its function. The active sites, of the obtained structures of M pro from clustering, were compared, Fig. 6 (a)-6(e). His 41 and Cys 145 forms the catalytic dyad of protein, their proper orientation was observed only at pH 7, Fig. 6(d) . At all the pH other than pH 7, N 2 of His 41 was not found to be orienting towards S γ of Cys 145, as shown in Fig. 6 (a), 6(b), 6(c) and 6(e). Thus, the orientation of catalytic dyad of M pro S1 binding pocket also gets affected by pH. Our study has demonstrated that structure of SARS-CoV-2 main protease (M pro ) shows pH dependence. At basic pH (pH 8), the structure of M pro was found to be least stable and as pH becomes more acidic, structure of M pro gets destabilized.The structure of M pro showed maximum stability at neutral pH. At all the pH other than neutral pH, dimerization and S1 binding pocket were found to be affected. Our study throws light on how an environmental factor like pH can affect the integrity of active site of M pro and encourage researchers to explore the small molecules that can modulate the intracellular pH to prevent the replication by lowering the activity of M pro . SS thanks UGC India for fellowship. Authors thank the IIT Delhi HPC facility for computational resources. Authors also thank Scf-Bio for providing the access of AMBER18. There is no conflict of interest. Respiratory Virus Infections: Understanding COVID-19 Coronaviruses: an overview of their replication and pathogenesis Expression and Functions of SARS Coronavirus Replicative Proteins Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved -ketoamide inhibitors Coronavirus Main Proteinase (3CLpro) Structure: Basis for Design of Anti-SARS Drugs Zhenming nad Du Structure of Mpro from SARS-CoV-2 and discovery of its inhibitors In-silico drug repurposing for targeting SARS-CoV-2 main Silico Drug Repurposing for SARS-CoV-2 Main Proteinase and Spike Proteins pH-dependent conformational flexibility of the SARS-CoV main proteinase (M(pro)) dimer: molecular dynamics simulations and multiple X-ray structure analyses Assessment of proton-coupled conformational dynamics of SARS and MERS coronavirus papainlike proteases: Implication for designing broad-spectrum antiviral inhibitors Constant pH Replica Exchange Molecular Dynamics in Explicit Solvent Using Discrete Protonation States: Implementation, Testing, and Validation The Amber biomolecular simulation programs How well does a restrained electrostatic potential (RESP) model perform in calculating conformational energies of organic and biological molecules Comparison of simple potential functions for simulating liquid water Molecular dynamics with coupling to an external bath Accuracy and efficiency of the particle mesh Ewald method A fast SHAKE algorithm to solve distance constraint equations for small molecules in molecular dynamics simulations Approximate atomic surfaces from linear combinations of pairwise overlaps (LCPO) Software for Processing and Analysis of Molecular Dynamics Trajectory Data