key: cord-1039497-5jfpmgfh authors: Dubanevics, Igors; McLeish, Tom C.B. title: Computational Analysis of Dynamic Allostery and Control in the SARS-CoV-2 Main Protease date: 2020-07-20 journal: bioRxiv DOI: 10.1101/2020.05.21.105965 sha: 3ba10db58099403f03b457f72a556c7a3e087974 doc_id: 1039497 cord_uid: 5jfpmgfh

COVID-19, caused by the novel coronavirus SARS-CoV-2, has become a global pandemic, and no vaccine or antiviral drugs exist at the time of writing. An attractive coronavirus drug target is the main protease (Mpro, also known as 3CLpro) because of its vital role in the viral cycle. A significant body of work has focused on finding inhibitors which bind and block the active site of the main protease, but little has been done to address potential non-competitive inhibition, which targets regions beyond the active site, partly because the fundamental biophysics of such allosteric control is still poorly understood. In this work, we construct an Elastic Network Model (ENM) of the SARS-CoV-2 Mpro homodimer and analyse its dynamics and thermodynamics. We find a rich and heterogeneous dynamical structure in the correlated motions, including allosterically correlated motions between the homodimeric protease's active sites. Exhaustive 1-point and 2-point mutation scans of the ENM and their effect on fluctuation free energies confirm bioactive residues previously identified experimentally, and also suggest several new candidate regions, distant from the active site, for control of the protease function. Our results suggest new dynamically driven control regions as possible candidates for non-competitive inhibitor binding sites in the protease, which may assist the development of current fragment-based binding screens. The results also provide new insight into the protein physics of fluctuation allostery and its underpinning dynamical structure.
In 2019, a rapidly spreading disease named COVID-19, caused by the novel coronavirus SARS-CoV-2, emerged and has since generated a global pandemic. Preventive measures have been taken by a majority of countries, but no vaccine or anti-viral drugs exist at the time of writing, although candidates are [...]

No crystallographic structure of the SARS-CoV-2 Mpro active form with a polyprotein is available to date. Only empty (apo) structures or structures with synthetic ligands/substrates attached to the active site are available. Therefore, the ENM study reported here used a recent crystallographic [...]

[...] cross-correlation of motion maps for all residues in the ENM apo (ligand-free), holo1 (only the active site on chain A occupied) and holo2 (both active sites occupied) structures are shown in Figure 3. We discuss the dynamic features of each structure in the following:

[...] to 213, also anti-correlates with C145 on the opposite chain; while, surprisingly (since its mutation is effective in allosteric control), N214 shows no correlation with the catalytically vital C145 at all. We observe strong positive correlation between this helix and two regions forming the active site pocket: residues K137-N142 (loop) and E166-H172 (β-turn) (Fig. 3A, narrow purple rectangles in lower right and upper left quadrants). However, this correlation can partially be accounted for by spatial proximity. The S284-L285-A286 residues (henceforth SLA) on one monomer show positive cross-chain dynamic coupling of motion with the identical residues on the other, in this case through spatial proximity (Fig. 3A [...] the active site's catalytic dyad, H41 and C145, and residues around it (Fig. 3A, not shown). Thus, we observe both dynamic correlation at a distance and correlation due to immediate spatial proximity, supporting previous findings regarding the dynamically driven allosteric inactivation of SARS-CoV Mpro.
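The kind of cross-correlation map discussed here can be illustrated on a toy system. The following is a minimal sketch, not the authors' pipeline: it assumes unit node masses, a harmonic pair potential with a 1/2 prefactor, and a 1/ω² (equipartition) weighting of each mode, none of which is stated in the text above. The helper names `enm_hessian`, `cross_correlation` and `toy_helix` are hypothetical, and the idealised helix stands in for a real C-alpha trace.

```python
import numpy as np


def enm_hessian(coords, cutoff=8.0, k=1.0):
    """3N x 3N Hessian of a harmonic elastic network (unit node masses assumed)."""
    n = len(coords)
    hess = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            r2 = float(d @ d)
            if r2 >= cutoff ** 2:
                continue  # nodes beyond the cut-off radius share no spring
            # 3x3 off-diagonal block for V = (k/2)(r - r0)^2 (prefactor assumed)
            block = -k * np.outer(d, d) / r2
            si, sj = slice(3 * i, 3 * i + 3), slice(3 * j, 3 * j + 3)
            hess[si, sj] += block
            hess[sj, si] += block
            hess[si, si] -= block
            hess[sj, sj] -= block
    return hess


def cross_correlation(hess, n_skip=6):
    """Normalised node-node correlation summed over the non-rigid-body modes."""
    evals, evecs = np.linalg.eigh(hess)
    evals, evecs = evals[n_skip:], evecs[:, n_skip:]  # drop 6 rigid-body modes
    n = hess.shape[0] // 3
    modes = evecs.reshape(n, 3, -1)  # axes: (node, xyz, mode)
    # Dot products of node displacements, each mode weighted by 1/omega^2
    # (the weighting is an assumption of this sketch).
    cov = np.einsum("iam,jam,m->ij", modes, modes, 1.0 / evals)
    norm = np.sqrt(np.diag(cov))
    return cov / np.outer(norm, norm)


def toy_helix(n=12, radius=2.3, rise=1.5, turn_deg=100.0):
    """Hypothetical C-alpha trace of an ideal helix, coordinates in angstrom."""
    t = np.deg2rad(turn_deg) * np.arange(n)
    return np.column_stack(
        [radius * np.cos(t), radius * np.sin(t), rise * np.arange(n)]
    )


if __name__ == "__main__":
    c = cross_correlation(enm_hessian(toy_helix()))
    print(np.round(c[0], 2))  # correlation of node 0 with every node
```

Each entry lies between -1 (anti-correlated) and 1 (correlated), with ones on the diagonal. For a real structure the coordinates would come from the PDB file, and the number of modes summed over would be a further choice.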
Figure 4C reports the effect of the same mutational scan on the allosteric free energy $K_2/K_1$ (Eq. 3) of binding between the two active sites.

Figure 4 caption (fragment): White corresponds to values of free energy predicted by the wild-type ENM. Red corresponds to [...] The map for the vibrational free energy change plotted in real space onto the wild-type Mpro homodimer structure at $k_R/k = 4.00$. (C) A map for the global control space of allostery in Mpro. The map plots the change in cooperativity coefficient ($K_2/K_1$) due to the dimensionless change in the spring constant ($k_R/k$) for the mutated residue with the amino acid number shown. White corresponds to values of $K_2/K_1$ predicted by the wild-type ENM. Red corresponds to an increase in $K_2/K_1$ (stronger negative cooperativity), whereas blue corresponds to a decrease in $K_2/K_1$ (weaker negative cooperativity or [...]

[...] result considering that no ligand is present to emphasise the spatial nature of the active sites. They seem dynamically pre-disposed to allosteric communication, in agreement with the cross-correlation map (Fig. 3). Both termini display mutation peaks due to their spatial proximity to the active sites. Additionally, a very sharp peak is seen around residues 187-192 where a free [...]

We also note 7 residues located on the homodimer chains' interface (K5, P9, K12, E14, M276, I281 and S284), recalling that in the CAP homodimer, residues located on the interface were critical in allosteric regulation (9). Especially responsive is E14, located on the very first alpha helix. [...] Fig. S4). Therefore, the experimentally- [...] which in our ENM corresponds to $k_R/k > 1$ for similar reasons as for the N214A mutation (SI, Fig. S4). We see a decrease in cooperativity for S284 spring stiffening, while for spring relaxation the ENM's cooperativity increases.

[...] (Fig. 5A,C), $k_R/k = 4.00$, models the effect of small molecule/ligand binding to the mutated residues (and would also model mutations such as N214A (SI, Fig.
S4)); while the $k_R/k = 0.25$ map looks at the opposite extreme to the stiffening case, which would model mutations that weaken local bonding (Fig. 5B,D).

[...] scans, is the difference in total free energy of the apo structure. As on the 1-point map for the apo 6LU7 structure, strong lines are observed (Fig. 5A,B), but spring relaxation resolves fewer biologically active residues than spring stiffening. In Figure 5A only residues around H41, a loop region forming [...] and N214 (0.29 %) define the maximum absolute response upon relaxation and stiffening of these residues on both chains, respectively (Fig. 5A,B). In both cases the SLA region shows a moderate fluctuation free energy change. We conclude that stiffening is a better choice for resolving critical residues in fluctuation free energy control. Note that, in the case of this protein, the 2-point mutations combine approximately linearly: the effect of the first mutation (vertical lines) is not strongly affected by the second mutation. Nevertheless, relaxation and stiffening produce qualitatively different plots.

[...] spring relaxation and stiffening on the allosteric free energy between the two active sites was calculated (Fig. 5C,D). While the 2-point apo maps show qualitatively different behaviour for relaxation and stiffening, the allosteric free energy 2-point mutation maps are qualitatively the same, with an inverted sign of the allosteric free energy change. A new strong control site that did not appear on the 2-point apo maps (Fig. 5A,B) is found at the beginning of the second beta-sheet (T25-L32) on each chain (Fig. 5C,D). Additionally, S1 and G302 exhibit strong allosteric control due to spatial closeness to the active site and, thus, the ligand. The apparent increased cooperativity [...] of control of the allosteric free energy itself is more subject to noise, being a difference quantity, but is sufficient to identify strong candidates for control regions (SI, Fig. S3).
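A 1-point scan of the kind discussed in this section can be sketched as follows: every spring attached to one residue is scaled by $k_R/k$, and the change in fluctuation free energy relative to the unperturbed network is recorded. This is an illustrative sketch under stated assumptions, not the authors' implementation. It assumes uniform node masses of 1 amu, a 1/2 prefactor in the harmonic potential, T = 298 K, the quantum harmonic oscillator free energy per mode, and a toy helix in place of the Mpro structure. All function names are hypothetical.

```python
import numpy as np

KB = 1.987204e-3        # Boltzmann constant, kcal mol^-1 K^-1
HBAR = 1.054571817e-34  # reduced Planck constant, J s
N_A = 6.02214076e23     # Avogadro constant, mol^-1
KCAL = 4184.0           # J per kcal
AMU = 1.66053907e-27    # kg per atomic mass unit


def enm_hessian(coords, scale=None, cutoff=8.0, k=1.0):
    """ENM Hessian; spring i-j has constant k * scale[i] * scale[j] (assumed rule)."""
    n = len(coords)
    scale = np.ones(n) if scale is None else np.asarray(scale, dtype=float)
    hess = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            r2 = float(d @ d)
            if r2 >= cutoff ** 2:
                continue  # no spring beyond the cut-off radius
            block = -k * scale[i] * scale[j] * np.outer(d, d) / r2
            si, sj = slice(3 * i, 3 * i + 3), slice(3 * j, 3 * j + 3)
            hess[si, sj] += block
            hess[sj, si] += block
            hess[si, si] -= block
            hess[sj, sj] -= block
    return hess


def fluctuation_free_energy(hess, mass=1.0, temp=298.0, n_skip=6):
    """Sum of quantum-oscillator free energies over non-rigid modes, kcal/mol."""
    lam = np.linalg.eigvalsh(hess / mass)[n_skip:]      # kcal mol^-1 A^-2 amu^-1
    omega = np.sqrt(lam * KCAL / (N_A * AMU * 1e-20))    # rad s^-1
    e = HBAR * omega * N_A / KCAL                        # hbar*omega, kcal mol^-1
    kt = KB * temp
    return float(np.sum(0.5 * e + kt * np.log1p(-np.exp(-e / kt))))


def one_point_scan(coords, ratio):
    """Free-energy change when the springs of each node in turn are scaled by ratio."""
    g_wild = fluctuation_free_energy(enm_hessian(coords))
    dg = np.empty(len(coords))
    for r in range(len(coords)):
        scale = np.ones(len(coords))
        scale[r] = ratio  # k_R/k for the "mutated" node only
        dg[r] = fluctuation_free_energy(enm_hessian(coords, scale)) - g_wild
    return dg


def toy_helix(n=12, radius=2.3, rise=1.5, turn_deg=100.0):
    """Hypothetical C-alpha trace of an ideal helix, coordinates in angstrom."""
    t = np.deg2rad(turn_deg) * np.arange(n)
    return np.column_stack(
        [radius * np.cos(t), radius * np.sin(t), rise * np.arange(n)]
    )


if __name__ == "__main__":
    xyz = toy_helix()
    print(np.round(one_point_scan(xyz, 4.00), 3))  # stiffening
    print(np.round(one_point_scan(xyz, 0.25), 3))  # relaxation
```

In this sketch stiffening can only raise the non-zero mode frequencies, so the free energy change is positive for every node, and relaxation gives a negative change. The allosteric free energy would additionally require the ligand-bound (holo1 and holo2) networks, which are not modelled here.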
The analysis shows that SARS-CoV-2 Mpro possesses a rich dynamical structure that supports several [...] Fig. S5). The position of these residues suggests them as possible candidates for non-competitive inhibitor binding sites. We also draw attention to eight residues located on the Mpro interface surface (Tab. 1) as potential dynamically allosteric control residues. Computational studies such as this, therefore, accompany and support concurrent experimental programs of scanning for small-molecule binding candidates to the protein in question. We note that [...]

• The force field everywhere arises from sums over ENM harmonic [...]

The whole NMA method can be reduced to three steps:
1. Construct the mass-weighted Hessian for the system. For a protein ENM the system consists of the co-ordinates of the C-alpha atoms ($N$) for each residue from the corresponding PDB structure. [...]

The diagonalisation of the $3N \times 3N$ mass-weighted Hessian matrix $H$ is written as
$$A^{T} H A = \Lambda, \qquad H_{ij} = \frac{\partial^{2} V}{\sqrt{m_i}\,\partial r_i\,\sqrt{m_j}\,\partial r_j},$$
where $V$ is the potential energy function, $r$ the distance between nodes and $m$ the node masses. The eigenvectors of the mass-weighted Hessian matrix, the columns of $A$, are the normal mode eigenvectors $a$. $\Lambda$ is a $3N \times 3N$ diagonal matrix with diagonal values equal to the associated normal modes' squared angular frequencies $\omega^2$. The potential function used in this study is harmonic in the displacement of each node pair from its equilibrium separation, summed over all pairs within the cut-off:
$$V \propto \sum_{r^{(0)}_{ij} < r_c} k_{ij}\left(r_{ij} - r^{(0)}_{ij}\right)^{2},$$
where $r_c$ is a cut-off radius, which for this work is set at 8 Å, while $r^{(0)}$ is the equilibrium distance between nodes derived from the PDB crystallographic structure. For the wild-type protein, all spring constants are equal, $k_{ij} = k = 1$ kcal Å$^{-2}$ mol$^{-1}$.

Cross-correlation of Motion. The cross-correlation, $C$, is estimated between an ENM node pair as a normalised dot product sum between their normal mode eigenvectors over $v$ modes. A $C$ value of 1 implies perfectly correlated motion, -1 perfectly anti-correlated motion and 0 totally non-correlated motion.

Normal Mode Fluctuation Free Energy.
Using statistical mechanics, it is possible to estimate the fluctuation free energy of a system from its vibrational frequencies, such as those of the normal modes. For this method, the partition function for the quantum harmonic oscillator (31), $Z$, for normal mode $k$ is given as
$$Z_k = \sum_{n=0}^{\infty} \exp\left[-\left(n + \tfrac{1}{2}\right)\frac{\hbar\omega_k}{k_B T}\right] = \frac{\exp\left(-\hbar\omega_k / 2 k_B T\right)}{1 - \exp\left(-\hbar\omega_k / k_B T\right)},$$
where $k_B$ is Boltzmann's constant, $\hbar$ is the reduced Planck's constant, $T$ is the temperature in Kelvin and $\omega$ is, as already mentioned, the angular frequency. The Gibbs free energy (for a given mode) expressed in terms of the partition function, with an approximation of little change in volume, can be written as
$$G_k = -k_B T \ln Z_k = \frac{\hbar\omega_k}{2} + k_B T \ln\left[1 - \exp\left(-\frac{\hbar\omega_k}{k_B T}\right)\right].$$

1. On the nature of allosteric transitions: a plausible model
2. Allostery without conformational change. A plausible model
3. Dynamic allostery of protein alpha helical coiled-coils
4. Allostery without conformation change: modelling protein dynamics at multiple scales
5. The 'allosteron' model for entropic allostery of self-assembly
6. Dynamics of a small globular protein in terms of low-frequency vibrational modes
7. Normal modes for specific motions of macromolecules: application to the hinge-bending mode of lysozyme
8. Direct evaluation of thermal fluctuations in proteins using a single-parameter harmonic potential
9. Modulation of global low-frequency motions underlies allosteric regulation: demonstration in CRP/FNR family transcription factors
10. DDPT: a comprehensive toolbox for the analysis of protein motion
11. Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved α-ketoamide inhibitors
12. SARS-CoV 3CL protease cleaves its C-terminal autoprocessing site by novel subsite cooperativity
13. An overview of severe acute respiratory syndrome-coronavirus (SARS-CoV) 3CL protease inhibitors: peptidomimetics and small molecule chemotherapy
14. Structure of Mpro from COVID-19 virus and discovery of its inhibitors
15. Structure-based design of antiviral drug candidates targeting the SARS-CoV-2 main protease
16. Structural basis for the inhibition of SARS-CoV-2 main protease by antineoplastic drug carmofur
17. Mapping active allosteric loci SARS-CoV spike proteins by means of protein contact networks
18. Prediction of the SARS-CoV-2 (2019-nCoV) 3C-like protease (3CLpro) structure: virtual screening reveals velpatasvir, ledipasvir, and other drug repurposing candidates
19. Characterization of spike glycoprotein of SARS-CoV-2 on virus entry and its immune cross-reactivity with SARS-CoV
20. Rapid identification of potential inhibitors of SARS-CoV-2 main protease by deep docking of 1.3 billion compounds
21. Virtual screening and repurposing of FDA approved drugs against COVID-19 main protease
22. Topological analysis of SARS CoV-2 main protease
23. Large-scale ligand-based virtual screening for SARS-CoV-2 inhibitors using deep neural networks
24. Drug repurposing for coronavirus (COVID-19): in silico screening of known drugs against coronavirus 3CL hydrolase and protease enzymes
25. Dynamically-driven inactivation of the catalytic machinery of the SARS 3C-like protease by the N214A mutation on the extra domain
26. Dynamically-driven enhancement of the catalytic machinery of the SARS 3C-like protease by the S284-T285-I286/A mutations on the extra domain
27. The role of protein-ligand contacts in allosteric regulation of the Escherichia coli catabolite activator protein
28. Global low-frequency motions in protein allostery: CAP as a model system
29. Main Protease Structure and XChem Fragment Screen (Link)
30. Normal mode analysis of biomolecular structures: functional mechanisms of membrane proteins
31. Concepts in thermal physics