key: cord-0912956-0kyf3bij authors: Hilgenfeld, Rolf; Anand, Kanchan; Mesters, Jeroen R.; Rao, Zihe; Shen, Xu; Jiang, Hualiang; Tan, Jinzhi; Verschueren, Koen H. G. title: Structure and Dynamics of Sars Coronavirus Main Proteinase (M(PRO)) date: 2006 journal: The Nidoviruses DOI: 10.1007/978-0-387-33012-9_106 sha: 09cc1d654aeecb5c98853b3e01c417e7d896629c doc_id: 912956 cord_uid: 0kyf3bij

All protein functions required for SARS coronavirus replication are encoded by the replicase gene. 1, 2 This gene encodes two overlapping polyproteins (pp1a and pp1ab), from which the functional proteins are released by extensive proteolytic processing. This is primarily achieved by the 34-kDa main proteinase (Mpro), which is frequently also called 3C-like proteinase (3CLpro) to indicate a similarity in substrate specificity with the 3C proteinase of picornaviruses. 3 While this designation was useful when the coronaviral enzyme was first described, there are in fact large differences between the structures and mechanisms of these enzymes, which makes the designation of the coronavirus main proteinase as 3CLpro rather misleading. We will therefore use the term Mpro exclusively.

The functional importance of the SARS-CoV Mpro in the viral life cycle makes it a preferred target for discovering anti-SARS drugs. 4-7 However, in order to apply rational drug design or virtual screening, information on the structure of the target enzyme is required. Initially, this came from homology models of the SARS-CoV Mpro that were constructed on the basis of crystal structures of human CoV (HCoV) 229E Mpro and of porcine transmissible gastroenteritis virus (TGEV) Mpro that we had previously determined. 4, 8 The SARS virus enzyme shares about 40% sequence identity with these proteinases of group I coronaviruses. More recently, the crystal structure of the SARS-CoV Mpro has been determined. 9-11 As in other coronavirus main proteinases, the molecule comprises three domains (Figure 1).
Domains I (residues 8-101) and II (residues 102-184) are β-barrels and together resemble the structure of chymotrypsin, whereas domain III (residues 201-306) consists mainly of α-helices. The active site, containing a Cys…His catalytic dyad, is located in a cleft between domains I and II. Domains II and III are connected by a long loop (residues 185-200). In vitro experiments demonstrated that deletion of domain III almost completely abolished the proteolytic activities of the main proteinases of TGEV and SARS-CoV. 8, 12 This domain is essential for the dimerization of the Mpro, 13 which in turn ensures proper orientation of the N-terminal residues of monomer B, which play an important role in the catalytic activity of monomer A (and vice versa; see below). 9, 10

In all known crystal structures of coronavirus main proteinases, the enzyme exists as a dimer, 4, 8-11 and dimerization is also observed in solution at slightly elevated concentrations. 8, 12-14 The dimer is considered the enzymatically active species, because the specific activity increases linearly with increasing enzyme concentration. 14 A special feature first discovered for the SARS-CoV Mpro (but most probably present in all coronavirus main proteinases) is that in the monoclinic crystals grown at pH 6.0, the two monomers have different conformations around the S1 substrate-binding site, because the loop 138-145 (in particular Phe140) as well as Glu166 undergo dramatic conformational rearrangements. As a result, one protomer exists in an active and the other in an inactive conformation. 9 In the latter, the S1 substrate-binding pocket has virtually collapsed as a consequence of the reorientation of Glu166, and the oxyanion hole no longer exists due to the conformational change of residues 138-145. When the crystals are equilibrated at pH 7.6 and 8.0, both monomers are in an active conformation. 9

We have proposed 9, 10 that these conformational changes are controlled by the protonation state of His163, an absolutely conserved residue at the bottom of the S1 substrate-specificity pocket (Figure 2). This subsite is designed to accommodate the P1 glutamine residue of Mpro substrates with high specificity. No other amino-acid side chain may be accepted in this position, in particular not glutamate (as opposed to glutamine). This is achieved by ensuring that His163 is uncharged over a broad pH range. Two important interactions made by the imidazole ring are responsible for keeping it in the neutral state: (i) a stacking or edge-on-face interaction with the phenyl ring of Phe140 (Figure 2, left panel), and (ii) acceptance by its Nδ1 atom of a strong hydrogen bond from the hydroxyl group of the buried Tyr161. This means that only the Nε2 atom of His163 can normally carry a proton, and it is this nitrogen that will donate a hydrogen bond to the side-chain oxygen of the P1 glutamine residue of the substrate (Figure 2, right panel). 8 In agreement with this structural interpretation, any replacement of the conserved histidine residue (His162 in this case) abolishes the proteolytic activity of HCoV 229E and feline infectious peritonitis virus (FIPV) Mpro. 15, 16 Furthermore, FIPV Mpro Tyr160 (corresponding to SARS-CoV Mpro Tyr161) mutants have their proteolytic activity reduced by a factor of > 30. 15 These observations and the absolute conservation of these residues in the coronavirus main proteinases underline the importance of the uncharged state of His163 in binding the substrate with the required specificity. However, when the SARS-CoV Mpro crystals are grown at a pH near or below the pKa value of this histidine residue, the latter can be protonated, leading to drastic structural consequences.
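The pH dependence of such a protonation event can be illustrated with the Henderson-Hasselbalch relation for a single, independent titratable site. The following is a minimal sketch and not part of the original analysis: the function name and the pKa value of 6.0 are illustrative assumptions (the text does not state a pKa for His163), while the three pH values are those at which the monoclinic crystals were studied.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Fraction of a titratable group carrying its proton at a given pH
    (Henderson-Hasselbalch relation for a single, independent site)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# ASSUMPTION: hypothetical pKa for the histidine, for illustration only;
# the source text does not report a measured value.
ASSUMED_PKA = 6.0

if __name__ == "__main__":
    # pH values at which the monoclinic crystals were grown or equilibrated.
    for ph in (6.0, 7.6, 8.0):
        frac = protonated_fraction(ph, ASSUMED_PKA)
        print(f"pH {ph:.1f}: protonated fraction = {frac:.3f}")
```

With the assumed pKa, the site is half-protonated at pH 6.0 and almost entirely neutral at pH 7.6 and 8.0; a different pKa shifts the curve along the pH axis without changing its shape.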
In order to compensate for a positive charge on His163, which is in a largely hydrophobic environment, Glu166, which forms part of the wall of the S1 pocket, will move inwards and form a salt-bridge with the protonated His163 (Figure 2, middle). Through this conformational rearrangement, Glu166 will fill the S1 pocket, thereby preventing the binding of substrate. But the consequences are even more far-reaching. When His163 is protonated, its hydrophobic interaction with the phenyl ring of Phe140 is no longer possible, and the latter undergoes a major displacement with an amplitude of > 5.5 Å (compare the middle panel with the two other images in Figure 2). Along with this, the oxyanion loop (residues 138-145) changes conformation and is no longer able to stabilize the tetrahedral intermediate of peptide-bond cleavage through donation of hydrogen bonds from the amide groups of Gly143 and Cys145.

In addition to His163, there is a second histidine residue involved in formation of the S1 pocket. In the crystal structure obtained at pH 7.6, where both subunits are found in the active conformation, 9 His172 forms part of the wall of the subsite, being engaged in a salt-bridge with Glu166. At this pH, the histidine is likely to be positively charged, because pKa values of histidine residues involved in salt-bridges tend to be 2 units higher than those of isolated histidines. 17 The His172...Glu166 ion pair also exists in the active subunit of the dimer at pH 6.0, whereas in the inactive one, His172 loses its partner, which moves into the S1 pocket in order to compensate for the positive charge on the protonated His163 at the bottom of the subsite. Furthermore, at pH 8.0, His172 is likely to be uncharged. As a result, Glu166 is no longer fixed at its position in the wall of the S1 pocket, but tends to be flexible and partly blocks the entry to the S1 pocket. 10

These observations agree well with the enzymatic activity of SARS-CoV Mpro, 10, 12 which has its maximum at pH 7.0 and about 50% activity each at pH 6.0 (S1 pocket and oxyanion hole collapsed in one subunit of the dimer) and pH 8.0 (wall of S1 pocket no longer stable due to interruption of the His172...Glu166 salt-bridge, with Glu166 partly blocking the entry to the S1 site). Therefore, we propose that the bell-shaped pH-activity curve of the SARS-CoV Mpro is governed by the protonation of His163 on its low-pH side, and by deprotonation of His172 on its high-pH side. 10 In good agreement with this, Chou et al. 12 reported that the apparent pKa values characteristic of this curve are 5.7 ± 0.4 and 8.7 ± 0.4. However, they suggest that these may originate from the Glu290...Arg4 ion pair, or from the His41...Cys145 catalytic dyad, both less likely options from our point of view.

We have confirmed by molecular dynamics (MD) simulations that large conformational changes of the observed type can indeed be triggered by protonation of His163 and deprotonation of His172. 10 In three different 10-ns simulations at pH 6.0, 7.6, and 8.0, we found the same type of rearrangements as seen in the crystals. This is reassuring, because X-ray diffraction cannot normally determine hydrogen positions, owing to their low scattering power, unless the crystals diffract to better than 1.0 Å resolution. The force fields used in MD simulations, on the other hand, fully take into account the hydrogens bound to non-carbon atoms. Thus, because these simulations yield the same conformational rearrangements as those seen by X-ray diffraction, the interpretation of the crystallographic results is likely to be correct. Also, our preliminary NMR data with 15N-labeled SARS-CoV Mpro (J. George et al., unpublished) appear to support these conclusions.
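The bell-shaped pH-activity profile discussed above can be sketched with the textbook model of an enzyme that is active only when one group is deprotonated and a second group is protonated. This is an illustrative assumption and not the fitting procedure of Chou et al.; the function names are hypothetical, and only the two apparent pKa values (5.7 and 8.7) are taken from the text.

```python
def relative_activity(ph: float, pka_low: float = 5.7, pka_high: float = 8.7) -> float:
    """Fraction of enzyme in the active ionization state for a two-pKa
    (bell-shaped) model: the low-pKa group must be deprotonated and the
    high-pKa group must be protonated."""
    return 1.0 / (1.0 + 10.0 ** (pka_low - ph) + 10.0 ** (ph - pka_high))


def optimum_ph(pka_low: float = 5.7, pka_high: float = 8.7) -> float:
    """pH of maximal activity in this model: the midpoint of the two pKa values."""
    return 0.5 * (pka_low + pka_high)


if __name__ == "__main__":
    peak = relative_activity(optimum_ph())
    for ph in (6.0, 7.0, 8.0):
        # Report activity relative to the model's own maximum.
        print(f"pH {ph:.1f}: {relative_activity(ph) / peak:.2f} of maximum")
```

In this simple model the optimum lies at the midpoint of the two pKa values (pH 7.2), which is compatible with the observed maximum near pH 7.0 given the stated uncertainty of ± 0.4 in the apparent pKa values. The model is not expected to reproduce the measured activities at pH 6.0 and 8.0 exactly.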
In addition to these MD simulations, we also investigated the dynamic behavior of the system with His163 in both subunits of the dimer protonated, i.e., at a presumed pH < 6.0. Starting from the (energy-minimized) crystal structure at pH 6.0, this simulation led to a dimer that had both monomers in the inactive conformation, with their S1 pockets and oxyanion holes collapsed. Thus, the MD simulations were apparently able to transform one conformation of the substrate-binding site of SARS-CoV Mpro into the other. 10

Experimental support for this theoretical prediction was provided very recently by the analysis of two new crystal forms of the SARS-CoV Mpro. In addition to the monoclinic crystal form of the enzyme originally described in 2003, 9 we managed to obtain tetragonal (space group P4₃2₁2) and orthorhombic (P2₁2₁2) crystals of the enzyme, 10 and determined these crystal structures at 2.0 and 2.8 Å resolution. Both of these crystal forms contain one SARS-CoV Mpro monomer per asymmetric unit; the dimer is generated through the crystallographic twofold axis. Thus, by necessity, the two monomers of the dimer are identical in these crystals. The question, however, was whether they would be in the active or in the inactive conformation. Because the monoclinic form had one monomer in the active and the other in the inactive form when crystallized at pH 6.0, but both of them in the active conformation after equilibration of the crystals at pH 7.6, it was unclear what to expect for the new crystal forms, which were crystallized at pH 5.9 (tetragonal form) and 6.6 (orthorhombic form). Interestingly, it turned out that both of them are in the inactive form, i.e., the S1 pocket and the oxyanion hole have collapsed.

We have also attempted to equilibrate tetragonal crystals at higher pH values. Structures were determined at pH 7.0 (1.65 Å resolution), 7.4 (1.57 Å), and 8.0 (1.65 Å). Although these are the highest-resolution structures reported so far for the SARS-CoV Mpro, a reliable interpretation of the electron density in the substrate-binding region was difficult because of dual (or even multiple) conformations. In any case, our preliminary analysis shows that the same type of conformational rearrangement occurs in this crystal form as was observed in the monoclinic crystals, but an ensemble of inactive and active conformations appears to co-exist at all pH values. In fact, the tetragonal crystals of SARS-CoV Mpro, containing < 30% solvent, may be less suited for studying the conformational transition because the molecules are tightly packed and seem to have a contracted substrate-binding site. This is supported by the observation that the tetragonal crystals crack when equilibrated at pH 8.0 for longer than 4 hours. Interestingly, the orthorhombic crystal form, which has the (generated) dimer in the inactive conformation when crystallized at pH 6.6 in the presence of malonate, can also be obtained in an active form at about the same pH (6.5) when ammonium sulfate is used as a precipitant (J. Lescar, personal communication). At present, it is unclear why the change of precipitant should induce such changes; more likely, it is subtle differences in the final pH of the crystallization medium that determine the resulting conformation when working near the pKa value of His163.

Apart from the differences in detail in the substrate-binding site, the SARS-CoV Mpro dimers as seen in the new structures are very similar to the dimer in the original monoclinic crystals. Relative to the monoclinic crystal structure obtained at pH 6.0, the monomers in the new crystal forms display overall r.m.s. deviations for Cα atoms (monomers A and B, respectively) of 0.95/0.76 Å (tetragonal form) and 1.10/0.78 Å (orthorhombic form). It is reassuring that the monomers in the new crystal forms, which are in the inactive conformation, are more similar to the inactive monomer B of the dimer in the monoclinic form (second number) than to the active monomer A (first number).

One important intermolecular interaction in the coronavirus Mpro dimer has not been mentioned so far. In the active conformation of the SARS-CoV Mpro dimer, the N-terminus of monomer B (which is the inactive subunit at pH 6.0) was shown to interact with the main-chain amide and carbonyl of Phe140 and with the carboxylate of Glu166, both of monomer A. 9 This interaction appears to help shape the substrate-binding site of monomer A. On the other hand, the collapsed binding site of monomer B lacks these intermolecular interactions, and as a result, residues 1 and 2 of monomer A are disordered and not seen in the electron density. The same observation was made in our X-ray structures derived from the tetragonal and orthorhombic crystal forms 10: here, both monomers are in the inactive conformation, and accordingly, both N-termini are disordered to the extent that no electron density is seen for residue Ser1.

When we determined the first structure of a coronaviral Mpro, that of the TGEV enzyme, 8 we saw that residues 1-7 were squeezed in between domain III of its own monomer and domain II of the other monomer in the dimer. The same interactions of this segment, which we later called the N-finger, 9 were seen in HCoV 229E Mpro and SARS-CoV Mpro. When we deleted residues 1-5 in the TGEV proteinase, the enzyme was almost totally inactive with a synthetic pentadecapeptide as substrate. 8 In agreement with this, Chen et al. 18 found N-terminally truncated SARS-CoV Mpro to be proteolytically inactive, but the rather surprising finding was that it still forms a dimer. 18 This result has to be seen in light of the finding by Shi et al. that even isolated domain III of SARS-CoV Mpro will dimerize. 13 On the other hand, when Hsu et al. removed only residues 1-3 from the enzyme, it retained 76% of its proteolytic activity. 19 Further, in apparent contradiction to the findings by Chen et al., 18 they found that the ∆(1-4) SARS-CoV Mpro was predominantly monomeric at concentrations of 0.1 mg/ml. Whatever the reason for the discrepancy might be, it could well be that while the presence and correct placement of the "N-finger" is important, the tip of the finger may not be as essential as thought hitherto. In fact, our MD simulations as well as the X-ray structure obtained from the monoclinic form at pH 7.6 also suggested that the active conformation of monomer A can be retained without direct interaction with residue Ser1 of monomer B, provided residues 3-7 are in the correct position.

In summary, from the various crystallographic studies on the coronavirus Mpro and from our MD simulations, we conclude that the enzyme is a very flexible protein, the conformational state of which appears to depend on the pH value of the medium. This is probably of biological significance, because the viral polyproteins (of which the Mpro is a domain before self-activation by autocleavage) assemble on the late endosome, where local pH tends to be acidic. For designing inhibitors of the Mpro, knowledge of the dynamics of the target will be essential. This has been clearly demonstrated in the case of HIV-1 proteinase over the years, where understanding the flexibility of the enzyme turned out to be a prerequisite not only for designing potent inhibitors but also for explaining many of the observed drug-resistance mutations. Today, HIV proteinase is perhaps the one enzyme best understood in terms of structure and dynamics, and this knowledge is mainly based on several hundred crystal structures, most of them complexes with various inhibitors. In terms of number of crystal structures, HIV-1 proteinase is probably followed by trypsin and lysozyme, but extrapolating from the current research activities, coronavirus main proteinases are likely to catch up.

The work described here was supported, in part, by the Sino-European Project on SARS Diagnostics and Antivirals (SEPSDA, contract no.

Molecular mechanisms of severe acute respiratory syndrome (SARS)
Mechanisms and enzymes involved in SARS coronavirus genome expression
Coronavirus genome: prediction of putative functional domains in the non-structural polyprotein by comparative amino acid analysis
Coronavirus main proteinase (3CLpro) structure: basis for design of anti-SARS drugs
Coronaviruses with Special Emphasis on First Insights Concerning SARS
Design of wide-spectrum inhibitors targeting coronavirus main proteinases
Structure of coronavirus main proteinase reveals combination of a chymotrypsin fold with an extra α-helical domain
The crystal structures of severe acute respiratory syndrome virus main protease and its complex with an inhibitor
pH-Dependent conformational flexibility of the SARS-CoV main proteinase
Mechanism of the maturation process of SARS-CoV 3CL protease
Quaternary structure of the severe acute respiratory syndrome (SARS) coronavirus main protease
Dissection study on the severe acute respiratory syndrome 3C-like protease reveals the critical role of the extra domain in dimerization of the enzyme: Defining the extra domain as a new target for design of highly specific protease inhibitors
Biosynthesis, purification, and substrate specificity of severe acute respiratory syndrome coronavirus 3C-like proteinase
Mutational analysis of the active centre of coronavirus 3C-like proteases
Virus-encoded proteinases and proteolytic processing in the Nidovirales
Stabilization of ion selectivity filter by pore loop ion pairs in an inwardly rectifying potassium channel
Severe acute respiratory syndrome coronavirus 3C-like proteinase N terminus is indispensable for proteolytic activity but not for enzyme dimerization: Biochemical and thermodynamic investigation in conjunction with molecular dynamics simulations
Critical assessment of important regions in the subunit association and catalytic action of the severe acute respiratory syndrome coronavirus main protease
A 3D model of SARS-CoV 3CL proteinase and its inhibitors design