key: cord-0874493-3xc9y5mo
authors: Zhong, Nan; Zhang, Shengnan; Xue, Fei; Kang, Xue; Zou, Peng; Chen, Jiaxuan; Liang, Chao; Rao, Zihe; Jin, Changwen; Lou, Zhiyong; Xia, Bin
title: C-terminal domain of SARS-CoV main protease can form a 3D domain-swapped dimer
date: 2009-01-01
journal: Protein Science
DOI: 10.1002/pro.76
sha: 5c8e478a1cae415e3aeaf8738b3bed3309856e23
doc_id: 874493
cord_uid: 3xc9y5mo

SARS coronavirus main protease (M(pro)) plays an essential role in the extensive proteolytic processing of the viral polyproteins (pp1a and pp1ab), and it is an important target for anti-SARS drug development. We have reported that both the M(pro) C-terminal domain alone (M(pro)-C) and the N-finger deletion mutant of M(pro) (M(pro)-Δ7) exist as both a stable dimer and a stable monomer (Zhong et al., J Virol 2008; 82:4227-4234). Here, we report structures of both the M(pro)-C monomer and dimer. The structure of the M(pro)-C monomer is almost identical to that of the C-terminal domain in the crystal structure of M(pro). Interestingly, the M(pro)-C dimer structure is characterized by 3D domain-swapping, in which the first helices of the two protomers are interchanged and each is enwrapped by four other helices from the other protomer. Each folding subunit of the M(pro)-C domain-swapped dimer still has the same general fold as that of the M(pro)-C monomer. This special mode of dimerization provides the structural basis for the observation that there is no exchange between the monomeric and dimeric forms of M(pro)-C and M(pro)-Δ7.

SARS coronavirus (SARS-CoV) was identified as the etiological agent of the pandemic transmissible disease, severe acute respiratory syndrome. [1-3] The 5′ two-thirds of the SARS-CoV genome encodes two overlapping polyproteins, pp1a (486 kDa) and pp1ab (790 kDa), which are proteolytically processed into 16 mature nonstructural proteins (nsp1-16) by two proteases contained within these two polyproteins.
These nonstructural proteins mediate viral replication and transcription. [4] The main protease of SARS-CoV (M(pro)) plays an important role in the extensive proteolytic processing of the viral polyproteins, which makes it essential for the viral life cycle and an attractive target for antiviral agent development. [5-7] The first structure of M(pro) was solved in 2003 and revealed a homodimer that is highly similar to previously reported structures of other coronavirus main proteases. [5, 8] It has been reported that M(pro) exists in solution as an equilibrium between monomeric and dimeric forms, [9] and that only the dimeric form is enzymatically active. [10] Shi et al. [11] have reported that the N-terminal portion of M(pro) alone is a monomer, whereas the M(pro) C-terminal domain alone (M(pro)-C) exists as a stable dimer. However, our previous studies demonstrated that M(pro)-C exists as both a stable monomer and a stable dimer in solution, as does the N-finger deletion mutant of M(pro) (M(pro)-Δ7), which can also form a stable dimer through dimerization of the C-terminal domain. [12]

Here, we report structures of the monomeric and dimeric forms of the C-terminal domain of M(pro) (M(pro)-C). The M(pro)-C monomer maintains the same fold as in the crystal structure of M(pro). In contrast, the M(pro)-C dimer has a novel structure characterized by 3D domain-swapping, which provides the structural basis for the stability of the dimer.

Solution structure of M(pro)-C monomer

We have obtained nearly complete backbone resonance assignments for the M(pro)-C monomer, except for residues F219 and E288, and more than 95% of side-chain resonances were assigned. The solution structures of the M(pro)-C monomer were calculated using interproton NOE-derived distance restraints together with dihedral angle and hydrogen bond restraints (Table I). The 20 structures with the lowest energies are shown in Figure 1(A), together with a ribbon diagram of the mean structure [Fig. 1(B)].
The Ramachandran plot indicates that the great majority of residues (99.7%) have their phi and psi angles in allowed regions, and only 0.3% are in disallowed regions. The root mean square deviation (RMSD) for backbone heavy atoms in secondary structure elements is 0.20 ± 0.05 Å, and that for all heavy atoms in secondary structure elements is 0.62 ± 0.08 Å (Table I). Together, these statistics indicate that the solution structure has been determined with good quality.

Like the C-terminal domain in full-length M(pro), the M(pro)-C monomer adopts a globular all-alpha fold consisting of five α-helices (α1 (T201-N214), α2 (L227-Y237), α3 (Q244-T257), α4 (V261-N274), and α5 (P293-Q299)), two well-defined loops (L2 (N238-T243) and L3 (G258-A260)), and two flexible loops (L1 (D216-T226) and L4 (G275-T292)). Helix α1 is enwrapped by helices α2, α3, α4, and α5, along with L4, which together form the hydrophobic core [Fig. 1(B)]. The mean structure of the M(pro)-C monomer overlaps well with the corresponding part of the structure of M(pro) (PDB code: 1UK2) except for loops L1 and L4, with an RMSD of 0.5 Å for backbone heavy atoms in secondary structure elements [Fig. 1(C)]. As loops L1 and L4 are relatively flexible in the solution structure, the structural difference for these two loops is not unexpected. We therefore conclude that the M(pro)-C monomer retains the same structure as the C-terminal domain of M(pro).

Crystal structure of M(pro)-C dimer

As the NMR data quality for the M(pro)-C dimer is very poor, we determined its crystal structure at a resolution of 2.4 Å. The crystal structure of the M(pro)-C dimer was determined by molecular replacement using the C-terminal 120 residues of the M(pro) crystal structure as the search model (Table II). The N-terminal 11 residues and the C-terminal 8 or 9 residues of the two protomers are not visible in the electron density map.
To our surprise, the structure of the M(pro)-C dimer is characterized by 3D domain-swapping, with the α1 helices of the two molecules interchanging their positions (Fig. 2). Each helix α1 is now surrounded by helices α2-α5 of the other molecule, and these five helices from two molecules adopt a compact globular all-alpha fold, the same as that of the M(pro)-C monomer, which is called a "folding subunit." Thus, the M(pro)-C dimer consists of two identical folding subunits, each reconstituted from chain fragments of both protomers. In other words, helix α1 and helices α2-α5 from one protomer are now located in different folding subunits of the M(pro)-C dimer, and the two protomers are related by a crystallographic twofold axis. The two folding subunits of the M(pro)-C dimer are linked by two hinge loops consisting of residues D216-T226, which form loop L1 in the structure of the M(pro)-C monomer. As a result, a new short 3₁₀ helix is formed by residues Trp218, Phe219, and Leu220 on both hinge loops. Consequently, helix α4, which is covered by loop L1 in the M(pro)-C monomer, is now exposed in both folding subunits of the M(pro)-C dimer [Fig. 2(D)].

According to the nomenclature of Liu et al., [13] an interface between domains of a domain-swapped oligomer that is also present in the monomer is termed the closed interface, and one found only in the oligomer is termed the open interface. The M(pro)-C dimer has an extensive closed interface of about 3000 Å², which involves the interchange of the α1 helices of both protomers and produces a tight entanglement of the two protomers. This provides the structural basis for the exceptional stability of the M(pro)-C dimer and the lack of exchange between the monomer and dimer. [12] However, there is no apparent interaction to define an open interface between the two hinge loops.
In addition, the electron densities of residues on the hinge loops are relatively weak and their main-chain B-factors are relatively high (>50 Å²), which suggests that the hinge loops linking the two folding subunits are rather flexible. Thus, we expect that the relative orientation of the two folding subunits in the M(pro)-C dimer is not fixed in solution.

We have reported that M(pro)-C exists in both monomeric and dimeric forms in solution, and here we have solved the structures of both forms. We found that the M(pro)-C monomer adopts the same conformation as the C-terminal domain in the crystal structure of full-length M(pro). However, the M(pro)-C dimer has an unusual structure: it is a 3D domain-swapped dimer. We have carried out denaturation-refolding experiments and found that both stable monomeric and dimeric forms can be regenerated by refolding either the M(pro)-C monomer or the dimer (data not shown). Also, considering that both the M(pro)-C monomer and dimer are produced in E. coli and that there is no exchange between the two, it seems that formation of the domain-swapped dimer does not depend on other components of the virus or of human cells, but is an intrinsic property of M(pro)-C. Furthermore, our previous NMR studies demonstrated that the two C-terminal domains of the M(pro)-Δ7 dimer have a dimerization pattern identical to that of the M(pro)-C dimer. [12] Thus, M(pro)-Δ7 can also form a domain-swapped dimer through domain-swapping of its C-terminal domains, which also explains why the M(pro)-Δ7 dimer is stable and does not exchange with the M(pro)-Δ7 monomer.

However, what puzzles us is that we failed to detect any domain-swapped dimer for the full-length M(pro) expressed in E. coli. Why does the full-length M(pro) not form a domain-swapped dimer? Could it be that the N-finger of M(pro) is able to interfere with and prevent the 3D domain-swapping dimerization of the C-terminal domain of M(pro)?
Is there any biological relevance for this 3D domain-swapping? It is clear that the N-termini of the two protomers in the crystal structure of the M(pro)-C dimer stretch out from the protein in opposite directions, and the relative orientation of the two folding subunits in the domain-swapped M(pro)-C dimer is different from that of the two C-terminal domains in the crystal structure of the full-length M(pro). Even if the full-length M(pro) could form a domain-swapped dimer, the two N-terminal domains of the dimer would be far apart, and the domain-swapped dimer could not adopt the active conformation. In other words, a domain-swapped dimer of the full-length M(pro) would not be active. But why would the virus retain within M(pro) a C-terminal domain that has the potential to deactivate the enzyme? These are the questions we will pursue in the future.

For M(pro)-C, the DNA fragment encoding residues 187-306 was cloned into the pET21a vector. Samples of the M(pro)-C monomer and dimer were prepared according to a previously published method. [12] All NMR samples were at a concentration of about 1 mM and were prepared in buffer containing 50 mM potassium phosphate (pH 7.0), 1 mM EDTA, 0.03% NaN₃, in 90% H₂O/10% D₂O, plus Complete, an EDTA-free Protease Inhibitor Cocktail (Roche, Germany). All NMR experiments were performed at 298 K on Bruker Avance 500 MHz (with cryoprobe), 600 MHz, and 800 MHz NMR spectrometers. Resonance assignments were obtained using standard methods. [14] Interproton nuclear Overhauser effects (NOEs) were used to generate the distance restraints. Dihedral angles were determined from backbone chemical shifts using TALOS. [15] Hydrogen bond restraints were generated from H-D exchange experiments in combination with the CSI secondary structure prediction. [16, 17] Structures were calculated and refined using the programs CYANA and AMBER. [18-20]

The M(pro)-C dimer sample was concentrated to 30 mg/mL in 20 mM Tris pH 7.0.
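As a rough cross-check of the sample concentrations quoted above, a mass concentration can be converted to molarity once a molecular mass is assumed. The sketch below is illustrative only: the average residue mass of 110 Da is a common rule of thumb and not a value from this work, the helper names are hypothetical, and any vector-encoded residues are ignored.

```python
# Rule-of-thumb average mass of one amino acid residue (assumption, not from the paper).
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_da(first_residue, last_residue,
                   residue_mass=AVERAGE_RESIDUE_MASS_DA):
    """Approximate mass (Da) of a construct spanning the given residue range."""
    n_residues = last_residue - first_residue + 1
    return n_residues * residue_mass


def mg_per_ml_to_millimolar(conc_mg_per_ml, mass_da):
    """Convert mg/mL to mM; mg/mL equals g/L, and g/L divided by g/mol gives mol/L."""
    return conc_mg_per_ml / mass_da * 1000.0


# Construct boundaries (residues 187-306) and 30 mg/mL are taken from the text.
protomer_mass = approx_mass_da(187, 306)
protomer_mM = mg_per_ml_to_millimolar(30.0, protomer_mass)
dimer_mM = protomer_mM / 2.0  # two protomers per dimer

print(f"approximate protomer mass: {protomer_mass:.0f} Da")
print(f"protomer concentration: {protomer_mM:.2f} mM")
print(f"dimer concentration: {dimer_mM:.2f} mM")
```

Under these assumptions, 30 mg/mL of the 120-residue construct corresponds to roughly 2.3 mM protomer, or about 1.1 mM dimer. An exact figure would require the sequence-derived molecular mass.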
Crystallization was performed by the sitting-drop vapor-diffusion method at 289 K in 48-well plates. The crystals selected for diffraction studies grew in 0.2 M sodium chloride, 0.1 M BIS-TRIS (pH 5.5), 25% w/v PEG 3350. A 2.4 Å resolution diffraction data set was collected at 100 K from a single M(pro)-C dimer crystal using an in-house Rigaku MM-007 generator and an R-Axis VI++ detector. The beam was focused by an Osmic mirror. A total of 270 frames of data were collected. Processing of diffraction images and scaling of the integrated intensities were performed using the HKL2000 software package. [21] Initial phases were obtained by molecular replacement with PHASER [22] using the crystal structure of SARS-CoV main protease (PDB code: 2H2Z, excluding the N-terminal 186 residues) as the search model. The final manual rebuilding and refinement were performed in COOT, [23] Refmac, [24] and CNS, guided by the 2Fo-Fc and 1Fo-Fc density maps.

The solution and crystal structures were analyzed using the program package PROCHECK, [25] and figures were created using MOLMOL. [26]

Coordinates and structure factors have been deposited at the PDB under accession code 3EBN for the crystal structure of the M(pro)-C dimer and 2K7X for the solution structure of the M(pro)-C monomer.
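The RMSD values reported for the NMR ensemble are root mean square deviations over matched atoms. A minimal sketch of that calculation is given below, assuming the models have already been superimposed and that the spread is taken as a population standard deviation about the coordinate-averaged structure. The function names are hypothetical, and this is not the procedure implemented in MOLMOL or any other program cited here.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) atom positions.

    Assumes the two coordinate sets are already superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def mean_structure(models):
    """Coordinate-averaged structure over a list of models."""
    n_models = len(models)
    n_atoms = len(models[0])
    return [
        tuple(sum(model[i][k] for model in models) / n_models for k in range(3))
        for i in range(n_atoms)
    ]


def ensemble_rmsd(models):
    """Mean and population SD of each model's RMSD to the mean structure."""
    reference = mean_structure(models)
    values = [rmsd(model, reference) for model in models]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


# Toy two-atom "ensemble" with made-up coordinates, for illustration only.
toy_models = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    [(0.2, 0.0, 0.0), (1.5, 0.2, 0.0)],
    [(0.0, 0.2, 0.0), (1.7, 0.0, 0.0)],
]
mean_value, sd_value = ensemble_rmsd(toy_models)
print(f"ensemble RMSD: {mean_value:.2f} +/- {sd_value:.2f} (arbitrary units)")
```

In practice the atom selection would be restricted, for example to backbone heavy atoms in secondary structure elements, and a least-squares superposition would precede the RMSD calculation.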
1. Coronavirus in severe acute respiratory syndrome (SARS)
2. Newly discovered coronavirus as the primary cause of severe acute respiratory syndrome
3. A novel coronavirus and SARS
4. Unique and conserved features of genome and proteome of SARS-coronavirus, an early split-off from the coronavirus group 2 lineage
5. Coronavirus main proteinase (3CLpro) structure: basis for design of anti-SARS drugs
6. The crystal structures of severe acute respiratory syndrome virus main protease and its complex with an inhibitor
7. Design of wide-spectrum inhibitors targeting coronavirus main proteases
8. Structure of coronavirus main proteinase reveals combination of a chymotrypsin fold with an extra alpha-helical domain
9. SARS CoV main proteinase: the monomer-dimer equilibrium dissociation constant
10. Biosynthesis, purification, and substrate specificity of severe acute respiratory syndrome coronavirus 3C-like proteinase
11. Dissection study on the severe acute respiratory syndrome 3C-like protease reveals the critical role of the extra domain in dimerization of the enzyme: defining the extra domain as a new target for design of highly specific protease inhibitors
12. Without its N-finger, the main protease of severe acute respiratory syndrome coronavirus can form a novel dimer through its C-terminal domain
13. A domain-swapped RNase A dimer with implications for amyloid formation
14. Multidimensional nuclear-magnetic-resonance methods for protein studies
15. Protein backbone angle restraints from searching a database for chemical shift and sequence homology
16. The chemical-shift index: a fast and simple method for the assignment of protein secondary structure through NMR spectroscopy
17. The C-13 chemical-shift index: a simple method for the identification of protein secondary structure using C-13 chemical-shift data
18. Torsion angle dynamics for NMR structure calculation with the new program DYANA
19. Protein NMR structure determination with automated NOE assignment using the new software CANDID and the torsion angle dynamics algorithm DYANA
20. The Amber biomolecular simulation programs
21. Processing of X-ray diffraction data collected in oscillation mode
22. Phaser crystallographic software
23. Coot: model-building tools for molecular graphics
24. Refinement of macromolecular structures by the maximum-likelihood method
25. PROCHECK: a program to check the stereochemical quality of protein structures
26. MOLMOL: a program for display and analysis of macromolecular structures

All NMR experiments were carried out at the Beijing NMR Center.