key: cord-0994568-zzxjv666 authors: Campanacci, Valérie; Egloff, Marie‐Pierre; Longhi, Sonia; Ferron, François; Rancurel, Corinne; Salomoni, Aurelia; Durousseau, Cécile; Tocque, Fabienne; Brémond, Nicolas; Dobbe, Jessika C.; Snijder, Eric J.; Canard, Bruno; Cambillau, Christian title: Structural genomics of the SARS coronavirus: cloning, expression, crystallization and preliminary crystallographic study of the Nsp9 protein date: 2004-06-07 journal: Acta Crystallogr D Biol Crystallogr DOI: 10.1107/s0907444903016779 sha: afc29ba38f925a7d3ed22338b5a9b34f4b30a49c doc_id: 994568 cord_uid: zzxjv666 The aetiologic agent of the recent epidemics of Severe Acute Respiratory Syndrome (SARS) is a positive‐stranded RNA virus (SARS‐CoV) belonging to the Coronaviridae family and its genome differs substantially from those of other known coronaviruses. SARS‐CoV is transmissible mainly by the respiratory route and to date there is no vaccine and no prophylactic or therapeutic treatments against this agent. A SARS‐CoV whole‐genome approach has been developed aimed at determining the crystal structure of all of its proteins or domains. These studies are expected to greatly facilitate drug design. The genomes of coronaviruses are between 27 and 31.5 kbp in length, the largest of the known RNA viruses, and encode 20–30 mature proteins. The functions of many of these polypeptides, including the Nsp9–Nsp10 replicase‐cleavage products, are still unknown. Here, the cloning, Escherichia coli expression, purification and crystallization of the SARS‐CoV Nsp9 protein, the first SARS‐CoV protein to be crystallized, are reported. Nsp9 crystals diffract to 2.8 Å resolution and belong to space group P6₁22 or P6₅22, with unit‐cell parameters a = b = 89.7, c = 136.7 Å. With two molecules in the asymmetric unit, the solvent content is 60% (V_M = 3.1 Å³ Da⁻¹).
The recent epidemics of Severe Acute Respiratory Syndrome (SARS) represent a real paradigm for emerging viral pathogens, as well as an example of worldwide coordinated efforts to control a serious viral outbreak and a test of the reaction time of the scientific community. The first cases of Severe Acute Respiratory Syndrome originated from the Guangdong province in South East China. The number of cases reported and our knowledge regarding this illness are still evolving, but a number of basic facts have been firmly established. The aetiologic agent of SARS is a positive-stranded RNA virus belonging to the Coronaviridae family and its genome differs substantially from those of previously identified coronaviruses, including two other human coronaviruses (Peiris et al., 2003; Ksiazek et al., 2003; Drosten et al., 2003; Snijder et al., 2003). The virus, for which the name SARS-CoV is now accepted, is mainly transmitted by the respiratory route. However, evidence for a secondary faeco–oral route of transmission has also been presented. The viral strain probably primarily infected wild animals traded in Asian markets and crossed the species barrier to infect humans. There is to date no vaccine and no prophylactic or therapeutic treatment against this agent. A prophylactic treatment would have been useful to combat the epidemics; the only effective measure available to prevent the spread of the virus is to quarantine all persons who have been exposed to SARS-CoV. The number of antiviral molecules that can be used to treat patients infected by RNA viruses is very low. Accordingly, it is important to search for efficient antiviral drugs for a large number of RNA viruses, while giving priority to viruses transmitted by the respiratory route because they have the highest potential for causing pandemic outbreaks.
The scientific community has reacted promptly and efficiently to identify and characterize this new infectious agent, as well as to develop methods for SARS-CoV detection and containment protocols. In the meantime, a wide effort is being made to design drugs active against SARS-CoV. Ribavirin has been used in the absence of other candidates, but its intrinsic efficiency against SARS-CoV appears to be low (Koren et al., 2003). To select drugs active against a viral pathogen, one usually relies on screening candidate drugs for their efficacy in virus-infected cell cultures and/or animal models. However, during the current research on drugs for treating hepatitis C virus (HCV) infections, a novel and promising approach has been introduced. The RNA-dependent RNA polymerase of HCV has been purified and crystallized and enzymatic tests have been used to find potent nucleoside and non-nucleoside inhibitors of the virus, the structure–activity relationships of which allow further testing and clinical developments (de Francesco et al., 2003). This approach is gaining momentum owing to the concomitant increase in the power of new technologies. Among those, genomics approaches are being conducted to solve the crystal structures of large sets of clinically relevant proteins, which will become the subjects of future structure–function relationship studies. A crystal structure has not yet been determined for any of the 28 predicted mature SARS-CoV proteins. The crystal structure of the main (or 3CL) protease of transmissible gastroenteritis virus, a related coronavirus, has been determined and was used to construct a model of the SARS-CoV 3CL protease, facilitating future drug design against this important target (Anand et al., 2003). The putative coronavirus RNA-dependent RNA polymerase has been purified, but is inactive in vitro (Grotzinger et al., 1996).
In this context, we have developed a SARS-CoV whole-genome approach aimed at determining the crystal structure of all SARS-CoV proteins. We anticipate that this will greatly facilitate drug design as well as the study of many other aspects related to the biology of these complex viruses. Coronaviruses are enveloped viruses with a single-stranded RNA genome of positive polarity (Lai & Holmes, 2001). Their genome is between 27 and 31.5 kbp in length, the largest of the known RNA viruses. Like other coronaviruses, the SARS-CoV genome is known to encode two large replicase polyproteins (the ORF1a and ORF1ab proteins), which are processed into a set of mature non-structural proteins (Nsps) by internal viral proteases (Snijder et al., 2003). The functions of many of these products, such as the Nsp9–Nsp10 polypeptides produced from the C-terminal domain of the ORF1a-encoded polyprotein, are still unknown. In the related mouse hepatitis virus, a group 2 coronavirus, the equivalent of SARS-CoV Nsp9 is a 12 kDa cleavage product (P1a-12) that is found preferentially in the perinuclear region of infected cells, where it co-localizes with other components of the viral replication complex (Bost et al., 2000). No clues to the function of the Nsp9 equivalent of any coronavirus have been obtained thus far. Here, we report the cloning, expression, purification and crystallization of the SARS-CoV Nsp9 protein, a 113-residue protein (Fig. 1), which is the first SARS-CoV protein to be crystallized.
Vero cells were infected with SARS-CoV (Frankfurt-1 strain; NCBI Accession No. AY291315; Drosten et al., 2003) at a multiplicity of infection of 0.01. At the onset of the cytopathogenic effect (approximately 40 h post-infection), intracellular RNA was isolated by cell lysis for 10 min at room temperature with 5% lithium dodecyl sulfate in LET buffer (100 mM LiCl, 1 mM EDTA, 10 mM Tris–HCl pH 7.4) containing 20 mg ml⁻¹ of proteinase K.
After shearing of the cellular DNA using a syringe, lysates were incubated at 315 K for 15 min, extracted with phenol (pH 4.0) and chloroform and the RNA was ethanol-precipitated. cDNA was obtained by reverse transcription with Thermoscript reverse transcriptase (Invitrogen) using primer SAV009 (5′-GGACAGCAACCGCTGGACAATC-3′), which is complementary to nucleotides 13644–13665 of the Frankfurt-1 genome. The SARS-CoV Nsp9-coding sequence was amplified by PCR from the cDNA prepared above using two primers containing the attB sites of the Gateway recombination system (Invitrogen [...]
Cultures were grown at 310 K until OD₆₀₀ reached 0.6 and were then stored for 2 h on ice; 2% ethanol was added for the induction of stress chaperones (Gong & Shuman, 2002). Expression was induced by adding 50 mM IPTG and cells were incubated for 16 h at 290 K. Cells were collected by centrifugation and the bacterial pellets were resuspended and frozen in 50 mM Tris–HCl, 150 mM NaCl, 10 mM imidazole pH 8.0. Cellular suspensions were thawed with 0.25 mg ml⁻¹ lysozyme, 0.1 mg ml⁻¹ DNase and 20 mM MgSO₄ and were centrifuged at 12 000g. The supernatant was applied onto an Ni-affinity column connected to an FPLC system (Amersham Pharmacia Biotech). The protein was eluted with 50 mM Tris–HCl, 150 mM NaCl, 250 mM imidazole pH 8.0 and then applied onto a preparative Superdex 200 gel-filtration column pre-equilibrated in 10 mM Tris–HCl, 300 mM NaCl pH 8.0.
The recombinant protein was characterized by N-terminal sequencing, mass spectroscopy, dynamic light scattering (DLS) and circular dichroism (CD). DLS was performed with a Dynapro Microsampler (Protein Solutions) using a protein solution at 5.8 mg ml⁻¹ in 10 mM Tris–HCl, 300 mM NaCl pH 8.0. The CD spectrum of the final purified product was recorded between 185 and 260 nm on a JASCO J810 spectrometer using a protein solution at 0.1 mg ml⁻¹ in sodium phosphate buffer pH 7.0 containing 25 mM NaCl.
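The coordinates quoted for the reverse-transcription primer SAV009 can be sanity-checked with a few lines of Python. This is only an illustrative sketch: the function and variable names are hypothetical, and the script checks internal consistency (primer length against the stated genome span) rather than comparing against the AY291315 sequence itself, which is not reproduced here.

```python
# Minimal consistency check for primer SAV009 as quoted in the text.
# Names below are illustrative and not taken from the original study.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


# Primer sequence and genome coordinates as given in the text.
SAV009 = "GGACAGCAACCGCTGGACAATC"
TARGET_START, TARGET_END = 13644, 13665

# Number of nucleotides covered by the stated coordinates (inclusive).
target_span = TARGET_END - TARGET_START + 1

# Because the primer is complementary to the genome, the plus-strand
# sequence expected at 13644-13665 is the primer's reverse complement.
expected_genome_segment = reverse_complement(SAV009)

print(f"primer length : {len(SAV009)} nt")
print(f"target span   : {target_span} nt")
print(f"expected (+)  : 5'-{expected_genome_segment}-3'")
```

The primer length and the coordinate span agree, which supports reading the hyphen in the extracted primer sequence as a line-break artefact rather than part of the sequence.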
Crystallization screening was performed by vapour diffusion with nanodrops using a Cartesian robot as described previously (Sulzenbacher et al., 2002; Vincentelli et al., 2003). Briefly, three commercial kits were used: Wizard Screens 1 and 2 (Emerald BioStructures), Structure Screens 1 and 2 and Stura Footprint screen (Molecular Dimensions Ltd). The crystals were obtained in 2.0 M ammonium sulfate, 0.1 M phosphate–citrate pH 4.2 with a protein concentration of 5.8 mg ml⁻¹ in the gel-filtration buffer. The optimization of the crystallogenesis was performed with nanodrops in a two-dimensional matrix (Lartigue et al., 2003) with a precipitant range of 1.8–2.2 M ammonium sulfate and a pH range of 4.0–4.5 (0.1 M phosphate–citrate), leading to a crystal size of ~100 × 100 × 80 µm (Fig. 2). The crystals were cryocooled in a pure solution of silicone oil DC200. They were exposed at beamline ID14-EH1, ESRF, Grenoble using a Quantum ADSC Q4R detector. A total of 110 oscillations of 1° each were recorded with a crystal-to-detector distance of 180 mm and a collection time of 9 s per frame. Diffraction data were integrated with DENZO (Otwinowski & Minor, 1997) and were reduced with SCALA (Collaborative Computational Project, Number 4, 1994).
We have subcloned 35 SARS-CoV targets in the Gateway system, including 20 full-length proteins and 15 protein domains. To date, 70 constructs have been generated, of which 28 were expressed, 14 were soluble and five were purified. Four of them led to small crystals, among which were those of the Nsp9 protein described in this report. Expression of selenomethionine-substituted Nsp9 was performed using the method of methionine-biosynthesis pathway inhibition (Doublié, 1997). Purification of the selenomethionine protein was performed as described above and crystal optimization is under way.
Nsp9 crystals diffract to 2.8 Å at ID14-EH1 (ESRF, Grenoble). Data integration and reduction indicate that they belong to the P622 space group.
R_sym is 5.3%, an excellent value considering the redundancy of the data (Table 1). Reflections are observed at multiples of six along the c axis (00l), indicating that the space group is either P6₁22 or its enantiomorph P6₅22. The unit-cell parameters are a = b = 89.7, c = 136.7 Å, which lead to a V_M value of 3.1 Å³ Da⁻¹ (60% solvent) with two molecules in the asymmetric unit (Matthews, 1968). The observed distribution of centric or acentric intensities overlaps with the theoretical curve, an indication that merohedral twinning, a feature that is often observed in trigonal or hexagonal crystals, is not present.
SARS-CoV Nsp9 has been purified to homogeneity in two steps. The identity of the final product has been confirmed by N-terminal sequencing. The oligomeric status of Nsp9 has been checked using gel filtration and DLS. The former technique indicates that the protein is monomeric, while the DLS analysis is consistent with a monodisperse species with an apparent Stokes radius of 26 Å and an equivalent mass of 31 kDa, which corresponds to a dimer. This discrepancy might be related to the concentration differences between the two techniques.
A PSI-Blast search retrieved seven homologous sequences, all belonging to members of the Coronaviridae family. They were aligned using MULTALIGN (Corpet, 1988) with standard options. The consensus of the secondary-structure predictions obtained with JPRED (Cuff et al., 1998), PSI-PRED (McGuffin et al., 2000) and PREDICT PROTEIN (Rost, 1996) converges to a fold of seven β-strands. A fold-recognition analysis was performed with the threading programs 3D-PSSM (Kelley et al., 2000) and INBGU (Fischer, 2000). Both programs fail to detect any protein homologue to Nsp9, but converge to a fold of two seven-stranded β-sheets. In agreement, the CD spectrum of purified Nsp9 reveals a structured protein formed by a majority of β-strands (35%) and β-turns (18%), but which also contains 15% α-helix.
Random-coil segments account for 32% of the total.
The SARS-CoV Nsp9 protein expressed in E. coli was readily crystallized using the nanodrop screening (Sulzenbacher et al., 2002) and optimization (Lartigue et al., 2003) approaches. Crystals diffract to 2.8 Å resolution and are amenable to structure determination using SeMet substitution and MAD methods (Hendrickson, 1991) at synchrotrons.
This study was funded by the SPINE project of the European Union 6th PCRDT (QLRT-2001-00988), by the French Genopole programme and by the Conseil Général of the Bouches-du-Rhône. We thank H. W. Doerr and H. Rabenau (Institute for Medical Virology, Johann Wolfgang Goethe University, Frankfurt-am-Main, Germany) for providing us with the virus and P. Bredenbeek, S. Gorbalenya and W. Spaan for technical assistance and helpful discussions/suggestions.
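As a cross-check of the reported crystal parameters, the Matthews coefficient and solvent content can be recomputed from the unit cell. The sketch below uses the cell dimensions, the space group (P6₁22 or P6₅22, both with 12 general positions) and the two molecules per asymmetric unit given in the text. The monomer mass of 12.8 kDa is an assumption made for illustration only, since the mass of the expressed construct is not stated here, and the function names are hypothetical. The solvent fraction uses the usual approximation 1 − 1.23/V_M.

```python
import math


def hexagonal_cell_volume(a: float, c: float) -> float:
    """Volume (Å^3) of a hexagonal cell: a = b, alpha = beta = 90°, gamma = 120°."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(cell_volume: float, n_sym_ops: int,
                         n_mol_asu: int, mol_mass_da: float) -> float:
    """V_M (Å^3 per Da): cell volume divided by total protein mass in the cell."""
    return cell_volume / (n_sym_ops * n_mol_asu * mol_mass_da)


def solvent_fraction(v_m: float) -> float:
    """Approximate solvent fraction from V_M using the standard 1.23 constant."""
    return 1.0 - 1.23 / v_m


# Values taken from the text.
A_AXIS, C_AXIS = 89.7, 136.7   # unit-cell lengths in Å
N_SYM_OPS = 12                 # general positions in P6(1)22 / P6(5)22
N_MOL_ASU = 2                  # molecules per asymmetric unit

# ASSUMPTION: monomer mass of the crystallized construct (not given in the text).
MONOMER_MASS_DA = 12_800.0

volume = hexagonal_cell_volume(A_AXIS, C_AXIS)
v_m = matthews_coefficient(volume, N_SYM_OPS, N_MOL_ASU, MONOMER_MASS_DA)
solvent = solvent_fraction(v_m)

print(f"cell volume      : {volume:,.0f} Å^3")
print(f"V_M              : {v_m:.2f} Å^3/Da")
print(f"solvent fraction : {solvent:.0%}")
```

Under the assumed monomer mass, the result is consistent with the reported V_M of 3.1 Å³ Da⁻¹ and 60% solvent. A single molecule per asymmetric unit would double V_M, which is why two molecules is the more plausible packing.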