key: cord-0972864-aiy8clpy
authors: Zuo, Xun; Mattern, Michael R.; Tan, Robin; Li, Shuisen; Hall, John; Sterner, David E.; Shoo, Joshua; Tran, Hiep; Lim, Peter; Sarafianos, Stefan G.; Kazi, Lubna; Navas-Martin, Sonia; Weiss, Susan R.; Butt, Tauseef R.
title: Expression and purification of SARS coronavirus proteins using SUMO-fusions
date: 2005-02-23
journal: Protein Expr Purif
DOI: 10.1016/j.pep.2005.02.004
sha: e27bfcbf7f7a6c0d0e30aa945af857c240f6cd6b
doc_id: 972864
cord_uid: aiy8clpy
Severe acute respiratory syndrome coronavirus (SARS-CoV) proteins belong to a large group of proteins that is difficult to express in traditional expression systems. The ability to express and purify SARS-CoV proteins in large quantities is critical for basic research and for development of pharmaceutical agents. The work reported here demonstrates: (1) fusion of SUMO (small ubiquitin-related modifier), a 100 amino acid polypeptide, to the N-termini of SARS-CoV proteins dramatically enhances expression in Escherichia coli cells and (2) 6× His-tagged SUMO-fusions facilitate rapid purification of the viral proteins on a large scale. We have exploited the natural chaperoning properties of SUMO to develop an expression system suitable for proteins that cannot be expressed by traditional methodologies. A unique feature of the system is the SUMO tag, which enhances expression, facilitates purification, and can be efficiently cleaved by a SUMO-specific protease to generate native protein with a desired N-terminus. We have purified various SARS-CoV proteins under either native or denaturing conditions. These purified proteins have been used to generate highly specific polyclonal antibodies. Our study suggests that the SUMO-fusion technology will be useful for enhancing expression and purification of the viral proteins for structural and functional studies as well as for therapeutic uses.
Severe acute respiratory syndrome (SARS) is a respiratory illness that has only recently been reported in Asia, North America, and Europe. After the first case of the disease in humans was found in Southern China in late 2002, the outbreak spread quickly to about 35 countries on five continents, resulting in more than 8000 cases and 800 deaths. At present, there is no efficacious treatment regimen for SARS. The need for both a reliable diagnostic assay and a therapeutic agent (antiviral or vaccine) is obvious. A previously unknown coronavirus has been identified as the causative agent of SARS. Scientists at the CDC and other laboratories determined the genomic sequence of this coronavirus and named it SARS-CoV [1-3]. Coronavirus, a genus within the family Coronaviridae, contains a group of large, positive-stranded, enveloped, pathogenic RNA viruses that infect many species of animals, including humans. They cause respiratory, enteric, and central nervous system diseases [4]. The genomic sequence of SARS-CoV provides important information for the development of diagnostic tests and vaccines. This information affords the opportunity to express any SARS-CoV protein of choice for recombinant subunit vaccines. Development of protein-based diagnostic and therapeutic methods would be greatly facilitated by the ability to produce viral proteins of high quality in tractable amounts, which requires protein engineering, expression, and purification. Six proteins of SARS-CoV, namely Spike (S), Nucleocapsid (Nc), Envelope (E), SARS polymerase (RdRp), SARS protease (3CL), and membrane (M), have become the focus of efforts to produce antiviral agents and vaccines against SARS.
The SARS-CoV proteins investigated in this study are described briefly below. SARS-CoV 3CL protease (3CL, 3CLpro, or Mpro) is the principal coronavirus protease utilized by the virus to process its replicase proteins into mature forms.
The full-length 3CL has 306 amino acids (molecular weight ~33.8 kDa). The protease cleaves the replicase polyproteins (pp1a and pp1ab) to generate RNA-dependent polymerase (RdRp), 3CL, and helicase, all crucial for viral replication [5-7]. Therefore, 3CL represents an attractive target for the design and discovery of coronavirus antiviral agents, as does the polymerase [8].
SARS-CoV Nucleocapsid protein (N or Nc) is a phosphoprotein containing 423 amino acids (molecular weight ~46 kDa) [9]. Large quantities of the protein are translated on free polysomes in the cytoplasm, where some molecules are rapidly phosphorylated. It is known that the protein binds the viral RNA and forms the nucleocapsid, but its exact mechanisms and role in replication are not yet clear. The Nc protein is known to have B and T cell epitopes and to elicit host protective immune responses [10, 11].
Spike protein (S or Spk) is a glycoprotein containing 1255 amino acids [12]. Upon translation, it is inserted into the rough endoplasmic reticulum and glycosylated with N-linked glycans [13]. Some of the proteins accumulate in the Golgi apparatus, and a fraction of oligomeric spike protein is transported to the membrane, where it mediates cell-cell fusion. Like those of other coronaviruses, the SARS-CoV spike protein likely contains many of the neutralizing antibody epitopes as well as T cell epitopes [14].
A supply of purified SARS-CoV proteins would be valuable for both clinical and investigational purposes. Although several strategies have been developed over the years to express heterologous recombinant proteins in bacterial, yeast, mammalian, and insect cells, the expression of heterologous genes in bacteria is by far the simplest and most inexpensive means available for research or commercial purposes. However, heterologous gene products often fail to attain their correct three-dimensional (3-D) conformation, or are simply expressed very poorly in Escherichia coli.
Selection of ORFs for structural genomics projects has shown that only ~20% of all heterologous genes expressed in E. coli render soluble or correctly folded proteins [15, 16]. Several gene-fusion systems, such as NusA, maltose binding protein (MBP), glutathione-S-transferase (GST), ubiquitin (UB), and thioredoxin (Trx), have been developed [17, 18]. All of these conventional methods have shortcomings, primarily inefficient expression and/or inconsistent cleavage.
Small ubiquitin-related modifier (SUMO) is a ubiquitin-related protein that functions by covalent attachment to other proteins. SUMO and its associated enzymes are present in all eukaryotes and are highly conserved from yeast to humans [19-21]. SUMO has 18% sequence identity with ubiquitin [22]. The yeast Saccharomyces cerevisiae has only a single SUMO gene (SMT3) that is essential for viability [20]. In contrast to yeast SMT3, three members of SUMO have been described in vertebrates: SUMO-1, SUMO-2, and SUMO-3. Human SUMO-1, a 101 amino acid polypeptide, shares 50% sequence identity with human SUMO-2/SUMO-3 [23], which are close homologues. Yeast SUMO shares 47% sequence identity with mammalian SUMO-1. Although overall sequence identity between ubiquitin and SUMO is only 18%, structure determination by NMR reveals that they share a common three-dimensional structure characterized by a tightly packed globular fold with β-sheets wrapped around a single α-helix [24, 25]. It is known that SUMO, fused at the N-terminus with other proteins, can fold and protect the protein by its chaperoning properties, making it a useful tag for heterologous expression [26]. All SUMO genes encode precursor proteins with a short C-terminal sequence that extends from the conserved C-terminal Gly-Gly motif. SUMO proteases remove SUMO from proteins, by cleaving the C-termini of SUMO (-GGATY) in yeast to the mature form (-GG) or deconjugating it from lysine side chains [27, 28].
The former activity (protease) is useful for removal of SUMO as an expression tag. There are 2 SUMO proteases in yeast [27, 28] and at least 6 in humans, the human enzymes ranging from 238 to 1112 amino acid residues [22, 29-31]. We have developed a novel SUMO-fusion system that provides increased levels of expression of heterologous proteins in E. coli and allows rapid purification of proteins of interest [26, 32]. We report here the application of SUMO-fusion technology to the expression and purification of major SARS-CoV proteins. SARS-CoV 3CL Protease (3CL), SARS-CoV Nucleocapsid (Nc), and SARS-CoV Spike C-terminal fragment protein (Spk C) were fused with SUMO and expressed in E. coli.
For expression of the proteins, SARS-CoV cDNA was derived from infected cell RNA, provided by the CDC, Atlanta, to S.R.W. (University of Pennsylvania). Expression constructs encoding the SUMO-fusion proteins all utilized the pSUMO plasmid (LifeSensors, Malvern, PA) as the backbone. pSUMO, a pET24 derivative carrying the SMT3 gene of S. cerevisiae, which encodes the yeast SUMO protein, has been described previously [26]. It contains an N-terminal hexahistidine (6× His) tag, introduced by PCR into the SUMO coding sequence, as well as a unique BsaI site at the C-terminus. The cloning strategy to express fusion proteins employed this BsaI site to insert the SARS-CoV protein coding sequences in frame with SUMO. PCR primers (Table 1) incorporating this site or Esp3I were used to amplify the SARS-CoV coding sequences from cDNA clones carried in pTOPO vectors. The 3′ primers carried a BamHI site for insertion into the multiple cloning site of pET24d. The primer pairs used to PCR amplify the SARS-CoV protein genes are listed in Table 1.
Because of its large size, Spike protein was designed as two half-molecules, the S1 (N-terminal fragment, amino acids 1-667, Spk N) and S2 (C-terminal fragment, amino acids 668-1193, Spk C) domains, and Spk C was tested for expression and purification in this study. For PCR amplification of the genes of interest, a proofreading polymerase was used (Platinum Taq, Invitrogen, Carlsbad, CA). PCR fragments were subcloned into pET24-6× His-SUMO or pET24-6× His (a parallel vector that does not carry the SUMO sequence) to produce parallel sets of constructs encoding 6× His-SUMO and 6× His fused versions of the proteins of interest. All plasmids were routinely sequenced.
To test and compare expression of the SARS proteins, a single colony of the E. coli strain BL21 (DE3) containing each of the plasmids described above was inoculated into 5 ml of either Luria-Bertani (LB) or M9 minimal (MM) media. The antibiotic kanamycin was also included at 30 µg/ml in all media. The cells were grown at 37°C overnight with shaking at 250 rpm. The next morning the overnight culture was transferred into 50 ml fresh medium to permit exponential growth. When the OD600 value reached ~0.6-0.7, protein expression was induced by addition of 1 mM IPTG (isopropyl-β-D-thiogalactopyranoside), followed by prolonged growth at either 37 or 20°C to determine optimal induction conditions. For protein purification, cultures were scaled up to 0.5-1.0 L LB medium.
Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) was used to verify expression of the protein. Briefly, 1.5 ml samples of culture were removed just before expression was induced and after induction, and cells were collected by centrifugation at 6000 rpm for 5 min. The cell pellets were suspended in 50 µl of distilled water, and the samples were freeze-thawed once to facilitate disruption of the cells. The cell suspensions were treated with RNase and DNase (both at 40 µg/ml) to digest nucleic acids.
After mixing with SDS-PAGE sample buffer containing SDS and β-mercaptoethanol, samples were heated at 95°C for 5 min to facilitate denaturation and reduction of proteins. Proteins were detected using SDS-polyacrylamide gels with Tris-glycine running buffer and Coomassie blue staining.
Proteins separated by SDS-PAGE were transferred onto nitrocellulose membranes at 42 V (~150 mA) for 2.5 h. Membranes were then incubated with 30 ml of TTBS buffer (pH 8.0), containing 5% nonfat dry milk for 1 h at room temperature. The expressed proteins were probed with either monoclonal anti-His-tag or polyclonal antibodies obtained from rabbits immunized against individual SUMO-SARS-CoV-fusion proteins (Rockland Immunochemicals) by incubating overnight at 4°C with 1:1000 dilution of the primary antibodies. After the membranes were washed with TTBS buffer for 5 min, they were incubated with a secondary antibody (Peroxidase-conjugated goat anti-rabbit IgG, Rockland Immunochemicals, diluted 1000-fold) for 45 min. The membranes were finally washed with TTBS for 10 min before the chemiluminescent Western blot substrates were applied (Roche, Mannheim, Germany), and visualized on films (Kodak BioMax).
Table 1 PCR primers for amplifying the SARS-CoV protein genes. Restriction enzyme recognition sites used for cloning are indicated in uppercase letters. Owing to the presence of a BsaI site within the 3CL protease coding sequences, a different restriction enzyme, Esp3I, was used at the 5′ end to join the BsaI site in the pET-SUMO vector.
Spike protein, C-terminal fragment (AA 668-1193): 5′ primer tttGGTCTCaaggtatgagtactagccaaaaatctattgtggc; 3′ primer cgcGGATCCtcatttaatatattgctcatattttc
Nucleocapsid protein (entire gene): 5′ primer tttGGTCTCaaggtatgtctgataatggaccccaatc; 3′ primer cgcGGATCCtcatgcctgagttgaatcagcag
3CL protease (entire gene): 5′ primer tttCGTCTCaaggtagtggttttaggaaaatggcattcccg; 3′ primer cgcGGATCCtcattggaaggtaacaccagagc
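The primer design in Table 1 can be sanity-checked with a short script. The sketch below is illustrative only and is not part of the original methods: it scans each primer, on the given strand only, for the recognition sequences of BsaI (GGTCTC), Esp3I (CGTCTC), and BamHI (GGATCC). The dictionary keys and the function name `find_sites` are hypothetical labels introduced for this example.

```python
# Recognition sequences (5'->3') of the enzymes named in Table 1.
SITES = {"BsaI": "GGTCTC", "Esp3I": "CGTCTC", "BamHI": "GGATCC"}

# Primer sequences copied from Table 1; the key names are illustrative labels.
PRIMERS = {
    "SpkC_5p": "tttGGTCTCaaggtatgagtactagccaaaaatctattgtggc",
    "SpkC_3p": "cgcGGATCCtcatttaatatattgctcatattttc",
    "Nc_5p": "tttGGTCTCaaggtatgtctgataatggaccccaatc",
    "Nc_3p": "cgcGGATCCtcatgcctgagttgaatcagcag",
    "3CL_5p": "tttCGTCTCaaggtagtggttttaggaaaatggcattcccg",
    "3CL_3p": "cgcGGATCCtcattggaaggtaacaccagagc",
}


def find_sites(primer):
    """Return {enzyme: 0-based position} for each site found on the given strand."""
    seq = primer.upper()
    return {enzyme: seq.find(site) for enzyme, site in SITES.items() if site in seq}


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(f"{name}: {find_sites(primer)}")
```

Run this way, each 5′ primer should report a single BsaI or Esp3I site and each 3′ primer a single BamHI site, in every case directly after the three-nucleotide 5′ extension.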
Because the SUMO constructs bear an N-terminal 6× His tag, expressed SARS-CoV proteins fused with SUMO can be rapidly purified by Ni-NTA affinity chromatography. In this study, the soluble proteins from E. coli cell lysates and the insoluble proteins from the cell inclusion bodies were purified under native and denaturing conditions, respectively. A typical procedure for purification of the SARS-CoV proteins is illustrated in Fig. 1. Protein concentrations were determined using the Bradford color-reaction assay (Bio-Rad) measured spectrophotometrically at 595 nm with bovine serum albumin as a standard, according to the manufacturer's instructions. SDS-PAGE and Coomassie blue staining were used to evaluate the effectiveness of the purifications and cleavage of SUMO-SARS-CoV protein fusions.
The E. coli cells expressing the SARS-CoV proteins were harvested from LB medium (typically, 1.0 L) by centrifugation (8000g for 10 min at 4°C). Typically, the wet weights of the E. coli cells harvested from 1 L culture were 10-15 g. The cell pellets were resuspended in lysis buffer (PBS containing additional 150 mM NaCl, 10 mM imidazole, 1% Triton X-100, and 1 mM PMSF, pH 8.0) at 3 ml per g of cells, resulting in ~4 mg protein per ml after the proteins were extracted. The cells were lysed by sonication (50% output for 5 × 30 second pulses). Sonication was conducted with the tube jacketed in wet ice and observing 1 min intervals between pulse cycles to prevent heating. After the lysates were incubated with DNase and RNase (each at 40 µg/ml) for 20 min, they were centrifuged at 20,000g for 30 min at 4°C, and supernatants (soluble protein fractions) were collected. The pellets containing inclusion bodies were washed three times in buffer (PBS containing 25% sucrose, 5 mM EDTA, and 1% Triton X-100, pH 7.5) followed by centrifugation, as described above.
The washed inclusion bodies were resuspended in denaturing solubilization buffer (Novagen), which contained 50 mM Caps (pH 11.0), 0.3% N-lauroyl sarcosine, and 1 mM DTT, and incubated for 30 min at room temperature with shaking to extract the insoluble proteins. Because debris from inclusion bodies was much smaller than that in the cell lysate, the extract for the insoluble proteins was obtained by high-speed centrifugation (80,000g for 30 min at 4°C).
The soluble proteins extracted from E. coli cells were purified under native conditions and a BioLogic Duo-Flow FPLC system (Bio-Rad) was used for fractionations. Briefly, the cell lysate (typically, 20-40 ml containing 0.2-0.5 g proteins) was loaded onto a column containing ~10 ml Ni-NTA Superflow resin (Qiagen, Valencia, CA) and the samples of flow-through containing unbound proteins were collected for subsequent analysis. The resin was extensively washed with ~50-100 ml of wash buffer (PBS containing 20 mM imidazole and additional 150 mM NaCl, pH 8.0) until OD280 reached or fell below the base line (UV value = 0). Finally, the 6× His-tagged SUMO-fusion proteins were eluted with elution buffer (PBS containing 300 mM imidazole and additional 150 mM NaCl, pH 8.0). The purified SUMO-fused proteins eluted as a single isolated UV peak. The proteins with high OD280 values were collected in 4 ml fractions that were checked on SDS-gels and pooled.
The insoluble proteins extracted from the E. coli inclusion bodies were purified under denaturing conditions, which were similar to the native conditions described above except for the use of highly alkaline pH buffer containing detergent. Briefly, an insoluble protein sample (~20-40 ml) prepared in the denaturing buffer (50 mM Caps, 0.3% N-lauroyl sarcosine, and 1 mM DTT, pH 11.0) was incubated with ~10 ml of Ni-NTA Superflow resin at 4°C for 1 h with shaking for effective binding of the 6× His-tagged proteins to the resin.
The mixture was then loaded into an empty column and the flow-through sample was collected. Subsequently, the resin was continually washed with denaturing wash buffer that contained 20 mM imidazole, 0.3% N-lauroyl sarcosine, 0.3 M NaCl, and 50 mM Caps, pH 11, until OD280 fell below the base line. The 6× His-tagged SUMO-fusion proteins were finally eluted using denaturing elution buffer that contained the same components as in the denaturing wash buffer, except that the concentration of imidazole was increased to 300 mM.
The SUMO protease used in this study was produced in our laboratory as described [26], and a unit of the protease activity was defined as the amount of SUMO protease that cleaves 100 µg of SUMO-Met-GFP-fusion substrate at 25°C in 1 h in buffer containing 20 mM Tris-HCl, pH 8.0, and 5 mM β-mercaptoethanol [26]. Before adding the enzyme for cleavage, the purified SUMO-fusion proteins (soluble fraction) were dialyzed with 3.5 kDa cutoff membranes against PBS (pH 7.4) for 12-15 h at 4°C to remove high salt and imidazole, while the purified samples in denaturing buffer were refolded by extensive dialysis for at least 24 h against 20 mM Tris-HCl (pH 8.0) containing 10% glycerol. No protein precipitation was observed during the dialysis. The minimum amount of SUMO protease required for complete cleavage of a given SUMO-fusion was variable. Typically, for most of the purified SUMO-SARS-CoV proteins we added the enzyme at a ratio of 1 U to 15 µg substrate and incubated in either PBS (pH 7.4) or 20 mM Tris buffer (pH 8.0), containing 5 mM β-mercaptoethanol, at 30°C for 1 h. In this study, cleavage of the SUMO-SARS-CoV Nc protein was achieved with a lower amount of the SUMO protease after checking effectiveness of the enzyme in serial dilution (see Fig. 8).
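The enzyme dosing described above reduces to simple arithmetic. The following sketch is an illustration under the stated numbers only (1 U per 15 µg substrate as the working ratio, and 1 U per 100 µg as the unit definition); the function names and the 3000 µg example input are hypothetical, and the comparison is approximate because the unit is defined at 25°C whereas the digests were run at 30°C.

```python
UG_PER_UNIT_USED = 15.0          # working ratio in the text: 1 U per 15 ug substrate
UG_PER_UNIT_DEFINITION = 100.0   # unit definition: 1 U cleaves 100 ug in 1 h at 25 C


def units_needed(substrate_ug, ug_per_unit=UG_PER_UNIT_USED):
    """Units of SUMO protease to add for a given mass of fusion protein (ug)."""
    if substrate_ug < 0 or ug_per_unit <= 0:
        raise ValueError("substrate mass must be >= 0 and ratio must be > 0")
    return substrate_ug / ug_per_unit


def fold_excess(ug_per_unit=UG_PER_UNIT_USED):
    """How many times more enzyme the working ratio uses than the unit definition implies."""
    return UG_PER_UNIT_DEFINITION / ug_per_unit


if __name__ == "__main__":
    # Hypothetical example: 3 mg (3000 ug) of purified SUMO-fusion protein.
    print(f"units needed: {units_needed(3000.0):.0f} U")
    print(f"excess over unit definition: {fold_excess():.1f}-fold")
```

On these assumptions the working ratio corresponds to roughly a 6.7-fold excess of enzyme over the unit definition, which fits the observation that the minimum amount needed for complete cleavage varied between fusions.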
Since both SUMO and SUMO protease had 6× His tags, but SARS-CoV proteins did not, the cleaved SUMO-fusion samples could be re-applied to the nickel column to obtain the purified membrane proteins by subtracting the 6× His-tagged proteins. Briefly, after the SUMO-fusions were cleaved by the SUMO protease, the sample was loaded onto a nickel column with Ni-NTA resin. Most of the SARS-CoV protein without 6× His tags was eluted in the flow-through (unbound) fractions, and the rest was recovered by washing the resin with PBS. The eluted and washed proteins appearing in fractions with high UV values at OD280 were pooled as the final purified sample. The purified proteins were checked on SDS-gels and the samples were stored at −80°C after glycerol was added to 10%.
SARS-CoV proteins 3CL, Nc, and Spike C, in versions fused to either 6× His-SUMO or 6× His, were expressed in E. coli cells under various conditions. The expressed proteins were readily identified by their migration positions in SDS-gels based on their molecular weights, and were further confirmed by immunological reactions with their respective antibodies on Western blots.
The expressed SARS-CoV 3CL protease (3CL) was detected in lysates of E. coli cells under several culture and induction conditions (Fig. 2); induced cell lysate samples showed appropriate protein bands (approximately 35 kDa for 3CL and 47 kDa for SUMO-3CL fusion) on the SDS-gels (the sequence-predicted sizes of 3CL and SUMO-3CL are 33.8 and 45.8 kDa, respectively). When fused to the 3CL, SUMO significantly enhanced expression of its partner protein in both LB and MM media under all the conditions tested, compared to the 3CL expressed without SUMO-fusion (Fig. 2). Overnight growth (~15 h) at 20°C resulted in an increased yield of SUMO-fused 3CL compared to a 6 h culture at the same temperature and a 3 h culture at 37°C (Fig. 2).
Expressed SARS-CoV Nucleocapsid (Nc) was detected in either unfused (~46 kDa) or SUMO-fused (~60 kDa) versions from IPTG-induced E. coli cells under various culture and induction conditions (Fig. 3). Notably, much higher yields of the expressed proteins were observed from rich medium (LB) than from minimal medium (MM) (Fig. 3), suggesting the former should be better for large-scale production and purification of the proteins. Similar to the 3CL results, expression enhancement was seen when Nc was fused to SUMO and expressed in minimal medium, but in LB medium there were no significant differences between the expression of Nc without SUMO and Nc fused with SUMO (Fig. 3).
The SUMO-fusion also greatly increased the level of expression of the C-terminal half of the SARS-CoV Spike protein (Spk C) compared to that of unfused Spk C in LB media (Fig. 4). Only a very weak protein band (~58 kDa) of unfused Spk C could be seen in the SDS-gel and no band was seen in the Western blot probed with anti-His-tag antibodies, indicating that Spk C was poorly expressed without SUMO-fusion under the conditions tested. In contrast, an intense protein band was observed at the SUMO-Spk C migration position (~68 kDa) on the SDS-gel (Fig. 4, left panel) when Spk C was fused with SUMO and the identity of the fusion protein was confirmed by reactions with anti-His-tag antibody (Fig. 4, right panel).
Figure legend (fragment): The yields of expressed SUMO-Nc proteins were higher than the Nc expressed without SUMO in minimal media, but there were no significant differences in their expression in LB media.
Purification of SARS-CoV 3CL protease
Fig. 5 shows detection of the proteins from a representative purification of soluble SARS-CoV 3CL under native conditions. The cell lysate containing soluble SUMO-3CL was used for this purification, because a majority of the expressed protein (>80%) was present in the soluble fraction (data not shown).
Proteins without 6× His tags were removed from the Ni-NTA resin using wash buffer containing 20 mM imidazole, and the 6× His-tagged SUMO-3CL-fusion was eluted using elution buffer containing 300 mM imidazole. After the SUMO-3CL fractions were pooled, the sample was dialyzed extensively against PBS (pH 7.4) at 4°C to remove high salt and imidazole, which would interfere with the cleavage reaction. The SUMO-fusion was cleaved by addition of SUMO protease at 30°C for 1 h under the conditions described in Materials and methods. The completeness of cleavage was confirmed by checking the proteins on a 12% SDS-gel, since the band of the SUMO-3CL disappeared and two new bands corresponding to the expected molecular weights of SUMO and 3CL were detected. After the cleaved sample was re-applied to a Ni-NTA column to subtract 6× His-tagged SUMO and SUMO protease, final purified 3CL was obtained (Fig. 5); the protein from the subtracted sample ran as a single, intense band (~34 kDa), indicating that 3CL had been purified successfully (>95% purity). In this experiment, a high yield (~56 mg in total) of the pure 3CL was achieved from 1 L of E. coli cultured and induced at 20°C overnight (Table 2). We used the anti-SUMO-3CL-fusion antibody to identify the purified 3CL protein, since the antibody could react with the SUMO-3CL-fusion and their cleaved partners. The purified protein was confirmed to be the SARS-CoV 3CL by the Western blot probed with the anti-SUMO-3CL antibody (see below and Fig. 7).
Table 2 Summary of the SARS-CoV proteins resulting from representative purifications of 1 L E. coli culture. The SARS-CoV proteins fused with SUMO were expressed in E. coli and induced at 20°C overnight. The wet weights of the cells harvested from 1 L of E. coli culture for the 3CL, Nc, and Spk C were 14, 13, and 10 g, respectively. The samples were prepared and purified as described in Materials and methods.
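The body of Table 2 is not reproduced in this text, but the Results state approximate final yields per litre of culture (about 56 mg of 3CL, 26 mg of Nc, and 12 mg of partially purified Spk C) and the Table 2 legend gives the wet cell weights. As a rough illustration, the sketch below combines those figures into a yield per gram of wet cells; the variable and function names are hypothetical, and the yields are the approximate values quoted in the text rather than exact measurements.

```python
# (final protein yield in mg per 1 L culture, wet cell weight in g per 1 L culture)
# Values are the approximate figures quoted in the Results and the Table 2 legend.
PURIFICATIONS = {
    "3CL": (56.0, 14.0),
    "Nc": (26.0, 13.0),
    "Spk C": (12.0, 10.0),   # partially purified sample
}


def yield_per_gram(protein_mg, wet_cells_g):
    """Purified protein (mg) obtained per gram of wet cell paste."""
    if wet_cells_g <= 0:
        raise ValueError("wet cell weight must be positive")
    return protein_mg / wet_cells_g


if __name__ == "__main__":
    for name, (mg, grams) in PURIFICATIONS.items():
        print(f"{name}: {yield_per_gram(mg, grams):.1f} mg per g wet cells")
```

Normalizing to wet cell weight is one simple way to compare the three preparations, since the cultures gave somewhat different cell masses.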
Similar to the SARS-CoV 3CL protease, most of the expressed SUMO-Nc protein was found in the soluble fraction from E. coli cells, and therefore the supernatant of the cell lysate was used for purification of the SARS-CoV Nc protein. The proteins resulting from various steps in the purification procedure were detected using SDS-PAGE (Fig. 6). Using Ni-NTA affinity to purify the 6× His-tagged SUMO-fusion was an efficient method, since only a single, high-density protein band was detected in the eluted fractions (Fig. 6). After the purified sample was dialyzed and the SUMO protease added under the conditions described above, complete cleavage of the fusion was achieved. A single, highly intense band (~46 kDa) was detected in the final purified sample, indicating that >95% pure SARS-CoV Nc was obtained (Fig. 6). In this experiment, approximately 26 mg of the Nc was purified from the 1 L E. coli culture (Table 2). The protein's identity was confirmed by its reaction with the anti-SUMO-Nc antibody (see Fig. 7).
Fig. 7 shows that the SUMO-3CL-fusion antibody reacted specifically with the purified 3CL, with a little cross-reactivity with Nc; likewise, SUMO-Nc-fusion antibody had a highly specific reaction to purified Nc, without any cross-reaction with 3CL. The results not only confirmed the identities of the SARS-CoV proteins but also suggested that the purified SARS proteins retained their immunological reactivity.
To evaluate the effectiveness of SUMO protease on the cleavage of SUMO-SARS-CoV proteins, serial 1:1 dilutions of the enzyme (starting at 2.0 U) were used to digest aliquots (10 µg) of purified SUMO-Nc in PBS (pH 7.4) containing 5 mM β-mercaptoethanol at 30°C for 1 h (Fig. 8).
Since it is known that SUMO has a molecular mass of 11.5 kDa (although it migrates at ~20 kDa in an SDS-polyacrylamide gel), and the Nc band is ~46 kDa, cleavage is judged to be successful if the protein band representing full-length substrate fusion (e.g., 20 + 46 = 66 kDa in the case of SUMO-Nc) disappears and new bands corresponding to the expected molecular weights of the hydrolysis products are detected. Fig. 8 shows that as little as 0.063 U of the enzyme cleaved >95% of 10 µg of SUMO-Nc-fusion (lane 6) and 0.008 U cleaved ~50% of the substrate (lane 9) under the tested conditions.
When fused with SUMO, the C-terminal half of SARS-CoV Spike protein (Spk C) was expressed at high levels in E. coli (Fig. 9A). Because approximately 60% of the total fusion protein expressed was in the bacterial inclusion bodies, the insoluble protein sample extracted from the inclusion bodies (Fig. 9A, lane 3) was used for purification of the Spk C with Ni-NTA affinity chromatography under denaturing conditions. Briefly, the 6× His-tagged SUMO-Spike C-fusion was eluted by elution buffer containing 300 mM imidazole, but a few other minor proteins that were without 6× His tags but possibly rich in histidine and/or cysteine were also bound to the resin, resulting in impurities of the sample. The unwanted proteins did not interfere with the cleavage of the SUMO-fusion proteins, but reduced the purity of the sample (Fig. 9B, lane 1). After the purified SUMO-Spike C protein was extensively dialyzed, the fusion was effectively cleaved by addition of SUMO protease (>95% cleavage was achieved, see Fig. 9B, lane 2). Finally, the 6× His-tagged SUMO and SUMO protease were removed by applying the cleaved sample to the Ni-NTA column to purify the Spk C. An SDS-gel of the resulting sample showed unfused Spk C (~58 kDa) along with three minor proteins (see Fig. 9B, lane 3), indicating that partially purified Spk C was obtained.
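The enzyme amounts quoted for the Fig. 8 titration follow directly from the serial 1:1 dilution. The sketch below reproduces that series as an illustration; it assumes that lane 1 holds the undiluted 2.0 U and that each subsequent lane is a twofold dilution, an assumption that is consistent with the reported values for lanes 6 and 9 but is not stated explicitly in the text. The function name `serial_dilution` is hypothetical.

```python
def serial_dilution(start_units, n_lanes, factor=2.0):
    """Enzyme units per lane for a serial dilution; index 0 corresponds to lane 1."""
    if n_lanes < 1 or factor <= 0:
        raise ValueError("need at least one lane and a positive dilution factor")
    return [start_units / factor**i for i in range(n_lanes)]


SUBSTRATE_UG = 10.0                # SUMO-Nc per reaction, from the text
lanes = serial_dilution(2.0, 9)    # assumed: lane 1 = 2.0 U, 1:1 steps

if __name__ == "__main__":
    for lane_number, units in enumerate(lanes, start=1):
        print(f"lane {lane_number}: {units:.4f} U "
              f"({SUBSTRATE_UG / units:.0f} ug substrate per U)")
```

Under this assumption lane 6 holds 0.0625 U (reported as 0.063 U) and lane 9 holds 0.0078125 U (reported as 0.008 U), so near-complete cleavage in lane 6 corresponds to about 160 µg of SUMO-Nc per unit of enzyme.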
Alternative purification approaches can be used after the Ni-NTA purification to remove the impurities if >90% purity is required. In this study, approximately 12 mg of the partially purified Spk C sample was obtained from the 1 L E. coli culture (Table 2).
At least six types of protein are encoded by the SARS coronavirus (SARS-CoV) genome. Large-scale production of these proteins in pure, functionally active form is critical to meet urgent needs in the development of diagnostic and therapeutic methods for SARS, such as antiviral drugs and vaccines, as well as for basic research purposes. Such a task is difficult using conventional expression systems. Several major protein fusion technologies have been developed to improve expression and purification of heterologous recombinant proteins in bacterial, yeast, mammalian, and insect cells. These include maltose binding protein (MBP), glutathione-S-transferase (GST), and thioredoxin (Trx) gene fusion systems [17, 33]. However, many proteins are not expressed well with these fusion systems in commonly utilized hosts. Fusion of an unstable or misfolded protein with proteins such as ubiquitin and ubiquitin-like proteins, which have a highly evolved structure, can stabilize the candidate protein. We have conducted a systematic comparison of the effectiveness of various fusion tags (MBP, GST, Trx, NusA, and SUMO) when used as GFP fusions expressed in E. coli, and have found SUMO to be superior to the other tags for expression of the protein.
GST and MBP domains have been used as tags to enhance production and purification of proteins of interest [33]. Problems are encountered, however, when these tags must be removed to study the protein's structure by X-ray crystallography or NMR. Although several proteases such as thrombin, Factor Xa, and AcTEV protease are used for these purposes, all of these enzymes recognize short degenerate sequences, and, thus, cleavage can occur within the proteins of interest.
Another problem encountered is inaccessibility of the cleavage site within the fusion due to steric constraints, which could reduce the effectiveness of enzymatic cleavage. The SUMO tag, by contrast, is accurately and efficiently removed from the protein of interest [26]. Comparing the cleavage of SUMO-GFP by SUMO protease to the cleavage of NusA-GFP by AcTEV protease, we found that SUMO protease had a 64-fold higher activity than AcTEV protease when the same amount of enzyme was used (unpublished results).
Ubiquitin has been reported to exert chaperoning effects on fused proteins, thus increasing expression of proteins in E. coli and yeast [34-36]. The fused proteins can be cleaved by Ub-proteases (both UCH and UBP classes), but the enzymes are unstable, difficult to produce, and often must be used in large quantities (an enzyme to substrate ratio of 1:1), making this technology impractical for large-scale protein production [18]. Our laboratory has exploited the chaperoning properties of several ubiquitin-like proteins including SUMO, and the extreme robustness of SUMO protease has allowed us to develop a technology that provides both enhanced expression and cleavage of the fusion protein. A number of difficult proteins have been expressed in our laboratory in both unfused and SUMO-fused forms and compared side-by-side to demonstrate that SUMO-fusion dramatically enhances the expression of many types of proteins, including membrane proteins, and that SUMO protease cleaves a variety of SUMO-fusions with high specificity and efficiency over a wide range of conditions, including pH (5.5-10.5) and temperature (4-37°C) [26]. Non-specific cleavage of the substrate was not observed, even when the amount of enzyme was deliberately increased to a 1:1 ratio [26]. In this study, titration of the hydrolytic capacity of SUMO protease on the purified SARS-CoV Nc proteins confirmed that SUMO protease is an extremely potent enzyme (Fig. 8).
The predicted molecular weight of the SUMO protease is 26.7 kDa, though it usually runs at the ~31 kDa position on an SDS-polyacrylamide gel.
After evaluating and comparing various SARS-CoV proteins expressed with or without SUMO-fusion in E. coli under several culture and induction conditions, we found that SUMO-fusion significantly increased expression of SARS proteins under nearly all conditions tested. We established a batch production protocol employing 20°C overnight growth for large-scale expression of SUMO-SARS-CoV proteins, since a shorter time (e.g., 6 h) or higher temperature (37°C) resulted in lower yields, especially for soluble proteins (data not shown). Although in most cases cells growing in rich medium (LB) produced more SUMO-fused protein than cells growing in minimal medium (MM), we will use MM to investigate secreted SARS-CoV proteins in future studies, since rich medium contains a large number of interfering proteins.
In addition to producing SARS-CoV proteins in large quantities for basic research and for development of anti-SARS pharmaceutical agents, it is important to produce pure proteins that retain biological activity. The expressed and purified SARS-CoV proteins had immunological activity, but the question remains concerning their functional activities. It appears, at least in the case of one SARS-CoV protein, that SUMO-enhanced expression and purification from E. coli results in active protein. In a study to be published, SUMO-fusion enhanced expression of the SARS-CoV RNA-dependent RNA polymerase (RdRp), and the purified soluble RdRp was biologically active (unpublished results). Finally, we recently observed that SUMO-fusion significantly enhanced expression and purification of SARS-CoV membrane protein (M) as well. Using the SUMO-fusion technology described here, the expression level of SARS-CoV M protein in E. coli was greatly improved, and the insoluble proteins extracted from the bacterial inclusion bodies were purified [32].
Application of the various purified SARS-CoV proteins to the development of SARS vaccines and functional assays is underway.
A novel coronavirus associated with severe acute respiratory syndrome
Unique and conserved features of genome and proteome of SARS-coronavirus, an early split-off from the coronavirus group 2 lineage
Coronaviruses: a comparative review
Nucleotide sequence of the human coronavirus 229E RNA polymerase locus
Molecular analysis of the human coronavirus (strain 229E) genome
Characterization of the human coronavirus 229E (HCV 229E) gene 1
Molecular model of SARS coronavirus polymerase: implications for biochemical functions and drug design
The nucleocapsid protein of the SARS coronavirus is capable of self-association through a C-terminal 209 amino acid interaction domain
T-cell epitopes in severe acute respiratory syndrome (SARS) coronavirus spike protein elicit a specific T-cell immune response in patients who recover from SARS
Immunogenicity of SARS inactivated vaccine in BALB/c mice
The SARS-CoV S glycoprotein: expression and functional characterization
A novel sorting signal for intracellular localization is present in the S protein of a porcine coronavirus but absent from severe acute respiratory syndrome-associated coronavirus
Identification of an antigenic determinant on the S2 domain of the severe acute respiratory syndrome coronavirus spike glycoprotein capable of inducing neutralizing antibodies
Rapid protein-folding assay using green fluorescent protein
Phosphodiesterase expression in human epithelial cells
Affinity fusion strategies for detection, purification, and immobilization of recombinant proteins
An efficient system for high-level expression and easy purification of authentic recombinant proteins
SUMO, ubiquitin's mysterious cousin
Ubiquitin and its kin: how close are the family ties?
SUMO-nonclassical ubiquitin
Ubiquitin-like proteins: new wines in new bottles
Functional heterogeneity of small ubiquitin-related protein modifiers SUMO-1 versus SUMO-2/3
Structural attributes in the conjugation of ubiquitin, SUMO and RUB to protein substrates
Structure determination of the small ubiquitin-related modifier SUMO-1
SUMO fusions and SUMO-specific protease for efficient expression and purification of proteins
A new protease required for cell-cycle progression in yeast
The yeast ULP2 (SMT4) gene encodes a novel protease specific for the ubiquitin-like Smt3 protein
A new 30-kDa ubiquitin-related SUMO-1 hydrolase from bovine brain
Versatile protein tag, SUMO: its enzymology and biological function
A new SUMO-1-specific protease, SUSP1, that is highly expressed in reproductive organs
Enhanced expression and purification of membrane proteins by SUMO fusion in Escherichia coli
Genetic design for facilitated production and recovery of recombinant proteins in Escherichia coli
Reconstitution of the vitamin D-responsive osteocalcin transcription unit in Saccharomyces cerevisiae
Increasing gene expression in yeast by fusion to ubiquitin
Ubiquitin fusion augments the yield of cloned gene products in Escherichia coli
The work described here was supported in part by a grant (R43 GM067271-01) from the NIH/NIGMS to T.R.B. and a grant (RO1-AI 17418) from NIH/NIAID to S.R.W.