key: cord-0917395-pz7mi86u
authors: Yang, Yaling; Hussain, Snawar; Wang, Hao; Ke, Min; Guo, Deyin
title: Translational control of the subgenomic RNAs of severe acute respiratory syndrome coronavirus
date: 2009-04-12
journal: Virus Genes
DOI: 10.1007/s11262-009-0357-y
sha: 200541cd31725f25c191225c2f4e04d4c551ce91
doc_id: 917395
cord_uid: pz7mi86u

The 3′-one-third of the severe acute respiratory syndrome coronavirus (SARS-CoV) genome contains genes for four essential structural proteins and eight virus-specific genes. Expression of this genomic information involves synthesis of a nested set of subgenomic RNAs (sgRNAs). In this study, we showed that the translational levels of 10 SARS-CoV sgRNAs, including the two low-abundance sgRNAs 2-1 and 3-1, varied considerably in translation reporter assays. We also demonstrated that the initiator AUG codon of sgRNA-8 was silent and that the repressive control was most likely located in the upstream untranslated region (UTR) of sgRNA-8. The initiator AUG codons of most sgRNAs are in poor Kozak contexts, and translation of truncated proteins from downstream AUG codons by leaky scanning was common in our experimental settings. Neither the complexity of the 5′-UTR nor the sequence context of the AUG codon correlated significantly with the level of translation of SARS-CoV sgRNAs. These results will be helpful for further studies to reveal the biological functions and translation regulatory mechanisms of sgRNAs in the coronavirus life cycle and pathogenesis.

Coronaviruses are enveloped viruses with the largest genomes among RNA viruses, a single-stranded, positive-sense RNA ranging from 27 to 31.5 kb in length. The coronavirus genome is polycistronic and possesses a 5′-cap structure and a 3′-poly(A) tail [1]. At the 5′-end, two large open reading frames (ORFs) (1a and 1b) comprise about two-thirds of the entire coronavirus genome; they encode the viral replicase and are translated directly from the genomic RNA [2]. Besides the four essential structural proteins, spike (S), envelope (E), membrane (M), and nucleocapsid (N), the 3′-one-third of the genome comprises a variable number of group-specific genes, which are expressed through a set of nested 3′-coterminal subgenomic RNAs (sgRNAs) (Fig. 1a). A key feature of these sgRNAs is that their 5′- and 3′-terminal sequences are identical to those of the genome. This nested set structure results from a fusion of the sequence representing the genomic 5′-end (leader sequence) and sequences representing different 3′-regions of the genome, the so-called mRNA bodies (body sequences). Although the 5′-end of the genome greatly affects the discontinuous transcription that produces coronavirus sgRNAs [3], the regulatory mechanism of coronavirus gene expression is not well understood.

The 3′-proximal one-third of the severe acute respiratory syndrome coronavirus (SARS-CoV) genome includes eight virus-specific genes: the 3a and 3b genes (located between the S and E genes), the 6, 7a, 7b, 8a, and 8b genes (located between the E and N genes), and the 9b and 9c genes (located within the N gene) [4]. In our previous work, we identified 10 sgRNAs from SARS-CoV-infected cells and showed that the sgRNAs were transcribed in a discontinuous manner at the stage of negative-strand synthesis [5]. As all the sgRNAs contain a common leader of about 72 nucleotides (nt), it is still not clear how expression of the 3′-proximal genes is controlled at the translational level. Elucidating the translational control mechanism will help to explain the roles of the group-specific genes and their encoded accessory proteins in the viral life cycle and pathogenesis.

In this study, we showed that nine SARS-CoV sgRNAs could be expressed in the reporter system at different levels and that the 5′-upstream untranslated regions (UTRs) of individual sgRNAs controlled the translational efficiency of their encoded proteins.

Baby hamster kidney (BHK) cells were maintained in Dulbecco's Modified Eagle Medium (DMEM) (Gibco Invitrogen) supplemented with 10% heat-inactivated fetal bovine serum (Gibco Invitrogen), 2 mM L-glutamine, 100 U/ml of penicillin and 100 μg/ml streptomycin (Gibco Invitrogen Corporation). The cDNAs of SARS coronavirus strain WHU (GenBank accession no. AY394850) were prepared as described previously [5, 6].

The 5′-end of SARS-CoV sgRNA 2-1 (including the leader sequence and 146 nt of the 5′-end body sequence) was PCR amplified as described [5] and cloned into the pEGFP-N1 vector (Clontech) (Table 1). The ORF of sgRNA 2-1 was fused in-frame and out-of-frame with that of the green fluorescent protein (GFP) gene, resulting in plasmids p2-1-GFP and p2-1-GFPΔ, respectively (Fig. 1b).

In another set of experiments, the 5′-ends (including the leader sequence and 200-400 nt of the 5′-end of the body sequence) of all 10 sgRNAs were amplified by RT-PCR and cloned into the pEGFP-N1 vector, with their open reading frames fused in-frame with the GFP gene (p2/S, p2-1, p3/3a, p3-1, p4/E, p5/M, p6, p7/7a, p8 and p9/N) (Fig. 1 and Table 1). To circumvent the problem of wild-type GFP expression by leaky scanning, the initiator AUG codon of the GFP gene was substituted with GUG by PCR-based mutagenesis, resulting in pGFP* as a negative control.

In parallel experiments, the same 5′-terminal sequences of the 10 sgRNAs were cloned into the pGL3.0 vector (Promega) by fusing the viral ORF in-frame with the luciferase gene to quantitatively measure the translational level of the sgRNAs. The sequences and positions of the primers used for plasmid construction are shown in Table 1.

Baby hamster kidney (BHK) cells were grown to 70-80% confluence on a 35-mm² plate and transfected with the DNA plasmids using Lipofectamine 2000 (Invitrogen) according to the manufacturer's instructions. Protein expression was measured by fluorescence microscopy and western blot at 36 h post-transfection. Briefly, transfected BHK cells were lysed with 2× SDS loading buffer and separated on a 12% SDS-polyacrylamide gel. Proteins were transferred onto polyvinylidene difluoride membranes (Bio-Rad). The membranes were blocked overnight with 5% non-fat milk in PBS and incubated with the monoclonal anti-GFP antibody (1:10,000, Clontech). After washing three times with PBST (PBS with 0.05% Tween-20), the membranes were incubated with 0.2 ng/ml of horseradish peroxidase-labeled secondary antibody (Lab Vision, USA) for 2 h. Immune complexes were visualized using the LumiGLO™ chemiluminescent substrate kit (Kirkegaard and Perry Lab, Maryland, USA).

Firefly luciferase activity assay

In the firefly luciferase reporter gene assays, BHK cells were plated in 24-well plates at 1 × 10⁵ cells per well and transfected with recombinant sgRNA-luciferase fusion plasmids as described above at 2 μg per well. To assess the expression level of sgRNAs, firefly luciferase activity was quantified using a Steady-Glo Luciferase Assay System Kit (Promega) at different time points post-transfection. Cells transfected with empty pGL3.0 were used as a positive control, while mock-transfected cells were used as a negative control. All values were expressed as the mean of three independent experiments.

Translatability of the low-abundance sgRNA 2-1

To determine whether the low-abundance sgRNA 2-1 discovered in the previous study [5] is a functional messenger RNA, the 5′-proximal 220 nt of sgRNA 2-1 was fused with the GFP gene both in-frame and out-of-frame. Recombinant plasmids were transfected into BHK cells and the expression of GFP was assessed by fluorescence microscopy (Fig. 2a). The GFP out-of-frame construct (p2-1-GFPΔ) was used as a control to monitor any possible leaky scanning-mediated expression of the reporter gene. Empty pEGFP-N1 vector was used as a positive control to assess the transfection efficiency.

As shown in Fig. 2a, relative to p2-1-GFPΔ-transfected cells, robust GFP fluorescence was observed in cells transfected with p2-1-GFP and wild-type GFP. We also observed expression of GFP in p2-1-GFPΔ-transfected cells from a downstream AUG codon by leaky scanning. To further confirm the expression of GFP from a downstream start codon, the usage of the AUG codon of sgRNA 2-1, and the existence of the fusion protein in transfected cells, we performed a western blot with an anti-GFP antibody. As shown in Fig. 2b, a 32 kDa fusion protein and a relatively less intense 27 kDa band of wild-type GFP were detected in cells transfected with p2-1-GFP, whereas only the 27 kDa band was detected in cells transfected with p2-1-GFPΔ. These data suggest that the authentic AUG codon of ORF 2b in sgRNA 2-1 was used for translation, leading to expression of the fusion protein, while leaky expression from the AUG of the GFP gene also took place.

A scanning ribosome may initiate translation from the weak AUG in sgRNAs at a low frequency or bypass it in favor of the stronger downstream AUG codon of GFP, which is located only 144 nt downstream of the initiator AUG of ORF 2b. Thus, leaky scanning could probably lead to the expression of wild-type GFP from both the in-frame and out-of-frame fusion constructs (Fig. 2).
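The band sizes reported above can be cross-checked with simple arithmetic. The sketch below is an illustration rather than part of the original analysis: it assumes an average residue mass of about 110 Da (a common rule of thumb, not a value taken from this study) and uses the 144 nt spacing and the 27 kDa GFP band stated in the text. The function name `fusion_size_kda` is hypothetical.

```python
# Rough size estimate for the sgRNA 2-1/GFP fusion protein.
# ASSUMPTION: ~0.110 kDa per amino acid is a rule-of-thumb average,
# not a value measured or reported in this study.
AVG_RESIDUE_KDA = 0.110
GFP_KDA = 27.0  # size of the wild-type GFP band reported in the text


def fusion_size_kda(upstream_nt, reporter_kda=GFP_KDA):
    """Estimate the fusion size when `upstream_nt` nucleotides of in-frame
    coding sequence lie between the viral AUG and the reporter AUG."""
    if upstream_nt % 3 != 0:
        raise ValueError("an in-frame fusion needs a multiple of 3 nt")
    extra_residues = upstream_nt // 3  # 144 nt corresponds to 48 codons
    return reporter_kda + extra_residues * AVG_RESIDUE_KDA


estimate = fusion_size_kda(144)
print(f"estimated fusion size: {estimate:.1f} kDa")
```

Under these assumptions the 48 extra codons add roughly 5 kDa to GFP, which is in line with the 32 kDa band seen for p2-1-GFP.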

Taken together, we have shown that sgRNA 2-1 could be a functional mRNA in SARS-CoV-infected cells although it was of low abundance in the host cells. According to the prediction from the sgRNA 2-1 sequence, expression of ORF 2b in the sgRNA may result in production of a truncated S protein, which is predicted to lack

Varied translational levels of SARS-CoV subgenomic RNAs in translation reporter system

Ten sgRNAs have been identified [5], and the SARS-CoV accessory proteins 3a, 3b, 6, 7a, and 7b can be detected in infected cells or SARS patients in addition to the structural proteins [7, 8], while the expression of 8a and 8b is controversial [9-13]. Elucidating the regulatory mechanism of translation is important for understanding the pathogenesis of SARS-CoV; however, it is hard to compare the differential translation of sgRNAs because the steady-state level of viral proteins in infected cells reflects the sum of transcription, translation, and the relative stabilities of the transcriptional and translational products. In this study, we adopted a reporter gene system in which partial sgRNA ORFs of similar size were fused to the reporter under the control of the same promoter. This system was designed to specifically address and compare the translation efficiency of individual sgRNAs by circumventing the problems resulting from different transcription efficiencies and protein stabilities. We cloned the 5′-ends, containing the full leader sequence and the 5′-proximal 200-400 nt of the body sequence, of all 10 sgRNAs into the pEGFP-N1 vector (Fig. 1). The predicted start codon AUG of each ORF was cloned in-frame with the GFP gene and the start codon of GFP was replaced with GUG. Strong fluorescence was observed in cells transfected with fusion constructs p2/S, p2-1, p3/3a, p5/M, p6, p4/E, and p9/N, whereas relatively weak fluorescence was observed in cells transfected with fusion constructs p3-1, p7/7a, and p8 (Fig. 3a). GFP-fusion proteins of the expected sizes were detected in cells transfected with plasmids p2-1, p3/3a, p3-1, p4/E, p5/M, p6, p7/7a, and p9/N (Fig. 3b). The major protein band of the sgRNA 2-GFP fusion construct (Fig. 3b) was larger than the theoretically calculated size (Table 2). This discrepancy could be due to post-translational modification of the protein or to an incompletely denatured protein complex. One minor band below the major band may represent the correct fusion translation product (Fig. 3b). On the other hand, smaller protein bands were detected in cells transfected with constructs p3/3a, p4/E, and p5/M, which might result from leaky expression from downstream AUG codons, premature termination, or degradation by cellular proteinases (Fig. 3b).

Although the initiator AUG of GFP was replaced with GUG in the fusion constructs, strong fluorescence was still observed with pGFP*, indicating that GUG may serve as a non-canonical translation start codon (Fig. 3a) .

This result was further confirmed by western blot analysis in cells transfected with pEGFP-N1 and pGFP* (Fig. 3b). It may be due to the flanking primary sequence, which closely matches the consensus motif GCCACCAUGG, the optimal context for initiation of eukaryotic mRNA translation [14, 15]. It is known that GUG can function as an efficient start codon in mammalian cells [16].

When the sgRNA expression levels shown in Fig. 3 are compared, it should be noted that fluorescence intensity represents the total expression of GFP in transfected cells, including both the GFP-fusion protein and GFP expressed by leaky scanning from a downstream start codon. For example, cells transfected with p6 showed a stronger fluorescence signal than those transfected with p7/7a (Fig. 3a); however, the western blot indicated comparable levels of fusion protein in p7/7a- and p6-transfected cells (Fig. 3b). These results suggest that more GFP was translated from a downstream start codon in p6-transfected cells than in p7/7a-transfected cells. Therefore, the western blot analysis provided more specific information on the efficiency of translation initiation from either the first AUG codon in sgRNAs or downstream AUGs reached by leaky scanning.

To confirm the above results, we cloned the 5′-ends of all 10 sgRNAs into the pGL3.0 vector, fused in-frame with the luciferase gene, for sensitive and quantitative measurement of the varied sgRNA translation. The luciferase activity expressed from sgRNA 2-1 (sg2-1), sg3, sg5/M, sg6, and sg7/7a to sg9/N was 24-491 fold higher than that from sg8 at 18, 24, and 36 h post-transfection (Fig. 4). These results are consistent with the observations in the GFP-fusion assay system.
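The fold differences quoted above are ratios of mean luciferase activity relative to the sg8 construct. A minimal sketch of that calculation follows; the readings and the construct names `sgX` and `sgY` are invented purely for illustration and are not data from this study, and `fold_over_reference` is a hypothetical helper name.

```python
from statistics import mean


def fold_over_reference(readings, reference):
    """Divide the mean activity of each construct by the mean activity of
    the reference construct (here, the weakly translated one)."""
    ref_mean = mean(readings[reference])
    if ref_mean <= 0:
        raise ValueError("reference activity must be positive")
    return {
        name: mean(values) / ref_mean
        for name, values in readings.items()
        if name != reference
    }


# HYPOTHETICAL relative light units from three independent experiments;
# these numbers are made up to show the arithmetic only.
example = {
    "sg8": [90.0, 100.0, 110.0],
    "sgX": [4900.0, 5000.0, 5100.0],
    "sgY": [19000.0, 20000.0, 21000.0],
}

for name, fold in fold_over_reference(example, "sg8").items():
    print(f"{name}: {fold:.0f}-fold over sg8")
```

Expressing each construct relative to the same low reference makes the ratios sensitive to noise in that reference, so in practice the reference mean should sit clearly above the mock-transfected background.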

RNA viruses employ various mechanisms to regulate their gene expression at the translational level. Leaky scanning allows the translation of multiple ORFs from a common mRNA substrate, and such leaky scanning has already been reported for viral RNA translation [17, 18]. For coronaviruses, it has been reported that the SARS-CoV ORF7b and the infectious bronchitis virus (IBV) ORF3b are translated by leaky ribosomal scanning [19, 20]. Our data showed that leaky scanning, which leads to translation from a downstream AUG codon, may be common for coronavirus RNAs. Messenger RNAs in which the first AUG codon lacks the preferred nucleotide at both of the key positions (-3, +4) in the Kozak context have the special property of initiating translation at the first and downstream AUG codons, thereby producing two or more proteins from one mRNA. Further studies are needed to investigate the translation of downstream ORFs as well as the role of truncated proteins (if any such protein exists in SARS-CoV-infected cells) expressed from downstream AUG codons by leaky scanning.
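Whether a downstream AUG reached by leaky scanning yields an N-terminally truncated version of the same protein or a different product depends on its reading frame relative to the first AUG. The sketch below illustrates that bookkeeping on a made-up toy sequence; the sequence and the function names are hypothetical and do not come from the SARS-CoV genome or from this study.

```python
def find_augs(mrna):
    """Return the 0-based position of every AUG in an RNA (or DNA) string."""
    seq = mrna.upper().replace("T", "U")
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == "AUG"]


def downstream_augs(mrna):
    """For each AUG after the first one, report (position, in_frame), where
    in_frame is True if it shares the reading frame of the first AUG."""
    positions = find_augs(mrna)
    if not positions:
        return []
    first = positions[0]
    return [(p, (p - first) % 3 == 0) for p in positions[1:]]


# TOY sequence for illustration only.
toy = "GGACUAUGGCCAUGAAAUGA"
for pos, in_frame in downstream_augs(toy):
    kind = "in-frame (truncated product)" if in_frame else "out-of-frame"
    print(f"AUG at position {pos}: {kind}")
```

The check only classifies candidate start sites by frame; it says nothing about how often a ribosome actually initiates at them, which depends on context and was measured experimentally here.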

The 5′-UTR of sgRNA 8 could be a cis-acting suppressor element

In most human isolates of SARS-CoV, the sgRNA 8 contains two ORFs, ORF8a and ORF8b. The SARS-CoV WHU strain has a deletion of two nucleotides corresponding to the nucleotides 27,808 and 27,809 in ORF 8a of SARS-CoV Tor2 and Urbani [4, 21] . This 2-nt deletion leads to a shifted ORF 8a of only 24 amino acids instead of 39 amino acids.
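The effect of a 2-nt deletion on ORF length can be illustrated with a toy example. The sequence below is invented and much shorter than ORF 8a; it only shows how removing two nucleotides shifts the reading frame and can bring a stop codon into frame early. The helper names are hypothetical, and the residue count includes the initiator methionine.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def orf_residues(rna):
    """Count codons from the first AUG up to, but not including, the first
    in-frame stop codon. Returns 0 if there is no AUG."""
    seq = rna.upper().replace("T", "U")
    start = seq.find("AUG")
    if start < 0:
        return 0
    count = 0
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return count
        count += 1
    return count  # no stop codon found before the end of the sequence


def delete_nt(rna, position, length):
    """Remove `length` nucleotides starting at 0-based `position`."""
    return rna[:position] + rna[position + length:]


# TOY sequence for illustration only (not ORF 8a).
wild_type = "AUGGCCAAUGACGCUGCUUAA"
mutant = delete_nt(wild_type, 6, 2)

print(orf_residues(wild_type), "residues before the deletion")
print(orf_residues(mutant), "residues after the 2-nt deletion")
```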

Although the SARS-CoV 8b gene product could be expressed in vivo when cloned directly behind a promoter [11-13], the expression of 8a and 8b in SARS-CoV-infected cells is still controversial [9, 10, 13]. As shown above, we were unable to detect protein expression from sgRNA 8 carrying the 5′ viral leader sequence, which is consistent with a recent report on ORF8 expression [13]. Cells transfected with p8 displayed significant fluorescence (Figs. 3a, 5a), but expression of the fusion protein could not be detected by western blot (Fig. 5b). To investigate a possible role of the sgRNA 8 5′-UTR in translation, the 5′-UTR of sgRNA 8 was replaced by the 5′-UTR of sgRNA 5 to create the plasmid p8/5, because the initiator AUG codon of sgRNA 5 was shown to be functional (Fig. 3b) and the length and predicted secondary structure of the sgRNA 5 5′-UTR were similar to those of the sgRNA 8 5′-UTR. Interestingly, the replacement of the 5′-UTR resulted in translation of the fusion protein from the initiator AUG codon of ORF 8b. As expected, when the 5′-UTR of sgRNA 5 was replaced by the 5′-UTR of sgRNA 8 to create the plasmid p5/8, expression from p5/8 could not be detected (Fig. 5b). We speculated that a small ORF (8a), which is present in the 5′-UTR of sgRNA 8, might play a role in suppressing translation from the downstream initiator AUG codon of ORF 8b. To study any possible role of the upstream ORF in translational suppression, GFP was fused with ORF 8a to create the plasmid p8a, but no fusion protein was detected in cells transfected with this recombinant plasmid (Fig. 5b). This shows that translational suppression at the initiator AUG codon of ORF 8b was not the result of the expression of 8a but could be due to other cis-acting elements present in the 5′-UTR. Taken together, these data indicate that the 5′-UTR may act as a suppressive regulatory element that inhibits expression of both ORF 8a and 8b in sgRNA 8.

The conservation of Kozak context alone has no correlation with the translation efficiency of SARS-CoV sgRNAs

As the expression levels of SARS-CoV sgRNAs were significantly different, we examined whether the sequence context around the start codon AUG (Kozak sequence) plays an important role in the translation of sgRNAs. The optimal context for initiation of translation in vertebrate mRNAs is ACCAUGG [14, 15]. In this consensus motif, two nucleotides at the highly conserved positions (a G residue following the AUG codon (position +4) and a purine, preferably A, three nucleotides upstream of the AUG codon (position -3)) exert the strongest effect. Sequence analysis revealed that the AUG codons of sgRNAs 2-1, 5/M and 8 are in better Kozak contexts (Table 2). They have only one nucleotide mismatch with the consensus sequence motif ACCAUGG. A pyrimidine (C) is present at position -3 in sgRNA 2-1, whereas a mismatch is present at position -2 in sgRNA 5 and at position -1 in sgRNA 8 (Table 2). However, sgRNA 8 has a low translation efficiency, as shown above. The sequences surrounding the AUG initiator codons of sgRNAs 2 and 7 have two nucleotide mismatches with the consensus Kozak sequence, and notably they lack a guanine (G) at position +4. The sgRNAs 3, 3-1, 4, 6 and 9 possess poor Kozak sequence contexts for translation initiation (Table 2), but most could be translated efficiently (Fig. 3b). Taken together, no significant correlation was found between the Kozak context around the AUG initiator codon of sgRNAs and the translational level of the fusion proteins.

Figure legend: The ORF 8a and 8b were fused in-frame with the GFP open reading frame in the pEGFP-N1 vector. p8: sgRNA 8 ORF 8b fused with GFP; p8/5: the 5′-UTR of sgRNA 8 was replaced with that of sgRNA 5; p5/8: the 5′-UTR of sgRNA 5 was replaced with that of sgRNA 8; p8a: ORF 8a fused with GFP; GFP: pEGFP-N1 as control; mock: non-transfected cells.

The length of the 5′-UTR, the G + C content, and the secondary structure near the 5′-end of an mRNA can drastically affect the translational level of mRNAs [22-25]. We next analyzed whether the properties of the 5′-UTR could influence the translation efficiency of sgRNAs. All SARS-CoV sgRNAs contain the same leader sequence, but the 5′-UTR lengths are variable, ranging from 72 nt to 265 nt (Table 2). We calculated the G + C contents of the different sgRNA 5′-UTRs and analyzed the secondary structures and the free energy (ΔG) of the major loops of the sgRNA 5′-UTRs (Table 2). Surprisingly, no significant correlation was found between the length, the G + C content, or the secondary structure of the 5′-UTR and the translational level of the reporter gene (Table 2).
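The two sequence features compared in this section, the match to the ACCAUGG consensus and the G + C content of the 5′-UTR, are straightforward to compute. The following sketch shows one way to do so; it is not the procedure used to build Table 2, the function names are hypothetical, and the example windows are illustrative rather than the actual sgRNA sequences.

```python
CONSENSUS = "ACCAUGG"  # positions -3, -2, -1, AUG (+1 to +3), +4


def kozak_report(context):
    """Compare a 7-nt window (-3 to +4) around a start codon with ACCAUGG."""
    ctx = context.upper().replace("T", "U")
    if len(ctx) != 7 or ctx[3:6] != "AUG":
        raise ValueError("expected 7 nt with AUG at window positions 4-6")
    return {
        "mismatches": sum(a != b for a, b in zip(ctx, CONSENSUS)),
        "purine_at_minus3": ctx[0] in "AG",  # key position -3
        "g_at_plus4": ctx[6] == "G",         # key position +4
    }


def gc_fraction(seq):
    """Fraction of G and C nucleotides in a sequence."""
    s = seq.upper()
    if not s:
        raise ValueError("empty sequence")
    return (s.count("G") + s.count("C")) / len(s)


# ILLUSTRATIVE windows: the consensus itself, a window with a pyrimidine
# at -3, and a window that misses both key positions.
for window in ("ACCAUGG", "CCCAUGG", "UUUAUGA"):
    print(window, kozak_report(window))

print(f"G + C fraction of a toy UTR: {gc_fraction('GGCCAU'):.2f}")
```

A plain mismatch count treats all positions equally, so the two flags for positions -3 and +4 are reported separately, since the text identifies those positions as having the strongest effect.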

In summary, the current work addressed the differences in translation efficiency among SARS-CoV sgRNAs, but these may not correlate with the actual steady-state levels of SARS-CoV proteins in infected cells, because the latter are also influenced by the abundance of each sgRNA, which results from different transcription levels and regulation, as well as by the different stabilities of individual viral proteins. Therefore, further studies are required to determine the relationship between the levels of transcription, translation, and relative protein abundance in SARS-CoV-infected cells. At the translational step, our data showed that translation from a downstream initiator codon by leaky scanning was common to SARS-CoV sgRNAs, and this could lead to synthesis of truncated viral protein products (if the downstream AUG is in the same reading frame) or altered proteins, which may act as decoys to fool the immune system and favor viral replication.

Nidovirales Virus taxonomy: Classification And Nomenclature of Viruses

Proc. Natl Acad. Sci. USA