key: cord-0001389-vnafx1ng authors: Lo, Michael K.; Søgaard, Teit Max; Karlin, David G. title: Evolution and Structural Organization of the C Proteins of Paramyxovirinae date: 2014-02-25 journal: PLoS One DOI: 10.1371/journal.pone.0090003 sha: 540ccda8bb32d2b9192782b878cc2759da109212 doc_id: 1389 cord_uid: vnafx1ng The phosphoprotein (P) gene of most Paramyxovirinae encodes several proteins in overlapping frames: P and V, which share a common N-terminus (PNT), and C, which overlaps PNT. Overlapping genes are of particular interest because they encode proteins originated de novo, some of which have unknown structural folds, challenging the notion that nature utilizes only a limited, well-mapped area of fold space. The C proteins cluster in three groups, comprising measles, Nipah, and Sendai virus. We predicted that all C proteins have a similar organization: a variable, disordered N-terminus and a conserved, α-helical C-terminus. We confirmed this predicted organization by biophysically characterizing recombinant C proteins from Tupaia paramyxovirus (measles group) and human parainfluenza virus 1 (Sendai group). We also found that the C of the measles and Nipah groups have statistically significant sequence similarity, indicating a common origin. Although the C of the Sendai group lack sequence similarity with them, we speculate that they also have a common origin, given their similar genomic location and structural organization. Since C is dispensable for viral replication, unlike PNT, we hypothesize that C may have originated de novo by overprinting PNT in the ancestor of Paramyxovirinae. Intriguingly, in measles virus and Nipah virus, PNT encodes STAT1-binding sites that overlap different regions of the C-terminus of C, indicating they have probably originated independently. 
This arrangement, in which the same genetic region encodes simultaneously a crucial functional motif (a STAT1-binding site) and a highly constrained region (the C-terminus of C), seems paradoxical, since it should severely reduce the ability of the virus to adapt. The fact that it originated twice suggests that it must be balanced by an evolutionary advantage, perhaps from reducing the size of the genetic region vulnerable to mutations. Paramyxovirinae is a large virus subfamily that contains 9 known human pathogens: measles virus, mumps virus, human parainfluenza viruses type 1 (hPIV1), 2, 3 and 4, Menangle virus, and the recently emerged, highly pathogenic Nipah and Hendra viruses [1] . Paramyxovirinae encode multiple proteins from the phosphoprotein (P) gene transcription unit, including P, V, and C. In almost all Paramyxovirinae, the P gene mRNA is edited, resulting in the expression of at least two proteins, P and V, which share an identical N-terminus (PNT), but have a unique C-terminus ( Figure 1A ) (for a review, see [2] ). In addition, several genera, including Morbilliviruses, Henipaviruses, and Respiroviruses, encode a third protein, C, within their P gene, from an overlapping reading frame [2] . The C proteins are expressed by a variety of mechanisms including: leaky scanning [3] [4] [5] , non-AUG start codons [6, 7] , ribosomal shunting [8] , and proteolytic processing [9] . The region of P that overlaps C, corresponding approximately to PNT ( Figure 1A) , is disordered [10] [11] [12] [13] , and contains conserved sequence motifs, such as soyuz1, found in all Paramyxovirinae, which binds the viral nucleoprotein, and soyuz2, of unknown function [14] . The two primary functions of the C proteins are their abilities to regulate viral transcription/replication and to antagonize the antiviral responses of the host. 
These functions are thought to be interconnected, since a decrease in viral transcription/replication often correlates with a decrease in the innate antiviral responses of the host [15] [16] [17] [18] (for a review, see [19]). Most paramyxoviral C proteins inhibit viral RNA synthesis, and thereby presumably regulate viral gene expression [20] [21] [22] [23] [24]. However, they differ in the degree to which they block host antiviral responses [25]. These responses are composed of two crucial signaling cascades: A) Induction of type I interferon (IFN), following recognition of virus-derived elements by pattern recognition receptors (PRRs) and B) IFN signaling through the JAK/STAT pathway, leading to transcription of antiviral effector genes [26, 27]. Most paramyxoviral C proteins can inhibit IFN induction, but only respiroviruses are known to inhibit IFN signaling. Morbillivirus C proteins have two mechanisms to counteract IFN induction: 1) by reducing levels of viral replication, which limits the production of viral patterns recognized by PRRs and prevents them from inducing IFN [17, 21, 28]; and 2) by inhibiting IFN transcription in the nucleus [29, 30]. An initial study reported that measles virus C protein blocks IFN signaling [31], but subsequent studies indicated that this effect is not significant [17, 32, 33]. Similarly, although the mechanistic details are less clear, Henipavirus C proteins block IFN induction by decreasing viral RNA synthesis, which indirectly inhibits type I IFN induction; but they have minimal effects on IFN signaling [15, [34] [35] [36] [37]. Like the morbilliviruses, Respirovirus C proteins also counteract IFN induction through two mechanisms: 1) by minimizing production of double-stranded RNA (dsRNA), thereby avoiding PRR activation [16, 38]; and 2) by inhibiting IRF3-dependent induction of type I IFN [39].
However, the C proteins of respiroviruses differ from those of Morbilliviruses and Henipaviruses in being also able to inhibit IFN signaling [16, 26, [38] [39] [40] [41] [42] [43] [44] [45] [46] [47] [48] [49] [50] [51] [52] [53] . Finally, a new role has been reported recently for the C proteins of respiroviruses: they regulate the levels of viral genomes and antigenomes produced during infection [54] . Interestingly, henipaviruses and morbilliviruses can also block IFN signaling, but do so by proteins encoded by the P frame rather than the C frame (i.e. P, V, or a third protein called W), which interfere with the localization or phosphorylation of STAT1 (Signal Transduction Activator of Transcription 1), among other mechanisms [55] [56] [57] [58] [59] [60] [61] [62] . Overlapping genes, such as those encoding P and C, are of particular interest because they encode proteins originated de novo (in contrast to origination by well-characterized processes such as gene duplication or horizontal gene transfer [63, 64] ). Indeed, overlapping genes are thought to arise by overprinting, a process in which mutations within an existing (''ancestral'') protein-coding reading frame allow the expression of a second reading frame (the de novo frame), while preserving the expression of the first frame [65] [66] [67] . De novo proteins have been little studied but are known to play an important role in viral pathogenicity [68, 69] , for instance by neutralizing the host interferon response [70] or the RNA interference pathway [71] . In addition, de novo proteins characterised so far have previously unknown 3D structural folds [68, 71, 72] and novel mechanisms of action [71] . Thus, this class of proteins may challenge the notion that nature only utilizes a limited number of different protein folds and that this fold space is well mapped [73, 74] . 
Another particularly interesting feature of overlapping genes is the evolutionary paradox they present, since the overlap imposes sequence constraints which should restrict the ability of the virus to adapt [75] [76] [77] [78] [79] [80] [81] . Our study was divided in three strands. First, we predicted the structural organization of the C proteins, and determined whether they had detectable sequence similarity, which could indicate a common origin, guide experimental studies, and facilitate 3D structure determination [82] . Second, we verified our predictions experimentally, by expressing, purifying and characterizing several C proteins in bacteria. Third, we investigated the evolutionary history of the P/C gene overlap, and tried to determine which, of P and C, is the novel frame. The accession numbers of the sequences of Paramyxovirinae P used in this study, as well as the abbreviations of species names, are in Table 1 . The sequence of the C protein of Pacific salmon paramyxovirus [83, 84] was generously made available by Bill Batts and Jim Winton. We used Psi-Coffee [85, 86] for multiple sequence alignments (MSAs). All alignments are presented using Jalview [87] with the ClustalX colouring scheme (see Figure 2b and 2d in [88] ). The aligned sequences of the C proteins in text format are in File S1. We used two criteria to estimate the reliability of alignments of the C proteins: 1) the CORE reliability index, which is based on the agreement between the different alignment programs used by Psi-Coffee, and is part of the standard output of Psi-coffee [86] ; 2) in the case of the measles and Nipah groups, we also considered the coherence between the alignments of either group separately and the alignment of both groups. We considered as not reliably aligned the positions that either have a low Psi-coffee CORE index, or are not aligned in the same way in these alignments. 
Finally, we used TranslatorX [89] to generate a nucleotide alignment of the P/C gene corresponding to an amino acid alignment of the C protein. The alignment of the C proteins (not shown) was created using the MUSCLE program [90] built into TranslatorX, and is thus slightly different from that generated by Psi-Coffee, mainly in the region between Eα2 and S/Tα4. This has no impact on the results presented.

The secondary structure of individual sequences was predicted using Jpred [91], and was verified in the context of multiple alignments using PROMALS [92]. We predicted disordered regions with MetaPrDOS [93], according to the principles described in [94]. We used HHalign [95] to compare the MSAs of the C proteins of various groups, with a cutoff E-value of $10^{-5}$.

To identify and cluster homologous C proteins, we performed iterative sequence searches [96] on the C proteins of each taxon, using csi-blast [97] and HHblits [98] with a cutoff E-value of $10^{-3}$, as described in [99]. We identified 5 subgroups of homologs (Figure 1), formed by the following taxa: 1) the genus morbillivirus and Salem virus; 2) Tupaia Paramyxovirus, Mossman virus, and Nariva virus; 3) the genus henipavirus; 4) the newly proposed genus jeilongvirus; and 5) the two genera respirovirus and aquaparamyxovirus (called "Sendai group"). Several proteins of subgroups 1 and 3 had a subsignificant ($E > 10^{-3}$) similarity with proteins of subgroups 2 and 4, respectively, indicating that these subgroups may be homologous [99]. We confirmed their homology by using HHalign [95] ($E = 5 \times 10^{-11}$ for the comparison between subgroups 1 and 2, and $E = 2 \times 10^{-9}$ for the comparison between subgroups 3 and 4). We called the combination of subgroups 1 and 2 "measles group" and the combination of subgroups 3 and 4 "Nipah group".

To maximize our chances of successfully expressing C proteins, we adopted a high-throughput approach.
We cloned full-length synthetic cDNAs (obtained from Genscript) of the C proteins of all 24 species in the measles, Nipah and Sendai groups into the vector pOPIN-F [100] using the InFusion procedure, as described in [100, 101]. The resulting fusion proteins have an N-terminal hexahistidine tag followed by a 3C cleavage site immediately upstream of the coding sequence of the C proteins. Proteins were expressed in the bacterium Escherichia coli (E. coli) using the BL21(DE3) Rosetta pLysS strain (Novagen), following the ZYM-5052 auto-induction protocol [102]. Briefly, large-scale cultures were inoculated to an OD600 of 0.02 and grown for 16 h at 25 °C. Cells were harvested and the pellet resuspended 1:3 (w/vol) in lysis buffer (50 mM TrisHCl, 500 mM NaCl, 30 mM Imidazole pH 8.0, 1% vol/vol Protease inhibitor mix (Sigma P8849)) and frozen in liquid nitrogen before storage at −80 °C.

Purification of the C Proteins

We purified both C proteins in two steps: Nickel Immobilized Metal Affinity Chromatography (IMAC) followed by size-exclusion chromatography (SEC). Pellets were thawed and homogenized (Constant Systems homogenizer) at 25 kpsi at 4 °C. The lysate was cleared at 50,000 g for 30 minutes before batch incubation of the supernatant (i.e. the soluble fraction of bacteria) on Ni-NTA sepharose FF resin (Qiagen) for 2 hrs at 4 °C. The material was collected in an Econo-Pac column (Biorad) and washed in 100 Column Volumes (CV) of lysis buffer. Elution was done in 0.5 CV fractions with lysis buffer containing 500 mM Imidazole. Fractions containing protein were pooled and loaded onto a preparative Superdex 75 (GE Healthcare Life Sciences) size exclusion column pre-equilibrated in 20 mM Tris, 150 mM NaCl, 1 mM EDTA, pH 7.5. Peak fractions were pooled and concentrated using 15 ml spin concentrators (Millipore). Protein samples were extensively dialyzed into 20 mM NaPhosphate, 20 mM NaCl pH 7.5 and then concentrated to 0.2 mg/ml in spin concentrators (0.5 ml, 3 kDa MWCO, Millipore).
The Circular dichroism (CD) analysis was done on a JASCO 815 CD spectropolarimeter. Data are averages of 5 independent scans in the 190 nm - 250 nm range, and were normalized to the baseline of the dialysis buffer. The data were smoothed using the manufacturer's software (Jasco SpectraManager) before interpretation. The percentage of α-helix was calculated according to the formula: $\text{percentage of } \alpha\text{-helix} = (\theta_{208} - 4000)/(-33000 - 4000) \times 100$, where $\theta_{208}$ is the ellipticity at 208 nm [103].

From 1 mg/ml protease stocks, we made 10-fold serial dilutions in 20 mM Hepes, 50 mM NaCl, 10 mM MgSO4, pH 7.5. Proteins were concentrated to 0.6 mg/ml by spin concentrators (0.5 ml, 3 kDa MWCO, Millipore). For limited proteolysis, 10 µl of protein was mixed with 3 µl of protease and incubated on ice for 30 min, 60 min or 2 hrs. Reactions were stopped by adding 2 µl protease inhibitor mix (Sigma P8849). To each reaction, 5 µl of 4x SDS-PAGE sample buffer was added and samples were heated to 95 °C for 2 min before loading on a 1 mm 15% SDS-PAGE gel. A subtilisin digest of hPIV1 C and an α-chymotrypsin digest of Tupaia PMV C gave rise to stable fragments which were blotted to PVDF before submitting the samples for N-terminal sequencing (ALTA bioscience, UK).

Analytical size exclusion chromatography (SEC) was performed at a flow-rate of 0.5 ml/min using a Superdex 75 10/300 column (GE Healthcare Life Sciences) pre-equilibrated in 20 mM TrisCl, 150 mM NaCl, 1 mM EDTA pH = 7.9. The column was calibrated with a separate run of appropriate globular marker

The C Proteins of Paramyxovirinae Cluster in three Groups: the Measles, Nipah and Sendai Groups

On the basis of sequence analyses (see Methods), the C proteins of Paramyxovirinae can be divided into three groups: the measles, Nipah and Sendai groups (Figure 1B). The measles group is composed of morbilliviruses, of the unclassified Salem virus, and of a subgroup comprising the unclassified Tupaia paramyxovirus, Mossman virus and Nariva virus.
The Nipah group comprises henipaviruses and jeilongviruses. Finally, the Sendai group is composed of respiroviruses and of the recently described genus aquaparamyxovirus, composed of fish viruses [83, 84, 104] related to respiroviruses [105, 106]. The classification of C into measles and Nipah groups is supported by an examination of the PNT domain of P, which is encoded by the same region as C but in a different frame (Figure 1A). Indeed, the PNT of all species in the Nipah group differ from the PNT of the measles group in having a soyuz2 motif (see Introduction) [14]. We found that other Paramyxovirinae that do not express a C frame [2, 107] can be classified in two groups based on the phylogeny of their P gene: the mumps group (comprising the sister genera rubulavirus and avulavirus) and the Fer de lance group (formed by the genus ferlavirus [108]). This classification corresponds to that of previous analyses [105].

We separately aligned the C proteins of the measles, Nipah and Sendai groups (Figures 2, 3, and 4 respectively; the aligned sequences in text format are in File S1). In these three groups, we observed a similar organization of the C proteins, composed of a variable N-terminus predicted to be disordered, and of a C-terminus predicted to be ordered and α-helical. We compared these alignments to each other using the profile-profile comparison software HHalign [95] (see Methods). Briefly, a sequence profile is a representation of a multiple alignment that contains information about which amino acids (aas) are "tolerated" at each position of the alignment, and with what probability. Comparing profiles is much more sensitive than comparing single sequences, because the profiles contain information about how the sequences can diverge and thus can identify weak similarities which remain after both sequences have diverged [99, 109, 110].
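To make the notion of a sequence profile concrete, the following minimal Python sketch computes, for each column of a toy alignment, the relative frequency of each residue. It is only an illustration of the idea described above: the function name `column_profile` and the four toy sequences are hypothetical, and the sketch omits the sequence weighting, pseudocounts and gap/transition states that profile-profile tools such as HHalign rely on.

```python
from collections import Counter


def column_profile(alignment):
    """Return one dict per alignment column mapping residue -> relative frequency.

    Gap characters ('-') are ignored when computing frequencies.
    This is a plain frequency profile (an assumption for illustration),
    not the profile HMM used by HHalign.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    profile = []
    for col in range(length):
        residues = [seq[col] for seq in alignment if seq[col] != "-"]
        counts = Counter(residues)
        total = sum(counts.values())
        # An all-gap column yields an empty dict.
        profile.append({aa: n / total for aa, n in counts.items()} if total else {})
    return profile


# Hypothetical toy alignment (not taken from the C protein alignments).
toy_alignment = ["YLEK", "YIEK", "YLDR", "Y-EK"]
profile = column_profile(toy_alignment)

for position, frequencies in enumerate(profile, start=1):
    print(position, frequencies)
```

In this toy case the first column tolerates only Y, whereas the second tolerates both L and I; a profile keeps that information, which a single sequence cannot.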
Table 2 (fragment; rows as recovered from the source text).
Series of deletions in aa149-157: Loss of nuclear translocation of Y1 by Ran-GTPase pathway [120]
K151A/E153L/R157L (Cm*): Increased IFN-β induction and dsRNA production, induction of antiviral state, increased CPE, apathogenic in vivo; Inability to bind STAT1, ablated ability to inhibit RNA synthesis, decreased binding to viral polymerase (L protein) [46, 118]
K77R/D80A (Cm2'), D80A: Increased cytopathic effect, increased nuclear translocation of IRF3, increased IFN-β induction and production of dsRNA [39]
K151A/E153A/R154A (Cm5): Attenuated virulence in vivo, inability to block IFN signaling, inability to inhibit replication, inability to skew STAT1/2 phosphorylation and to bind STAT1, decreased binding to L protein [39, 46, 118, 127]
Human parainfluenza virus 1 (hPIV1): Increased IFN-β production, increased IRF3 nuclear translocation, reduced plaque sizes, non-temperature sensitive mutation contributing to attenuation in vivo [43, 157]
These studies used either recombinant viruses, minigenome systems, or eukaryotic expression systems. Substituted residues that are conserved in a group are in bold. For a more comprehensive list of studies on Paramyxovirinae C, please see Table S1. doi:10.1371/journal.pone.0090003.t002

HHalign reported that the C proteins of the measles and Nipah groups have statistically significant similarity ($E = 4 \times 10^{-6}$) over a region of about 50 aa in their C-terminus (shown in Figure 5). This high similarity could in theory result either from convergent evolution or from homologous descent. The fact that the measles and Nipah groups are phylogenetically related [105], and that their C proteins are encoded in the same genomic location makes homologous descent a much more likely explanation. On the other hand, HHalign did not detect any similarity between the C proteins of the Sendai group and those of the measles and Nipah groups.
Thus, either they are not homologous, despite their similar organization, or they are homologous but have diverged in sequence beyond recognition. The latter scenario is possible, in theory, since the relative frame of C compared to P (+1) is the same in the Sendai group and in the measles/Nipah groups (Figure 1A).

Figures 2 and 3 present alignments of the C proteins of the measles and Nipah groups, respectively. Above the alignments, we indicated regions of C that overlap conserved motifs of the P frame. The C proteins of the measles and Nipah groups are all composed of a 30-60 amino acid (aa) N-terminus predicted to be at least partially disordered, and of a 90-120 aa C-terminus comprising a predicted α-helix (α1), a loop of 10-20 aa ("loop 1-2"), and three further α-helices (α2 to α4), followed in some species by C-terminal extensions of at most 20 aa (forming helix α5 in some species of the Nipah group). In the C proteins of the measles group, only the region from α2 to α4 is well conserved in sequence; it contains many conserved positions (Figure 2), of which six (boxed) are also conserved in the C proteins of the Nipah group (see below). In contrast, the C proteins of the Nipah group contain two additional, conserved regions (Figure 3): 1) a short N-terminus with α-helical potential (α0, aa 2-19 in Nipah virus), containing a hydrophobic region followed by a basic region (boxed in Figure 3); and 2) a short region at the C-terminus of α1 (aa 74-83 in Nipah virus) that contains two conserved acidic positions (E/D). The apparent conservation of other regions of C, which overlap the soyuz1 and soyuz2 motifs of the P frame (Figure 3), should not be overinterpreted, since it may be due to constraints imposed by selection pressures acting in fact on the P frame, which is much more conserved than the C frame in these regions (not shown).

An alignment of the C proteins of both groups (Figure 5) revealed four remarkable positions conserved in nearly all viruses (boxed in Figure 5): a Tyrosine (Y) upstream of helix α2 (Yα2); a Glutamate (Eα2) at the C-terminus of the same helix; a residue with an alcohol group (Serine/Threonine, S/Tα4) at the N-terminus of helix α4; and a Glutamate (Eα4) two residues downstream. Two other positions of hydrophobic nature (indicated by "h") are conserved in both groups. These conserved residues are also boxed in Figures 2 and 3, in the separate alignments of the measles and Nipah groups. Other positions that appear conserved in Figure 5 or in Figures 2 and 3 may in fact not be reliably aligned (see Methods) and are therefore not boxed.

Figure 4 (caption, partially recovered). (...) Figure 2. Numbering corresponds to the C protein of Sendai virus. Arrows indicate the start of the different isoforms of C. For information, the arrowhead indicates the well-characterized F residue of respiroviruses (F170 in Sendai virus), whose substitution by S reduces innate immune antagonism and attenuation of in vivo pathogenesis by C [39, 53, [153] [154] [155] (see Table S1). The N-terminal sequence of the fragment of hPIV1 C obtained after limited proteolysis is underlined. The variable region between basic region 1 and residue G89 is not reliably aligned and is presented for information only. doi:10.1371/journal.pone.0090003.g004

Figure 4 shows the alignment of the C proteins of the Sendai group. In Sendai virus and human parainfluenza virus 1 (hPIV1), as many as four products (C', C, Y1, Y2) are expressed from the C reading frame by a combination of alternative initiation codons [6] [7] [8] and proteolytic processing [9]. Their respective N-termini are indicated by arrows. The C proteins of the Sendai group have a similar organization to that of the measles and Nipah groups. They are composed of a variable, disordered N-terminus of about 80 aa, rich in Prolines (P), Serines (S) and Threonines (T), followed by a conserved C-terminus composed of four α-helices (αA to αD).
The N-terminus contains a basic region (boxed in Figure 4) within a predicted α-helix (αZ), like the C protein of the Nipah group (Figure 3). In the C protein of Sendai virus, the first half of αZ was reported to act as a membrane-targeting signal, perhaps by forming an amphipathic α-helix [111]. There are 11 residues strictly conserved in C across the Sendai group, clustered predominantly in the C-terminus of αC and in αD. αC is particularly rich in K and R ("basic region 2" in Figure 4), suggesting it might bind a negatively charged partner.

We present in Figure 6 a summary of the structural and functional organization of PNT and C in the different taxa of Paramyxovirinae, to scale, with their functional motifs positioned relative to each other. PNT contains sequences that bind the protein STAT1 in several morbilliviruses (measles virus [55, 56], canine distemper virus [57], Rinderpest virus [60]) and henipaviruses (Nipah virus [58] and Hendra virus [59]). The region of PNT that contains these sites is highly variable in sequence (Figure 7), and thus its alignment is not reliable. In contrast, the overlapping region of C is well conserved, and its alignment reliable (Figure 5). Therefore, we used the C frame to construct a reliable alignment of PNT. We proceeded in two steps (see Methods). First, we used the amino acid alignment of the C proteins (Figure 8, top panel) to generate an alignment of the nucleotide sequences of the P/C gene (Figure 8, middle panel and File S2), using TranslatorX [89]. Second, we translated this nucleotide alignment into an amino acid alignment in the P frame (Figure 8, bottom panel). The resulting alignment of PNT of the measles and Nipah groups is presented in Figure 9.
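The two-step procedure described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the TranslatorX implementation: the helper names `thread_codons` and `translate_shifted`, the toy sequences, and the convention of assigning each shifted-frame residue to the column of the codon in which it starts are all hypothetical choices. The `offset` parameter (number of nucleotides by which the second frame is shifted with respect to the aligned frame) is likewise an assumption to be set according to the gene being analysed.

```python
# Standard genetic code, built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def thread_codons(aligned_protein, cds):
    """Step 1: build a codon-level nucleotide alignment from a protein alignment.

    Each residue is replaced by its codon from the ungapped coding sequence,
    and each gap by '---'.
    """
    out, pos = [], 0
    for residue in aligned_protein:
        if residue == "-":
            out.append("---")
            continue
        codon = cds[pos:pos + 3]
        if CODON_TABLE.get(codon) != residue:
            raise ValueError(f"codon {codon!r} does not encode {residue!r}")
        out.append(codon)
        pos += 3
    return "".join(out)


def translate_shifted(aligned_nt, offset):
    """Step 2: translate a codon-aligned sequence in a frame shifted by `offset`.

    The residue whose codon starts `offset` nucleotides into a given codon of
    the aligned frame is placed in that codon's column; gap columns stay gaps,
    and an incomplete final codon is reported as '-'.
    """
    ungapped = aligned_nt.replace("-", "")
    out, codon_index = [], 0
    for col in range(0, len(aligned_nt), 3):
        if aligned_nt[col:col + 3] == "---":
            out.append("-")
            continue
        start = 3 * codon_index + offset
        codon = ungapped[start:start + 3]
        out.append(CODON_TABLE.get(codon, "X") if len(codon) == 3 else "-")
        codon_index += 1
    return "".join(out)


# Hypothetical toy example (not a paramyxovirus sequence).
aligned_nt = thread_codons("MA-EK", "ATGGCAGAGAAG")
print(aligned_nt)                        # codon-level alignment
print(translate_shifted(aligned_nt, 2))  # same columns, other reading frame
```

Because the columns are inherited from the well-conserved frame, the second frame is aligned without relying on its own, more variable, sequence.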
From the reliable alignment of PNT corrected by using the C frame (Figure 9 ), we made three observations: i) The STAT1-binding sites of measles virus and Nipah virus PNT are conserved in sequence only in very closely related species (thick boxes in Figure 9 ). For instance, in PNT of Feline morbillivirus, which is more distantly related to measles virus than other morbilliviruses, only 2 aa out of 11 (E110 and I116) correspond to conservative substitutions with respect to the STAT1-binding motif of measles virus (Figure 9 ). Such a high number of non-conservative substitutions within a short peptide suggests that it may not bind STAT1. ii) The STAT1-binding sites of measles virus and Nipah virus PNT are not aligned together ( Figure 9 ) (although they overlap slightly, by 4aa), which indicates that they are encoded in different locations of the P/C gene. It is thus highly likely that they have originated independently (see Discussion). iii) The STAT1-binding sites of measles virus and Nipah virus PNT have some limited sequence similarity, as reported earlier [58] : 7 and 9 . However, this similarity is unlikely to be due to homologous descent, since the motifs are not aligned together in the reliable alignment of PNT ( Figure 9 ). Likewise, the tyrosine residues immediately upstream of this motif (Y110 in measles virus PNT, critical for STAT1 inhibition [33, 55, 112, 113] , and Y116 in Nipah virus PNT), which were perceived to occur in a similar sequence context [58] , are not aligned together either in the reliable alignment of PNT (Figure 9 ), indicating that they are not homologous either. Finally, we also noticed an 8aa motif (aa 104-111 in Nipah virus) conserved in the PNT of all henipaviruses (Figure 9 , thin box). We called this motif ust1 (for ''upstream of STAT1''). Its function is unknown, though aa 81-113 of Nipah virus P, which include ust1, are required for the synthesis of viral RNA [58] . 
We cannot exclude, however, the possibility that the conservation of ust1 is due to constraints imposed by the overlapping C frame.

We systematically examined mutational studies of Paramyxovirinae C and their phenotypic impact. The most relevant studies are in Table 2 and a more extensive list of studies is in Table S1. We found that very few conserved positions identified herein have been subjected to targeted mutagenesis; notable substitutions are indicated in bold in Figures 2 and 4. In the measles group, experimental substitutions have been performed mostly in the C-terminus of C. In a comparison of a temperature-sensitive strain of measles vaccine, AIK-C, with its parental strain, Edmonston [114], one of several substitutions identified, S134Y, occurs in the S/Tα4 position conserved in the measles and Nipah groups (Figures 2 and 5) (Table 2). Although this particular substitution is not responsible for the temperature-sensitive phenotype [114], we note that it is located within a 12 aa peptide (aa 127-138) recently shown to inhibit the viral polymerase by interacting with SHCBP1 (Shc Src homology 2 domain-binding protein 1) [115]. This peptide, underlined in Figures 2 and 5, contains two other positions conserved in the measles/Nipah groups (a hydrophobic residue and Eα4). Such conservation suggests that other viruses in the measles/Nipah groups may also bind SHCBP1 to block the viral polymerase. Finally, the role of the disordered N-terminus of measles virus C is poorly known, although it contributes to nuclear localization, which correlates with its ability to block IFN induction [29] (Table 2). In the Nipah group, there are no fine mutational data published, but it is known that both the N-terminus and the C-terminus of Nipah virus C are required to inhibit minigenome replication [116].
In the Sendai group, experimental substitutions have delineated multiple residues in the C-terminus of C responsible for antagonizing both IFN induction and IFN signaling, and for regulating viral transcription and replication [46, 49, 117, 118] (Table 2 and Table S1). For both Sendai virus and hPIV3, the minimal region required for STAT1-binding corresponds to the structured, well-conserved C-terminus of C [117, 119]. Within that domain, aas 149-157 (corresponding roughly to basic region 2, underlined in Figure 4) are critical for nuclear translocation of the Y1 isoform of Sendai virus C, and may also play a role in the inhibition of type-I IFN-stimulated gene expression [120]. This region contains several conserved residues, suggesting that its function may be conserved in the Sendai group. Studies of the N-terminus of C in the Sendai group indicate that it also contributes to antagonizing the innate immune response and to regulating viral transcription and replication [121, 122] (Table 2 and Table S1). Taken together, these studies suggest that both the N- and C-terminus of Sendai group C proteins may need to act in

In order to check our predictions of structural organization, we attempted to characterize biophysically at least one C protein of the measles/Nipah groups and one of the Sendai group. We systematically tested, in the bacterium E. coli, the expression and solubility of the C proteins of all species in the measles, Nipah and Sendai groups (see Methods). We found that the C proteins of Tupaia paramyxovirus (Tupaia PMV) and of hPIV1 were by far the best candidates, for the measles/Nipah groups and Sendai group respectively, in terms of yield and solubility (not shown). We expressed both proteins as hexahistidine-tagged N-terminal fusion proteins in Escherichia coli and purified them from the soluble fraction by immobilized metal affinity chromatography (IMAC) and size exclusion chromatography (SEC) (see Methods).
Mass spectrometry confirmed that the C proteins had the exact expected mass. In SDS-PAGE analysis (Figure S1), hPIV1 C migrated at a notably larger size (~31 kD) than expected (25.9 kD), while Tupaia PMV C migrated at ~21 kD, only slightly above the expected size (19.7 kD). This anomalous migration may be caused by regions that are disordered or have a biased aa composition [123]. Accordingly, the N-terminus of both proteins is predicted disordered, and has a biased composition in the case of hPIV1 C.

We analyzed the secondary structure of the C proteins by Circular Dichroism (CD). The CD spectrum of both proteins (Figure 10) is typical of α-helical content [124], with two dips in ellipticity at around 208 and 222 nm. The estimated α-helical content was 57% for hPIV1 C and 33% for Tupaia PMV C (see Methods). We also examined the C proteins by analytical SEC (Figure 11). Tupaia PMV C elutes at an apparent molecular mass of 21.4 kDa, close to its theoretical mass of 19.7 kDa. In contrast, hPIV1 C elutes at a much larger MW (38.7 kDa) than expected (25.9 kDa). This discrepancy could correspond to an extended shape, or to self-association in a fast equilibrium between a monomeric and dimeric form (see below).

We used limited proteolysis combined with N-terminal sequencing to probe the structural organization of the C proteins of hPIV1 and Tupaia PMV. We tested a range of proteases with different substrate requirements (see Methods), and identified fragments resistant to proteolysis, indicative of folded domains. Digestion of hPIV1 C by subtilisin yielded a stable degradation product of around 14 kD (Figure 12, left panel), whose N-terminal sequence, starting at aa 104, is underlined in Figure 4. The size of this fragment indicates that it comprises the whole C-terminus of C (expected size 14.16 kD), which corresponds well to our sequence predictions (Figure 4).
These results are also consistent with cellular experiments that identified a proteolysis-sensitive N-terminus in the C' proteins of Sendai virus [125]. We note that the presence of a long, disordered region in hPIV1 C is compatible with its high apparent molecular weight observed in SEC (see above) [126].

Digestion of the C protein of Tupaia PMV by α-chymotrypsin yielded a series of bands ranging from 14 kDa to 6 kDa (Figure 12, right panel); further digestion (not shown) yielded a single 6 kDa fragment. We obtained N-terminal sequences of the three most abundant fragments, of ~14.4, 13, and 6 kDa (arrows to the right of Figure 12). They start respectively at aa 30, 43 and 84. This pattern of proteolytic digestion indicates that Tupaia PMV C is composed of a disordered N-terminus and of an ordered C-terminus. This is compatible with our predictions, in which aa 1-56 are devoid of secondary structure (Figure 2) and aa 1-42 are disordered, and in which a predicted loop, α1-2 (aa 81-92), could be accessible to proteolysis. The observed fragments of 14.4 and 13 kDa correspond exactly to C proteins where aa 1-29 and 1-43, respectively, have been digested, whereas the size of the smaller fragment (6 kDa) corresponds to aa 81-135, indicating that the last 18 C-terminal aa are digested upon extended proteolysis.

In summary, our experiments confirm that in vitro, the C proteins of hPIV1 and Tupaia PMV are predominantly α-helical and contain a disordered N-terminus, whose boundaries are in good agreement with our sequence-based predictions.

Substituting the conserved, charged residues we have identified herein should be a powerful way to dissect the function of C. Indeed, charged residues are often on the surface of proteins and thus their conservation is generally the result of functional constraints, rather than constraints imposed by a mere structural role.
The power of this approach has been shown by studies on several regions of respirovirus C [39, 46, 127], and our thorough sequence analysis of the full-length C proteins of all Paramyxovirinae should greatly extend its applicability. In addition, knowing the structural organization of C will allow the design of deletions that have less risk of disrupting its three-dimensional structure.

The C proteins of the Sendai group have no detectable sequence similarity with those of the measles/Nipah groups. However, we consider it unlikely that they have an independent origin, because they are located in the same region of the P gene, in the same frame relative to P, and have a similar structural organization and several similar functions [118, 128, 129]. Thus we consider that all C proteins most probably have a common origin, as proposed earlier [67, 130]. The absence of a C protein in the mumps group is probably due to a loss in the ancestor of that group, since the Sendai group, which has a C protein, is basal in a phylogeny of the P gene [105]. This common origin would imply that in Sendai virus, it is the Y1 isoform of C that is the equivalent of C of the measles/Nipah groups, because their start codons have the same location immediately upstream of the soyuz1 motif of the P frame (Figure 6; compare also Figure 4 and Figure 3). Therefore, the C and C' proteins of Sendai virus would presumably have originated by mutations creating new, alternative start codons upstream of Y1. A common origin of Paramyxovirinae C proteins would also imply that the basic regions in the N-terminus of C have originated independently in the Sendai and Nipah groups, since they occupy different positions with respect to soyuz1 (Figure 6).

Which Frame Originated Earlier, PNT or C?

Overlapping genes typically encode an ancestral frame and a novel frame originated by overprinting it (see Introduction).
Our analyses in this work and in an earlier study [14] suggest that the C and PNT frames were probably both present in the ancestor of Paramyxovirinae, making it impossible to conclude which frame is ancestral on the basis of phylogeny. Analysis of codon usage [131] cannot determine which frame is ancestral either, because the codon usages of PNT and C are indistinguishable in Paramyxovirinae (Angelo Pavesi, personal communication). However, functional considerations suggest that the PNT frame originated earlier, since it is indispensable to viral replication in vitro [2, 132], unlike C [20, 133, 134]. The ancestry of PNT is supported by a comparison with families related to Paramyxovirinae (Mononegavirales). Most Mononegavirales also encode P proteins with a disordered N-terminus [14, 135]; at least in Rhabdoviridae, this N-terminus has the same function as Paramyxovirinae PNT, i.e. preventing the nucleoprotein from self-assembling illegitimately [136] [137] [138] [139]. Thus, it is reasonable to speculate that the P of the ancestral Mononegavirales already had a disordered N-terminus, which was overprinted by C in the ancestor of Paramyxovirinae.

The STAT1-binding sites of measles virus and Nipah virus do not align with each other in the reliable alignment of PNT, generated using the C frame (Figure 9). This strongly suggests that they have originated independently. Alternatively, since they overlap by 4 aa (Figure 9), these STAT1-binding sites might, in theory, have originated from a common, short peptide, providing some STAT1-blocking capability, and later have extended respectively upstream and downstream of PNT. However, this scenario is not parsimonious because it would imply several losses in the lineages separating measles virus and Nipah virus. Also, the common 4 aa stretch is chemically very different in the two viruses (G117EAV in measles virus and V115YHD in Nipah virus, Figure 9).
We thus consider it most likely that the STAT1-binding sites of measles virus and Nipah virus have originated independently. Their limited sequence similarity (they share an [Y/H]DH[S/G]GE motif, underlined in Figure 9) would thus not be the result of homologous descent, but could instead result either from convergent evolution (owing to a common mechanism), or from random chance. Convergent evolution seems a definite possibility, since the mechanisms by which PNT acts are somewhat similar in both viruses (PNT interferes with the phosphorylation of cytoplasmic STAT1) [55, 58, 112, 140], and since the PNT of both viruses bind a similar part of STAT1 [141].

Figure 13. Three patterns of sequence constraints in the overlapping frames P and C. PNT and C are represented vis-à-vis each other with the same conventions as in Figure 6. Sequence constraints of PNT and C were estimated by their sequence variability. doi:10.1371/journal.pone.0090003.g013

Overlapping genes are an evolutionary paradox, because they simultaneously encode two proteins whose freedom to mutate is constrained by each other, which should severely reduce the ability of the virus to adapt [75] [76] [77] [78] [79] [80] [81]. A first key to the paradox has been suggested earlier [67, 77, 78, [142] [143] [144] [145] [146]: overlapping genes frequently encode an "ancillary" frame that can tolerate a higher substitution rate than the other, "dominant" frame; the ancillary frame is often structurally disordered [68]. Accordingly, a previous sequence analysis of Sendai virus indicated that PNT and C are generally not both under strong constraint [142]; rather, the N-terminus of PNT is markedly more conserved than that of C, whereas the C-terminus of PNT is markedly less conserved than that of C [142].
This is also the case for most of the PNT and C of measles and Nipah virus (Figure 13, evolutionary pattern 1 or 2), with the exception of the region corresponding to the STAT1-binding sites of PNT (see below).

A second key to the paradox of overlapping genes is that it may be beneficial for a virus, under certain conditions, to encode functional motifs simultaneously by using overlapping frames [147]. Initially, we were very surprised to discover that a region of the P/C gene encodes simultaneously, in different frames, two well-conserved regions: the STAT1-binding motif of PNT, and the α2-α4 region of C (Figure 13, evolutionary pattern 3). Intuitively, this arrangement seems to dramatically restrict the capacity of the virus to mutate and to escape host defenses. We were all the more surprised that this arrangement originated twice independently, in measles virus and in Nipah virus (see Figure 6). This seems beyond coincidence, and strongly suggests that the loss of fitness of the virus due to its reduced ability to mutate is compensated by an evolutionary advantage. In fact, this phenomenon had been predicted on the basis of mathematical modeling [147]. Given a high mutation rate, it may be advantageous to encode crucial functional motifs in overlapping frames (provided that they are short), because the superposition of critical amino acids reduces the number of vulnerable positions in the genome. The conditions of application of the model are met here: RNA viruses have some of the highest mutation rates of all organisms [148], and the STAT1-binding sites are short (10-26 aa). It will be interesting to investigate whether this evolutionary pattern, in which two reading frames are both under strong constraint, is common in viruses, and whether it does entail a selective advantage. The genome of Hepatitis B virus, for instance, also contains short regions where both the overlapping Polymerase and Glycoprotein frames are under strong constraint [149, 150].
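The counting argument above (superposing two constrained motifs reduces the number of vulnerable genome positions) can be illustrated with a deliberately simple sketch. It is not the mathematical model of [147]: it only counts nucleotide positions covered by at least one constrained motif, ignores the one- or two-nucleotide offset between frames, and treats every position in a motif as equally vulnerable. The function name and the 50-aa length used for the constrained region of C are hypothetical; only the 26-aa upper bound for a STAT1-binding site comes from the text.

```python
def constrained_nucleotides(motif_a_aa, motif_b_aa, overlap_aa):
    """Count nucleotide positions covered by at least one constrained motif.

    motif_a_aa, motif_b_aa: motif lengths in amino acids (3 nt per codon).
    overlap_aa: length, in codon equivalents, of the stretch that encodes
        both motifs in different frames (0 means the motifs are encoded
        in separate, non-overlapping regions).
    """
    if min(motif_a_aa, motif_b_aa) < 0:
        raise ValueError("motif lengths must be non-negative")
    if not 0 <= overlap_aa <= min(motif_a_aa, motif_b_aa):
        raise ValueError("overlap cannot exceed the shorter motif")
    # Union of two intervals = sum of lengths minus the shared part.
    return 3 * (motif_a_aa + motif_b_aa - overlap_aa)


if __name__ == "__main__":
    stat1_site_aa = 26   # upper bound quoted in the text (10-26 aa)
    c_region_aa = 50     # hypothetical length, for illustration only

    separate = constrained_nucleotides(stat1_site_aa, c_region_aa, 0)
    superposed = constrained_nucleotides(stat1_site_aa, c_region_aa, stat1_site_aa)
    print(f"encoded separately: {separate} constrained nt")
    print(f"fully superposed:   {superposed} constrained nt")
```

In this toy setting, fully superposing the shorter motif on the longer one removes three vulnerable nucleotide positions per superposed residue. Whether that saving outweighs the cost of mutations that now affect two proteins at once is exactly what a quantitative model has to address; the sketch makes no claim about it.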
A recent innovative methodology that combines experimental and computational approaches [151] could help to tease out the different factors (structural, functional and co-evolutionary) constraining overlapping motifs.

Finally, a third key to the paradox of overlapping genes is that they provide a regulatory advantage that may offset the increased constraints they impose on the virus, by encoding two proteins that are co-regulated and have complementary functions [131]. For instance, the expression levels of the C and V proteins of Nipah or measles viruses are co-regulated, since they are transcribed from the same gene transcription unit; in addition, their roles are complementary, since together they inhibit both viral RNA synthesis and type I IFN induction, enabling an efficient block of the first stage of the host antiviral response [15, 17, 20, 24, 152]. In the same vein, the expression of C and P is also co-regulated and they have complementary effects on viral transcription, mediated by binding the same cellular protein, SHCBP1 [115].

In conclusion, we predict that the C proteins of the Sendai group and of the measles/Nipah groups will have the same structural fold, testifying to a common origin, and that this fold will be a previously unobserved one, in keeping with their de novo origin [68].

File S1 Multiple sequence alignment of the C proteins of the measles, Nipah, and Sendai groups.
File S2 Multiple sequence alignment of the P/C genes of the measles and Nipah groups, based on an alignment of the C proteins (DOC)

A summary of taxonomic changes recently approved by ICTV Paramyxoviridae: the viruses and their replication Measles virus P gene codes for two proteins Sendai virus contains overlapping genes expressed from a single mRNA Determination of the henipavirus phosphoprotein gene mRNA editing frequencies and detection of the C, V and W proteins of Nipah virus in virus-infected cells Ribosomal initiation from an ACG codon in the Sendai virus P/C mRNA The parainfluenza virus type 1 P/C gene uses a very efficient GUG codon to start its C' protein Sendai virus Y proteins are initiated by a ribosomal shunt Proteolytic processing and translation initiation: two independent mechanisms for the expression of the Sendai virus Y proteins The N-terminal domain of the phosphoprotein of Morbilliviruses belongs to the natively unfolded class of proteins Structural disorder within Henipavirus nucleoprotein and phosphoprotein: from predictions to experimental assessment Inhibition of RNA synthesis following proteolytic cleavage of Newcastle disease virus P protein Functions of Sendai virus nucleocapsid polypeptides: enzymatic activities in nucleocapsids following cleavage of polypeptide P by Staphylococcus aureus protease V8 Detecting remote sequence homology in disordered proteins: discovery of conserved motifs in the N-termini of Mononegavirales phosphoproteins Distinct and overlapping roles of nipah virus p gene products in modulating the human endothelial cell antiviral response Sendai virus C protein plays a role in restricting PKR activation by limiting the generation of intracellular double-stranded RNA Measles virus circumvents the host interferon response by different actions of the C and V proteins Translational inhibition and increased interferon induction in cells infected with C proteindeficient measles virus The regulation of type I interferon
production by paramyxoviruses The C, V and W proteins of Nipah virus inhibit minigenome replication Identification of naturally occurring amino acid variations that affect the ability of the measles virus C protein to regulate genome replication and transcription The Sendai virus nonstructural C proteins specifically inhibit viral mRNA synthesis The Sendai paramyxovirus accessory C proteins inhibit viral genome amplification in a promoter-specific fashion Mutations in the measles virus C protein that up regulate viral RNA synthesis Paramyxovirus evasion of innate immunity: Diverse strategies for common targets Innate immune response to viral infection Antagonism of innate immunity by paramyxovirus accessory proteins Mechanisms of protein kinase PKR-mediated amplification of beta interferon induction by C protein-deficient measles virus Measles virus C protein interferes with Beta interferon transcription in the nucleus The rinderpest virus non-structural C protein blocks the induction of type 1 interferon The C protein of measles virus inhibits the type I interferon response Inhibition of interferon induction and signaling by paramyxoviruses Regulation of interferon signaling by the C and V proteins from attenuated and wild-type strains of measles virus Nonstructural Nipah virus C protein regulates both the early host proinflammatory response and viral virulence The nonstructural proteins of Nipah virus play a key role in pathogenicity in experimentally infected animals Newcastle disease virus (NDV)-based assay demonstrates interferon-antagonist activity for the NDV V protein and the Nipah virus V, W, and C proteins The emergence of Nipah virus, a highly pathogenic paramyxovirus The C proteins of human parainfluenza virus type 1 limit doublestranded RNA accumulation that would otherwise trigger activation of MDA5 and protein kinase R Conserved charged amino acids within Sendai virus C protein play multiple roles in the evasion of innate immune responses Mutations 
within the human parainfluenza virus type 3 (HPIV 3) C protein affect viral replication and host interferon induction The C proteins of human parainfluenza virus type 1 block IFN signaling by binding and retaining Stat1 in perinuclear aggregates at the late endosome The C proteins of human parainfluenza virus type 1 (HPIV1) control the transcription of a broad array of cellular genes that would otherwise respond to HPIV1 infection Attenuating mutations in the P/C gene of human parainfluenza virus type 1 (HPIV1) vaccine candidates abrogate the inhibition of both induction and signaling of type I interferon (IFN) by wild-type HPIV1 Inhibition of STAT 1 phosphorylation by human parainfluenza virus type 3 C protein C and V proteins of Sendai virus target signaling pathways leading to IRF-3 activation for the negative regulation of interferon-beta production Characterization of the amino acid residues of sendai virus C protein that are critically involved in its interferon antagonism and RNA synthesis downregulation The STAT2 activation process is a crucial target of Sendai virus C protein for the blockade of alpha interferon signaling The amino-terminal extensions of the longer Sendai virus C proteins modulate pY701-Stat1 and bulk Stat1 levels independently of interferon signaling All four Sendai Virus C proteins bind Stat1, but only the larger forms also induce its mono-ubiquitination and degradation Longer and shorter forms of Sendai virus C proteins play different roles in modulating the cellular antiviral response Knockout of the Sendai virus C gene eliminates the viral ability to prevent the interferonalpha/beta-mediated responses Sendai virus C proteins counteract the interferon-mediated induction of an antiviral state Human parainfluenza virus type 1 C proteins are nonessential proteins that inhibit the host interferon and apoptotic responses and are required for efficient replication in nonhuman primates Sendai virus C proteins regulate viral genome and 
antigenome synthesis to dictate the negative genome polarity Measles virus V protein blocks Jak1-mediated phosphorylation of STAT1 to escape IFN-alpha/beta signaling A recombinant measles virus unable to antagonize STAT1 function cannot control inflammation and is attenuated in rhesus monkeys Two Domains of the V Protein of Virulent Canine Distemper Virus Selectively Inhibit STAT1 and STAT2 Nuclear Import Nipah virus sequesters inactive STAT1 in the nucleus via a P gene-encoded mechanism Hendra virus V protein inhibits interferon signaling by preventing STAT1 and STAT2 nuclear accumulation Rinderpest virus blocks type I and type II interferon action: role of structural and nonstructural proteins Morbillivirus v proteins exhibit multiple mechanisms to block type 1 and type 2 interferon signalling pathways Different functions of the common P/V/W and V-specific domains of rinderpest virus V protein in blocking interferon signalling The origin of new genes: glimpses from the young and old Duplication and divergence: the evolution of new genes and old ideas Origins of genes: ''big bang'' or continuous creation? 
Identification of an overprinting gene in Merkel cell polyomavirus provides evolutionary insight into the birth of viral genes Evolution of viral proteins originated de novo by overprinting Overlapping genes produce proteins with unusual sequence properties and offer insight into de novo protein creation Virus counterdefense: diverse strategies for evading the RNA-silencing immunity The N-terminus of Bunyamwera orthobunyavirus NSs protein is essential for interferon antagonism Size selective recognition of siRNA by an RNA silencing suppressor The crystal structure of ORF-9b, a lipid binding protein from the SARS coronavirus On the origin and highly likely completeness of single-domain protein structures Further Evidence for the Likely Completeness of the Library of Solved Single Domain Protein Structures Evolution of overlapping genes Degeneracy of the information contained in amino acid sequences: evidence from overlaid genes Constrained evolution with respect to gene overlap of hepatitis B virus Simultaneous positive and purifying selection on overlapping reading frames of the tat and vpr genes of simian immunodeficiency virus Immune-induced evolutionary selection focused on a single reading frame in overlapping hepatitis B virus proteins Stability and evolution of overlapping genes The effect of gene overlapping on the rate of RNA virus evolution FFAS server: novel features and applications Genetic Analysis of Paramyxovirus Isolates from Pacific Salmon Reveals Two Independently Co-circulating Lineages Isolation of a new virus from chinook salmon (Oncorhynchus tshawytscha T-Coffee: a web server for the multiple sequence alignment of protein and RNA sequences using structural information and homology extension Using the T-Coffee package to build multiple sequence alignments of protein, RNA, DNA sequences and 3D structures Jalview Version 2-a multiple sequence alignment editor and analysis workbench Visualization of multiple alignments, phylogenies and gene family 
evolution TranslatorX: multiple alignment of nucleotide sequences guided by amino acid translations MUSCLE: a multiple sequence alignment method with reduced time and space complexity The Jpred 3 secondary structure prediction server PROMALS web server for accurate multiple protein sequence alignments Prediction of disordered regions in proteins based on the meta approach A practical overview of protein disorder prediction methods The MPI Bioinformatics Toolkit for protein sequence analysis Improved Detection of Remote Homologues Using Cascade PSI-BLAST: Influence of Neighbouring Protein Families on Sequence Coverage Sequence context-specific profiles for homology searching HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment Powerful sequence similarity search methods and in-depth manual analyses can identify remote homologs in many apparently ''orphan'' viral proteins A versatile ligation-independent cloning method suitable for high-throughput expression screening applications Protein production by auto-induction in high density shaking cultures Computed circular dichroism spectra for the evaluation of protein conformation Isolation and partial characterization of a novel paramyxovirus from the gills of diseased seawater-reared Atlantic salmon (Salmo salar L) Reassessing conflicting evolutionary histories of the Paramyxoviridae and the origins of respiroviruses with Bayesian multigene phylogenies Phylogenetic position of a paramyxovirus from Atlantic salmon Salmo Salar The P gene of Newcastle disease virus does not encode an accessory X protein Complete genome sequence of Fer-de-Lance virus reveals a novel gene in reptilian paramyxoviruses Protein sequence comparison and fold recognition: progress and good-practice benchmarking Sequence comparison and protein structure prediction Targeting of the Sendai virus C protein to the plasma membrane via a peptide-only membrane anchor Tyrosine 110 in the measles virus phosphoprotein is 
required to block STAT1 phosphorylation Dissection of measles virus V protein in relation to its ability to block alpha/beta interferon signal transduction The phosphoprotein of attenuated measles AIK-C vaccine strain contributes to its temperature-sensitive phenotype Measles virus non-structural C protein modulates viral RNA polymerase activity by interacting with a host protein SHCBP1 The C, V and W proteins of Nipah virus inhibit minigenome replication The aminoterminal half of Sendai virus C protein is not responsible for either counteracting the antiviral action of interferons or down-regulating viral RNA synthesis Sendai virus wild-type and mutant C proteins show a direct correlation between L polymerase binding and inhibition of viral RNA synthesis Differential regulation of type I interferon and epidermal growth factor pathways by a human Respirovirus virulence factor Clustered Basic Amino Acids of the Small Sendai Virus C Protein Y1 Are Critical to Its Ran GTPase-Mediated Nuclear Localization Domain within the C protein of human parainfluenza virus type 3 that regulates interferon signaling N-terminally truncated C protein, CNDelta25, of human parainfluenza virus type 3 is a potent inhibitor of viral replication Aberrant mobility phenomena of the DNA repair protein XPA How to study proteins by circular dichroism Intracellular processing of the Sendai virus C' protein leads to the generation of a Y protein module: structurefunctional implications Hydrodynamic radii of native and denatured proteins measured by pulse field gradient NMR techniques Importance of the anti-interferon capacity of Sendai virus C protein for pathogenicity in mice Rinderpest virus C and V proteins interact with the major (L) component of the viral polymerase An antiinterferon activity shared by paramyxovirus C proteins: Inhibition of Toll-like receptor 7/9-dependent alpha interferon induction Molecular evolution of the Paramyxoviridae and Rhabdoviridae multiple-protein-encoding P 
gene Viral proteins originated de novo by overprinting can be identified by codon usage: application to the ''gene nursery'' of deltaretroviruses The Sendai virus P gene expresses both an essential protein and an inhibitor of RNA synthesis by shuffling modules via mRNA editing Sendai virus C proteins are categorically nonessential gene products but silencing their expression severely impairs viral replication and pathogenesis The nonstructural C protein is not essential for multiplication of Edmonston B strain measles virus in cultured cells Structural disorder in proteins of the rhabdoviridae replication complex An N-Terminal Domain of the Sendai Paramyxovirus P-Protein Acts as a Chaperone for the Np Protein during the Nascent Chain Assembly Step of Genome Replication Interaction of vesicular stomatitis virus P and N proteins: identification of two overlapping domains at the N terminus of P that are involved in N0-P complex formation and encapsidation of viral genome RNA Rabies virus chaperone: Identification of the phosphoprotein peptide that keeps nucleoprotein soluble and free from non-specific RNA Domains of Rinderpest virus phosphoprotein involved in interaction with itself and the nucleocapsid protein Nipah virus V and W proteins have a common STAT1-binding domain yet inhibit STAT1 activation from the cytoplasmic and nuclear compartments, respectively The measles virus phosphoprotein interacts with the linker domain of STAT1 Conserved and nonconserved regions in the Sendai virus genome: evolution of a gene possessing overlapping reading frames Sequence analysis of Potato leafroll virus isolates reveals genetic stability, major evolutionary events and differential selection pressure between overlapping reading frame products Overlapping reading frames in closely related human papillomaviruses result in modular rates of selection within E2 Origin and evolution of overlapping genes in the family Microviridae Influence of overlapping genes on the evolution of 
human hepatitis B virus Overlapping messages and survivability Belshaw R (2010) Viral mutation rates Computational evolutionary analysis of the overlapped surface (S) and polymerase (P) region in hepatitis B virus indicates the spacer domain in P is crucial for survival Overlapping structure of hepatitis B virus (HBV) genome and immune selection pressure are critical forces modulating HBV evolution An experimental and computational evolution-based method to study a mode of co-evolution of overlapping open reading frames in the AAV2 viral genome Role of V protein RNA binding in inhibition of measles virus minigenome replication Introducing point and deletion mutations into the P/C gene of human parainfluenza virus type 1 (HPIV1) by reverse genetics generates attenuated and efficacious vaccine candidates A point mutation in the sendai virus accessory C proteins attenuates virulence for mice, but not virus growth in cell culture Mutations in the C, D, and V open reading frames of human parainfluenza virus type 3 attenuate replication in rodents and primates The C protein of wild-type measles virus has the ability to shuttle between the nucleus and the cytoplasm Human parainfluenza virus type I (HPIV1) vaccine candidates designed by reverse genetics are attenuated and efficacious in African green monkeys

We thank B Bankamp, JM Bourhis, P Devaux, M Jamin, R Neme and A Vianelli for comments on the manuscript. We thank the OPPF-UK for help with expression of the C proteins, and the organizers of the EMBO training "High-throughput methods for protein production and crystallization".

Disclaimer: The findings and conclusions in this report are those of the authors and do not necessarily represent the views of the Centers for Disease Control and Prevention.