key: cord-0009924-92xa54hf authors: Koonin, Eugene V.; Gorbalenya, Alexander E. title: An insect picornavirus may have genome organization similar to that of caliciviruses date: 1992-02-03 journal: FEBS Lett DOI: 10.1016/0014-5793(92)80332-b sha: 5b7c0cc1edbdaea9d8a616df8db5a0a40393d1e1 doc_id: 9924 cord_uid: 92xa54hf

Computer-assisted analysis of the amino acid sequence of the product encoded by the sequenced 3′ portion of the genome of cricket paralysis virus (CrPV), an insect picornavirus, showed that this protein is homologous not to the RNA-directed RNA polymerases, as originally suggested, but to the capsid proteins of mammalian picornaviruses. Alignment of the CrPV protein sequence with those of picornavirus and calicivirus capsid proteins demonstrated that the sequenced portion of the insect picornavirus genome encodes the C-terminal part of VP3 and the entire VP1. Thus CrPV seems to have a genome organization distinct from that of other picornaviruses but closely resembling that of caliciviruses, with the capsid proteins encoded in the 3′ part of the genome. On the other hand, the tentative phylogenetic trees generated from the VP3 alignment revealed grouping of CrPV with hepatitis A virus, a true picornavirus, not with caliciviruses. Thus CrPV may be a picornavirus with a calicivirus-like genome organization. Different options for CrPV genome expression are discussed.

CrPV is classified as an insect picornavirus (reviewed by Moore et al. [1]). CrPV has a number of picornavirus-like characteristics, i.e. icosahedral virions of about 27 nm in diameter, S20,w of 167 S and buoyant density of 1.34 in CsCl, 3 species of capsid proteins with Mr of about 30 kDa, and single-stranded (ss) virion RNA approximately 8,000 nucleotides in length. The RNA is polyadenylated at its 3' end and apparently contains a covalently linked protein (VPg) at its 5' end [2].
Also, it has been shown that the three capsid proteins of CrPV are generated by proteolytic processing of a precursor polypeptide, in line with the picornavirus expression strategy [3]. The sequence of the 1600 3'-terminal nucleotides of CrPV genomic RNA has been reported [2]. It has been speculated that, similarly to mammalian picornaviruses, this portion of the CrPV genome might encode the RNA-dependent RNA polymerase, and counterparts to the three most conserved sequence motifs of positive strand RNA virus polymerases [4, 5] have been tentatively identified [2]. Surprisingly, however, no significant overall sequence similarity could be detected between the hypothetical CrPV polymerase and those of mammalian picornaviruses (or any other positive strand RNA viruses). Here we report results of detailed amino acid sequence comparisons indicating that the polypeptide product of the 3'-terminal part of the CrPV genome is in fact homologous to the capsid proteins of picornaviruses, not to the RNA polymerases.

The sequences were from the Swissprot database, except for FcCV [6] and RHDV [7]. Comparison of query amino acid sequences with the Swissprot data bank (Release 18) was performed using the program Quick of the Genebee program package for biopolymer sequence analysis [8]. Local similarity plots for pairs of sequences were generated using the program Dothelix of the Genebee package [9]. Alignments were produced using the program OPTAL as previously described, with the adjusted alignment scores computed as the number of standard deviations (SD) above the mean of 25 random simulations [10]. Tentative phylogenetic trees were generated either by a clustering procedure (UPGMA algorithm [11]) or by the maximum topological similarity algorithm [12].
[Fig. 1: alignment of the CrPV sequence with picornavirus and calicivirus capsid proteins]

The amino acid sequence of the product encoded by the 3' part of the CrPV genome was compared to all sequences in the Swissprot bank. As perhaps could have been anticipated, the sequence most similar to that of CrPV belonged to the polyprotein of a picornavirus, HAV. Surprisingly, however, this sequence was not from the polymerase region of the HAV polyprotein, but rather comprised a 30 residue segment of the capsid protein VP3. The next highest scores were with the homologous sequences of other picornaviruses. To further explore the nature of the product encoded by the 3' part of the CrPV genome, we generated plots of local similarity between its sequence and picornavirus capsid proteins. This led to delineation of additional regions of significant similarity in picornavirus VP3 and VP1 (not shown), suggesting that the CrPV sequence might correspond to the C-terminal half of VP3 and the entire VP1. Indeed, aligning the entire published CrPV protein sequence with the respective portion of the HAV polyprotein yielded an alignment with the convincing score of 10.1 SD. Using this alignment, the CrPV sequence was fitted into the modified alignment of picornavirus capsid proteins based on the results of X-ray analysis of the virions of four virus species [14, 15]. Conservation between CrPV and mammalian picornaviruses was much more pronounced in the (putative) VP3 than in VP1; this was not surprising as VP3 is the least diverged of the picornavirus capsid proteins [14]. Inspection of this alignment revealed reasonable sequence conservation in the segments of the putative CrPV capsid proteins aligning with the known structural elements of picornavirus proteins (Fig. 1). In particular, all β-strands in the C-terminal half of the (putative) VP3 proteins and the strands D, E and G in VP1 could be identified more or less confidently.
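
The adjusted alignment score quoted above (10.1 SD) is the number of standard deviations by which the observed score exceeds the mean score of random simulations. The following Python sketch illustrates that kind of statistic under stated assumptions: it does not reproduce the OPTAL program, the toy `identity_score` function stands in for a real alignment score, the random simulations are modelled by shuffling the residues of the second sequence, and the names `identity_score` and `adjusted_score` are hypothetical.

```python
import random
import statistics


def identity_score(a, b):
    """Toy score (an assumption, not OPTAL): identical residues at equal positions."""
    return sum(1 for x, y in zip(a, b) if x == y)


def adjusted_score(seq_a, seq_b, n_sim=25, seed=0, score=identity_score):
    """Return (observed - mean of random scores) / SD of random scores.

    Random scores come from n_sim shuffles of seq_b, which preserves its
    composition; this shuffling scheme is an assumption of the sketch.
    """
    rng = random.Random(seed)
    observed = score(seq_a, seq_b)
    random_scores = []
    for _ in range(n_sim):
        residues = list(seq_b)
        rng.shuffle(residues)
        random_scores.append(score(seq_a, "".join(residues)))
    mean = statistics.mean(random_scores)
    sd = statistics.stdev(random_scores)
    if sd == 0:
        raise ValueError("random scores have zero variance; SD score undefined")
    return (observed - mean) / sd


if __name__ == "__main__":
    # Two copies of the 20 standard amino acids, compared with itself.
    seq = "ACDEFGHIKLMNPQRSTVWY" * 2
    print(round(adjusted_score(seq, seq), 1))
```

A score of several SD means the observed similarity is very unlikely to arise from composition alone, which is the sense in which the 10.1 SD alignment with HAV is called convincing.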
Recently, the sequences of the capsid proteins of two caliciviruses have been reported [6, 7, 16], and a region of similarity with picornavirus VP3 has been delineated in one of them [6, 16]. We added the calicivirus sequences to the picornavirus capsid protein alignment and found that all the sequence elements typical of VP3 indeed appeared to be conserved in the calicivirus proteins (Fig. 1). On the other hand, very little (if any) sequence similarity could be detected between the calicivirus capsid proteins and the aligned sequences of the picornavirus VP1s (not shown).

Identification of the VP3/VP1 cleavage site of CrPV did not readily arise from the alignment. Several dipeptides resembling picornavirus 3C protease cleavage sites could be found within approximately 30 residues around the probable boundary between the two proteins. Cleavage at each of them would result in a predicted VP1 size of about 240-270 amino acid residues, in good agreement both with the typical values for mammalian picornaviruses (274-302 residues, with the exception of aphthoviruses, which have large deletions in VP1) and with the electrophoretic estimates for CrPV. In common with HAV, CrPV probably bears an extension at the N-terminus of the putative VP1. On the other hand, it appears to lack the C-terminal disordered region found in the VP1s of all mammalian picornaviruses (Fig. 1). In the latter, this region appears to constitute the hinge between the structural and non-structural domains of the polyprotein. Thus its absence in CrPV may be related to the alternative genome organization of this virus.

[Fig. 2 legend: The branch lengths are proportional to the evolutionary rates of the respective species.]

Serological cross-reaction between CrPV and encephalomyocarditis virus has been reported [17].
Although the present observations do not allow one to delineate the common antigenic determinant, they show that conserved segments do exist in the (putative) capsid proteins of these viruses, one of them probably conferring the cross-reactivity.

The resulting alignment of VP3, which was the best conserved portion of the picornavirus capsid protein alignment, and the only sequence showing significant conservation between calicivirus and picornavirus capsid proteins, was used to generate tentative phylogenetic trees using two different methods. Fig. 2 shows the rootless tree produced by the maximum topological similarity algorithm, which has been designed specifically to overcome the effects of unequal evolutionary rates [12]. The tree split into three major subdivisions, one of which included the caliciviruses, the second HAV and CrPV, and the third all the remaining mammalian picornaviruses. The dendrogram generated by the clustering procedure separated HAV and CrPV as an outgroup, while grouping together caliciviruses and other picornaviruses (not shown). Phylogenetic analysis of viral RNA-dependent RNA polymerases and putative RNA helicases clearly indicated that caliciviruses are related to but distinct from picornaviruses (A.E.G., unpublished observations). Thus we believe that the result of cluster analysis of the VP3s was probably due to an anomalously high rate of evolution of HAV and CrPV.

We showed here that the sequenced portion of an insect picornavirus genome encodes putative viral capsid proteins. As the assignment of this sequence to the 3' part of the virus RNA by King et al. [2] seems to be quite solid, being confirmed by direct sequencing of a part of the poly(A) tail, this observation implies a genomic organization for CrPV that is basically different from that of picornaviruses. Apparently, CrPV encodes the proteins mediating genome replication and expression in its 5' part, and capsid proteins in the 3' part.
This gene arrangement obviously resembles that found for the caliciviruses [7, 16, 18], as well as for several other groups of positive-strand RNA viruses (reviewed by Strauss and Strauss [19], and Lai [20]). On the other hand, the results of phylogenetic analysis suggested an evolutionary relationship between CrPV and HAV, which is obviously a 'true' picornavirus. Also, the classification of CrPV as a picornavirus, not a calicivirus, is supported by the absence of substantial similarity between the VP1-related sequence of CrPV and the calicivirus capsid protein, and by the fact that CrPV produces three capsid protein species, like picornaviruses, and not one, like caliciviruses. This discrepancy raises the important 'tempo and mode' problem in virus evolution, suggesting that similarities in genome organization do not necessarily mirror the phylogenetic grouping of viruses [21]. Determination of the sequence of the portion of the CrPV genome encoding non-structural proteins will allow a better assessment of the phylogenetic position of this interesting virus.

There seem to exist three options for CrPV genome expression: (i) the capsid protein precursor is generated by translation of a subgenomic RNA; this expression strategy would be analogous to that exploited by caliciviruses [6, 7]; (ii) the capsid protein precursor constitutes the C-terminal portion of the single polyprotein product of the virus genome, resembling the expression strategy of potyviruses [22]; (iii) translation of the capsid protein precursor is initiated independently. The first hypothesis, though attractive because of the analogy with caliciviruses, seems to contradict the results of radiolabelling of virus-specific RNA in CrPV-infected cells, which has not revealed subgenomic species [3]. The second hypothesis, on the other hand, is not readily compatible with the pactamycin mapping data suggesting that the capsid proteins of CrPV are encoded immediately downstream of the ribosome entry site [23].
The third possibility is apparently consistent with all the available experimental data. Clearly, internal initiation of translation is not a typical expression strategy of eukaryotic positive strand RNA viruses. On the other hand, growing evidence shows that initiation of translation of picornavirus mRNAs occurs internally, and their 5'-non-coding sequences retain the ability to direct initiation when placed downstream of a 5' gene in a genetically engineered mRNA (for review see [24]). Moreover, an artificial dicistronic construct has been recently described, in which the EMC virus 5'-non-coding sequence was introduced between the poliovirus genes encoding capsid and nonstructural proteins; strikingly, such RNA produced infectious virus [25]. Thus it cannot be ruled out that CrPV is a natural 'monster' with a similar, 'mixed' procaryotic/eucaryotic expression strategy. Further experiments on CrPV genome structure and expression will show whether this rather bold speculation holds true, or whether this virus actually exploits one of the more conventional strategies, in which case some of the available data will require reevaluation.

Nucleic Acids Res. 12; Virus Res. 11; Principles of Numerical Taxonomy; Molecular Aspects of Picornavirus Infection and Detection; The Togaviridae and Flaviviridae; (1981) J. Gen. Virol. 55, 429-438; Workshop on the Regulation of Translation in Animal Virus-Infected Cells