key: cord-0002217-ukzpcmhs authors: Wang, Jun; Zhu, Zheng; Zhang, Lei; Hou, Dianhai; Wang, Manli; Arif, Basil; Kou, Zheng; Wang, Hualin; Deng, Fei; Hu, Zhihong title: Genome Sequencing and Analysis of Catopsilia pomona nucleopolyhedrovirus: A Distinct Species in Group I Alphabaculovirus date: 2016-05-11 journal: PLoS One DOI: 10.1371/journal.pone.0155134 sha: e6e00bfca850f42e1b2baa9ce190d37554f4a42e doc_id: 2217 cord_uid: ukzpcmhs

The genome sequence of Catopsilia pomona nucleopolyhedrovirus (CapoNPV) was determined by the Roche 454 sequencing system. The genome consisted of 128,058 bp and had an overall G+C content of 40%. There were 130 hypothetical open reading frames (ORFs) potentially encoding proteins of more than 50 amino acids and covering 92% of the genome. Among all the hypothetical ORFs, 37 baculovirus core genes, 23 lepidopteran baculovirus conserved genes and 10 genes conserved in Group I alphabaculoviruses were identified. In addition, the genome included 8 regions of typical baculoviral homologous repeat sequences (hrs). Phylogenetic analysis showed that CapoNPV was in a distinct branch of clade "a" in Group I alphabaculoviruses. Gene parity plot analysis and overall similarity of ORFs indicated that CapoNPV is more closely related to the Group I alphabaculoviruses than to other baculoviruses. Interestingly, CapoNPV lacks the genes encoding the fibroblast growth factor (fgf) and ac30, which are conserved in most lepidopteran and Group I baculoviruses, respectively. Sequence analysis of the F-like protein of CapoNPV showed that some amino acids were inserted into the fusion peptide region and the pre-transmembrane region of the protein. All these unique features imply that CapoNPV represents a member of a new baculovirus species.

Members of the family Baculoviridae are rod-shaped, insect-specific viruses with large, circular, double-stranded DNA genomes of 80-180 kb [1, 2].
Lepidopteran baculoviruses synthesize two progeny phenotypes, the budded virus (BV) and the occlusion-derived virus (ODV). Virus particles of the latter phenotype are embedded into occlusion bodies (OBs) [3], which offer some protection against inactivating environmental conditions such as UV light, heat and desiccation. Baculoviridae contains four genera: Alphabaculovirus [nucleopolyhedroviruses (NPVs) of lepidopteran insects], Betabaculovirus [granuloviruses (GVs) of Lepidoptera], Gammabaculovirus (NPVs of Hymenoptera) and Deltabaculovirus (NPVs of Diptera) [4, 5]. The alphabaculoviruses can be further divided into Group I and Group II, based on phylogenetic analysis and their membrane fusion proteins. Group I viruses use GP64 as the fusion protein, while Group II viruses use the F protein instead [6] [7] [8]. Phylogenetic analysis suggested that Group I viruses fall into two clades, "a" and "b" [9]. Despite the diversity in gene content of baculovirus genomes, 37 genes have been identified as core genes that are present in all sequenced baculoviral genomes and play very important roles in the viral replication cycle [10]. In addition, there are 23 genes conserved in all sequenced lepidopteran baculoviruses (NPVs and GVs) and 11 that are specific to Group I [10] [11] [12] [13].

Catopsilia pomona (Lepidoptera: Pieridae) is distributed in Asia and Australia. In Mainland China, it is present mainly in the provinces of Hainan, Guangdong, Guangxi, Yunnan, and Fujian. It is harmful to Kassod tree, Wing-podded Senna, golden shower, pockwood and other tropical plants [14]. Larvae feed on young leaves and, during outbreaks, trees are completely stripped of foliage. In Hainan Province, the insect has 13-14 generations a year, causing damage all year round [15]. CapoNPV was isolated from dead Catopsilia pomona larvae in Hainan in 1990 [15].
So far, 78 baculoviruses have been fully sequenced, including 19 Group I alphabaculoviruses, 35 Group II alphabaculoviruses, 20 betabaculoviruses, 3 gammabaculoviruses and 1 deltabaculovirus (http://www.ncbi.nlm.nih.gov/genomes/GenomesGroup.cgi?taxid=10442, and S1 Table). In this study, the complete genomic sequence of CapoNPV was determined and analyzed. Phylogenetic analysis suggested that this virus is a distinct species in Group I Alphabaculovirus.

The complete nucleotide sequence of CapoNPV genomic DNA was determined using the 454 pyrosequencing method. The sequences were assembled using the Roche GS De Novo Assembler version 2.7. The genome was covered 350 times by 123,698 reads. It is 128,058 bp in length, has a G+C content of 40% and contains 130 predicted ORFs (S2 Table). The adenine residue of the translation initiation codon of polyhedrin, with a forward orientation, was designated as the zero point on the circular genome map. Sixty-nine ORFs were in a clockwise direction and 61 in a counterclockwise direction with respect to the transcriptional orientation of polyhedrin. The 37 core genes (red), 23 lepidopteran baculovirus conserved genes (blue) and 10 Group I specific genes (green) are illustrated on the genome map (Fig 1). Another 56 baculoviral genes and 4 hypothetical CapoNPV unique genes are shown as grey and open arrows, respectively (Fig 1).

A phylogenetic tree built from the concatenated 37 core genes of 79 sequenced baculoviruses (S1 Table) classified CapoNPV into clade "a" of Group I (Fig 2). It is located on a distinct branch of the clade "a" alphabaculoviruses, which is consistent with a previous phylogenetic analysis based on polyhedrin/granulin, lef-8 and lef-9 [9]. CapoNPV appeared to have diverged shortly after the separation of clades "a" and "b" and may be closer to an ancestral virus than most species in the two clades. This situation is similar to that of the newly sequenced Cyclophragma undans nucleopolyhedrovirus (CyunNPV) (data not shown).
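The assembly and ORF figures reported above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis: the function and variable names are hypothetical, and the mean read length is only what the reported coverage, read count and genome size jointly imply, assuming every read contributes fully to the assembly.

```python
"""Consistency checks on the reported CapoNPV assembly figures (illustrative only)."""

# Values taken from the text above.
GENOME_LENGTH_BP = 128_058   # genome size
READ_COUNT = 123_698         # number of 454 reads
REPORTED_COVERAGE = 350      # fold coverage

# ORF tallies from the text; the dictionary keys are hypothetical labels.
ORFS_BY_STRAND = {"clockwise": 69, "counterclockwise": 61}
ORFS_BY_CLASS = {
    "core": 37,
    "lepidopteran_conserved": 23,
    "group_I_specific": 10,
    "other_baculoviral": 56,
    "capo_unique": 4,
}


def implied_mean_read_length(genome_bp: int, reads: int, coverage: float) -> float:
    """Mean read length implied by coverage = reads * read_length / genome size."""
    return coverage * genome_bp / reads


def gc_content(sequence: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence (case-insensitive)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    mean_len = implied_mean_read_length(GENOME_LENGTH_BP, READ_COUNT, REPORTED_COVERAGE)
    print(f"Implied mean read length: {mean_len:.1f} bp")
    print(f"ORFs by strand: {sum(ORFS_BY_STRAND.values())}")
    print(f"ORFs by class:  {sum(ORFS_BY_CLASS.values())}")
```

Under these assumptions the implied mean read length is roughly 362 bp, and both ORF tallies sum to the 130 predicted ORFs. Applied to the deposited genome sequence, `gc_content` would be expected to return a value close to the reported 40%.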
CapoNPV genes were compared to homologues in 7 other well characterized baculoviruses: Autographa californica MNPV (AcMNPV, belonging to Group I, clade "a"), Orgyia pseudotsugata MNPV (OpMNPV, Group I, clade "b"), Helicoverpa armigera NPV (HearNPV, Group II), Spodoptera exigua MNPV (SeMNPV, Group II), Cydia pomonella GV (CpGV, a betabaculovirus), Neodiprion lecontei NPV (NeleNPV, a gammabaculovirus) and Culex nigripalpus NPV (CuniNPV, a deltabaculovirus) (S2 Table). Most of the CapoNPV genes shared nucleotide (nt) identities lower than 63% with those of the alphabaculoviruses and lower than 35% with those of the beta-, gamma- and deltabaculoviruses (S2 Table). Gene order of CapoNPV was compared to the above baculovirus genomes using gene parity plots [16]. Although CapoNPV is a distinct species in Group I, its gene order is substantially collinear with representatives of Group I alphabaculoviruses and partially collinear with those from Group II alphabaculoviruses. However, its gene arrangement was significantly different from that of gamma- and deltabaculoviruses (Fig 3). A collinearly conserved region of lepidopteran baculoviruses was also found in CapoNPV between capo43 and capo75 (Fig 1). It contains 20 core genes and five additional lepidopteran baculovirus conserved genes, and also includes two Group I specific genes, ac73 (capo69) and ac72 (capo70), and six other genes, ac91 (capo58), cg30 (capo57), ac87 (capo58), ac79 (capo63), ac74 (capo68) and iap-2 (capo71) (Fig 1).

Homologous repeated sequences (hrs) of baculoviruses consist of a number of repeated sequences with an imperfect palindrome, interspersed at different locations in a genome. Hrs are highly variable, and although they are closely similar within the same genome, they may show very limited homology among different viruses. Sixty-four of the 79 completely sequenced baculoviral genomes contain 2-17 hrs (S1 Table). Previous studies suggested that hrs may act as origins of DNA replication [17, 18].
However, deletion of individual hrs from the AcMNPV genome does not appear to affect genome replication [19]. The hrs also acted as enhancers of gene expression and appeared to up-regulate the expression of the AcMNPV immediate early gene-1 (ie-1) [20] [21] [22]. The locations and the sequences of the 8 CapoNPV hrs are summarized in Figs 1 and 4, respectively.

CapoNPV contains 12 replication associated genes, 12 transcription associated genes, 8 genes essential for oral infection, 34 structure related genes and 15 auxiliary genes (Table 1). The remaining 45 genes are of unknown function, including 4 hypothetical genes unique to CapoNPV.

CapoNPV lacks the fibroblast growth factor gene (fgf). FGF plays an important role in developmental processes affecting cell growth, differentiation, and motility and is one of the conserved proteins in vertebrates and invertebrates [23]. Lepidopteran baculoviruses also encode fgf, and it was previously found to be conserved in all lepidopteran baculoviruses [9] except Maruca vitrata nucleopolyhedrovirus (MaviNPV) [12]. Although deletion of fgf from AcMNPV had no effect on replication in tissue culture cells, bioassays showed that the time of death in larvae was delayed [24]. It has been suggested that FGF may play a role in dissemination of the virus within the host insect [25]. Recent evidence suggests that FGF initiates a cascade of events that may accelerate the establishment of systemic infections [26]. In our study, fgf was not found in the CapoNPV genome.

CapoNPV lacks ac30, a gene specific to Group I. In a previous report, 11 genes (gp64, tyrosine phosphatase gene (ptp), ie2, odv-e26, ac5, ac30, ac73, ac72, ac114, ac124, ac132) were identified as specific to Group I viruses and absent from all other baculoviruses [13]. These genes might have had an evolutionary role in the emergence of Group I viruses [13, 27]. Notably absent from CapoNPV is a homologue to ac30.
This gene seems to be nonessential because its deletion did not affect the production of BmNPV [28]. Interestingly, CyunNPV, a member of Group I, also lacks ac30 (data not shown).

CapoNPV lacks lef-7, a gene involved in DNA replication. lef-7 had stimulatory effects on transient DNA replication [29]. It is present in all previously identified Group I viruses, several Group II viruses and many betabaculoviruses. Deletion of lef-7 from AcMNPV had no impact on virus infection in Tn368 cells, but in SF21 and SE1c cells viral DNA replication was reduced to only 10% of that of the wild-type virus [30], suggesting that the function of LEF7 is host dependent. lef-7 was also found to be involved in the regulation of the DNA damage response (DDR). Deletion of lef-7 from the AcMNPV genome caused activation of the DDR, and the yield of progeny infectious virus decreased by about 99% [31]. CapoNPV is the first reported Group I virus that does not contain a lef-7 gene.

CapoNPV lacks ODV-E66, a structural protein of ODV involved in oral infection. ODV-E66 was identified as a component of ODV envelopes [32]. AcMNPV ODV-E66 was shown to have chondroitinase activity [33] and its crystal structure was determined [34]. It was suggested that ODV-E66 may function in midgut infection by degrading the peritrophic membrane, which contains a low level of chondroitin sulfate [33]. In fact, deletion of odv-e66 in AcMNPV increased the oral infection dose about 1000 times but did not change the infectivity of BV, suggesting that ODV-E66 is an important oral infectivity factor [35]. Odv-e66 is present in most alphabaculoviruses and betabaculoviruses; however, it was not found in the CapoNPV genome.

A characteristic feature of Group I viruses is the presence of GP64 and the loss of the fusion function of F. Except for gammabaculoviruses and Group I viruses, the F protein functions as the envelope fusion protein of BV.
In AcMNPV, the F-like protein is also associated with BV membranes and its deletion from the genome results in infectious virus with titers similar to the parental virus in cell cultures, but the time to kill larvae is somewhat extended [36]. Previous studies showed the importance of the furin cleavage site in the fusion process. Furin protease digests F into two components, a small N-terminus membrane-anchored F2 and a large domain F1 at the C-terminus. Both are needed for viral-host membrane fusion [7, 37]. The F-like protein in Group I viruses lacks the furin cleavage site and has therefore lost its fusion function. Instead, GP64 functions as an efficient envelope fusion protein [38] [39] [40]. In our study of the F-like protein in CapoNPV, an insertion was found in the region equivalent to the fusion peptide (Fig 5). We also found that another stretch of amino acids is inserted ahead of the pre-transmembrane domain (pre-TM) of CapoNPV (Fig 5). The pre-TM domain, which is rich in aromatic amino acids, sometimes plays an important role in membrane fusion [41] [42] [43] [44]. Similar insertions into the fusion peptide region and the pre-TM were also found in CyunNPV (data not shown). According to the phylogeny (Fig 2), CapoNPV diverged relatively earlier than other Group I alphabaculoviruses. Thysanoplusia orichalcea nucleopolyhedrovirus (ThorNPV), another relatively early member of Group I (Fig 2), also has an insertion at the fusion peptide region (Fig 5). The change in viral fusion ability mediated by the presence of GP64 and the inactivation of F are considered critical events in the origination of Group I [13]. Our results provide new evidence for understanding the process of F inactivation and, therefore, the early evolutionary events of Group I alphabaculoviruses.

CapoNPV-infected Catopsilia pomona larvae have been preserved in the "Chinese General Virus Collection Center" (CGVCC) with collection number IVCAS 1.0228.
OBs were purified from homogenized larvae by differential centrifugation [46] and DNA was extracted as described previously [47]. The genome of CapoNPV was sequenced with the Roche 454 GS FLX+ system by using a shotgun strategy. The determined nucleotide sequences were assembled with GS De Novo Assembler software version 2.7. The complete genome sequence and annotation information were submitted to GenBank (accession number: KU565883).

Putative ORFs were analyzed using the FGENESV0 program (http://www.softberry.com/berry.phtml) [48] and the NCBI ORF Finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html). ORFs potentially encoding more than 50 amino acids were designated as putative genes with minimal overlaps. Gene parity plot analysis was performed as previously described [17, 49]. The Tandem Repeats Finder (http://tandem.bu.edu/trf/trf.html) was used to locate hrs. Gene annotation and comparisons were done with the aid of the NCBI BLAST algorithm (http://blast.ncbi.nlm.nih.gov/Blast.cgi).

A phylogenetic tree was generated based on the amino acid sequences encoded by the 37 core genes from CapoNPV and those of the other 79 reference genome sequences of baculoviruses in NCBI (S1 Table). All the sequences were joined together in the same order and the alignments were generated using the MUSCLE method of MEGA6 with default settings. A phylogenetic tree was constructed by MEGA6 using the Maximum Likelihood method based on the JTT matrix-based model [50]. The phylogeny was tested by the bootstrap method with 1000 replicates [51].

Fig 5. The amino acid alignment of F and F-like proteins. The alignment was performed using the ClustalW method. A schematic figure of SeMNPV F protein was adapted from a previous publication [45] and is shown at the bottom, and two enlarged regions with sequence alignments are also shown. Viral names and categories are on the left. The predicted regions of furin cleavage site, fusion peptide, pre-TM and transmembrane domains are indicated below the alignment.
The red square shows the aromatic amino acids (F, Y, W and H) in the pre-TM region. The arrows point to the insertion regions in CapoNPV. Black background shows greater than 80% identity among compared regions; dark gray and light gray show greater than 50% and 30% identity, respectively. doi:10.1371/journal.pone.0155134.g005

Supporting Information S1

1. Sequence and organization of the Neodiprion lecontei nucleopolyhedrovirus genome
2. Sequence analysis of the Xestia c-nigrum granulovirus genome
3. The pathway of infection of Autographa californica nuclear polyhedrosis virus in an insect host
4. On the classification and nomenclature of baculoviruses: a proposal for revision
5. Ratification vote on taxonomic proposals to the International Committee on Taxonomy of Viruses
6. Host cell receptor binding by baculovirus GP64 and kinetics of virion entry
7. A novel baculovirus envelope fusion protein with a proprotein convertase cleavage site
8. The GP64 envelope fusion protein is an essential baculovirus protein required for cell-to-cell transmission of infection
9. Molecular identification and phylogenetic analysis of baculoviruses from Lepidoptera
10. The ac53, ac78, ac101, and ac103 genes are newly discovered core genes in the family Baculoviridae
11. The genome sequence and evolution of baculoviruses
12. Genomic and host range studies of Maruca vitrata nucleopolyhedrovirus
13. Evidence of a major role of GP64 in group I alphabaculovirus evolution
14. Catopsilia pomona seriously harm to plants of the genus Cassia. Plant Protection
15. Preliminary Study on Catopsilia pomana NPV
16. Distinct gene arrangement in the Buzura suppressaria single-nucleocapsid nucleopolyhedrovirus genome. The Journal of General Virology
17. Identification of seven putative origins of Autographa californica multiple nucleocapsid nuclear polyhedrosis virus DNA replication. The Journal of General Virology
18. The origins of replication of granuloviruses
19. No single homologous repeat region is essential for DNA replication of the baculovirus Autographa californica multiple nucleopolyhedrovirus. The Journal of General Virology
20. Complete Sequence and Enhancer Function of the Homologous DNA Regions of Autographa californica Nuclear Polyhedrosis Virus
21. The baculovirus transactivator IE1 binds to viral enhancer elements in the absence of insect cell factors
22. Transcriptional enhancer activity of hr5 requires dual-palindrome half sites that mediate binding of a dimeric form of the baculovirus transregulator IE1
23. Stimulation of cell motility by a viral fibroblast growth factor homolog: proposal for a role in viral pathogenesis
24. Analysis of a baculovirus lacking a functional viral fibroblast growth factor homolog
25. The Autographa californica M nucleopolyhedrovirus fibroblast growth factor accelerates host mortality
26. Viral fibroblast growth factor, matrix metalloproteases, and caspases are associated with enhancing systemic infection by baculoviruses
27. Use of whole genome sequence data to infer baculovirus phylogeny
28. Open reading frame Bm21 of Bombyx mori nucleopolyhedrovirus is not essential for virus replication in vitro, but its deletion extends the median survival time of infected larvae. The Journal of General Virology
29. The roles of eighteen baculovirus late expression factor genes in transcription and DNA replication
30. Differential Infectivity of Two Autographa californica Nucleopolyhedrovirus Mutants on Three Permissive Cell Lines Is the Result of lef-7 Deletion
31. Baculovirus F-Box Protein LEF-7 Modifies the Host DNA Damage Response To Enhance Virus Multiplication
32. Transcription, Translation, and Cellular Localization of PDV-E66: A Structural Protein of the PDV Envelope of Autographa californica Nuclear Polyhedrosis Virus
33. Baculovirus Envelope Protein ODV-E66 Is a Novel Chondroitinase with Distinct Substrate Specificity
34. Crystallization and X-ray diffraction analysis of chondroitin lyase from baculovirus: envelope protein ODV-E66
35. Autographa californica multiple nucleopolyhedrovirus odv-e66 is an essential gene required for oral infectivity
36. Ac23, an envelope fusion protein homolog in the baculovirus Autographa californica multicapsid nucleopolyhedrovirus, is a viral pathogenicity factor
37. Furin is involved in baculovirus envelope fusion protein activation
38. Monoclonal antibodies to baculovirus structural proteins: determination of specificities by Western blot analysis
39. Identification and sequence analysis of a gene encoding gp67, an abundant envelope glycoprotein of the baculovirus Autographa californica nuclear polyhedrosis virus
40. The 64K envelope protein of budded Autographa californica nuclear polyhedrosis virus. Current Topics in Microbiology and Immunology
41. The pre-transmembrane domain of the Autographa californica multicapsid nucleopolyhedrovirus GP64 protein is critical for membrane fusion and virus infectivity
42. Pre-transmembrane sequence of Ebola glycoprotein. Interfacial hydrophobicity distribution and interaction with membranes
43. Interaction of a peptide from the pretransmembrane domain of the severe acute respiratory syndrome coronavirus spike protein with phospholipid membranes. The Journal of Physical Chemistry B
44. Interfacial pre-transmembrane domains in viral proteins promoting membrane fusion and fission
45. Functional analysis of the putative fusion domain of the baculovirus envelope fusion protein F
46. The research of Cyclophragma undans nucleopolyhedrovirus. Journal of Central South Forestry Institute
47. Genetic organization of the HindIII-I region of the single-nucleocapsid nucleopolyhedrovirus of Buzura suppressaria. Virus Research
48. INFOGENE: a database of known gene structures and predicted genes and proteins in sequences of genome sequencing projects
49. Genome sequence and analysis of Buzura suppressaria nucleopolyhedrovirus: a group II Alphabaculovirus
50. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0
51. Improved bootstrap confidence limits in large-scale phylogenies, with an example from Neo-Astragalus (Leguminosae)

The authors thank the 454 services from the core facility center of Wuhan Institute of Virology. This work was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDB11030400) to ZH, and by grants from the National Science Foundation of China (No. 31321001 and 31130058) to ZH.