key: cord-0043463-v3uf0785 authors: Zhirnov, O. P. title: Unique Bipolar Gene Architecture in the RNA Genome of Influenza A Virus date: 2020-03-22 journal: Biochemistry (Mosc) DOI: 10.1134/s0006297920030141 sha: 86357038b6fa96e207ec20f4177e79b2e87bc2a0 doc_id: 43463 cord_uid: v3uf0785

The genome of influenza A virus consists of eight single-stranded negative-polarity RNA segments. The eighth segment (NS) encodes the anti-interferon protein NS1 (27 kDa) and the nuclear export protein NEP (14 kDa) via the classic negative-sense strategy. It also contains an additional positive-sense open reading frame that can be directly translated into the negative strand protein 8 (NSP8; 18–25 kDa in different strains). The existence of three or more genes of the opposite polarity in the same locus of a single-stranded RNA appears to be a unique (“economical”) type of gene architecture in living organisms. In silico analysis of genomes of human and animal influenza A viruses revealed that the NSP8 gene emerged in the influenza A virus population about 100 years ago (“young” gene) and is evolutionarily highly variable. The experimental data obtained suggest that the NSP8 gene is expressed in infected animals, which strengthens the concept of a bipolar (ambisense) strategy of the influenza A virus genome. The high variability of the NSP8 protein suggests that the “young” NSP8 gene is in the process of functional optimization. Further accumulation of mutations may alter the functions of the mature NSP8 protein and lead to the emergence of a mature bipolar influenza A virus with unexpected properties that would be threatening for humans and animals.

Influenza A virus is enveloped with a lipid membrane (enveloped virus). Its genome consists of eight single-stranded RNA segments ranging from 0.9·10³ to 3.2·10³ nucleotides. The segments have negative polarity, i.e., they cannot be directly translated by the ribosomes.
Instead, they are transcribed by the viral polymerase into mRNAs that are then translated into viral proteins (Fig. 1, a and b). Transcription of some viral RNA segments is associated with mRNA splicing, while translation can start at different AUG codons and may be accompanied by a frameshift. As a result, 16 unique viral proteins are expressed via the negative-sense mechanism [1, 2].

NS is the smallest influenza virus genome segment. It encodes two viral proteins: the non-structural protein 1 (NS1), an interferon antagonist that counteracts the activity of the interferon system, and the nuclear export protein (NEP), which regulates nuclear export of viral ribonucleoproteins in infected cells. The mRNA for the NEP protein undergoes splicing that results in a frameshift relative to the NS1 gene (Fig. 1a).

ZHIRNOV BIOCHEMISTRY (Moscow) Vol. 85 No. 3 2020

In most human influenza A virus strains, the NS segment has an extended positive-strand open reading frame (ORF) (Fig. 1a) [3-9] that was identified as a gene encoding the NSP8 protein (negative strand protein; segment 8) [4, 5]. The presence in the viral genomic RNA of an extended reading frame containing AUG initiation and UGA termination codons, together with a structured 5′-terminal sequence that includes IRES-like motifs [10], suggests that the NS segment RNA may serve as a template and be translated into the extended NSP8 protein. It should be noted that the lack of stop codons in the NSP8 gene cannot be explained by a mere structure-based impossibility of their emergence via coupled termination of the NS1 and/or NEP genes. This is corroborated by the presence of multiple stop codons in the NSP8 gene that do not affect the NS1 and NEP genes in many avian and animal influenza A viruses [5], and by stop codons in the NSP8 gene of the pandemic human influenza A virus H1N1pdm09 (see below).
Moreover, the genes in the NS segment are colocalized in such a way that the third nucleotide in the codons of the NSP8 frame corresponds to the second and third nucleotides in the codons of the NS1 and NEP reading frames, respectively (Fig. 1a), which allows synonymous substitutions at position 3 in the codons of the NSP8 gene without affecting the NS1 gene.

The ORF for the human influenza A virus NSP8 gene emerged around 100 years ago and may be considered a relatively new evolutionary trait. This observation is based on a phylogenetic comparison of RNA sequences of the NS segments from available human and animal influenza A viruses using the multimetric algorithm (see Supplement to Zhirnov et al. [5]).

Still, the function of NSP8 remains unknown. Preliminary studies have demonstrated that baculoviral expression of the NSP8 gene in insect cells produced the full-size gene product, which localized to the perinuclear zone [11]. In silico analysis revealed no complete homology between NSP8 and other proteins available in the databases and demonstrated a partial similarity of its amino acid sequence to the Q motif in the DEAD-box RNA helicase from the parasitic plant fungus Ceraceosorus bombacis [12], a fragment of the protease domain in class C19 hydrolases (including ubiquitin hydrolase from the white button mushroom Agaricus bisporus var. burnettii [13]), and the transmembrane domain of the voltage-gated calcium ion channel from the unicellular eukaryote Salpingoeca rosetta [14] (Table 1 and Fig. 2). It cannot be ruled out that NSP8 has multiple functions. Its structural resemblance to other proteins suggests that NSP8 may either play the same role as these proteins, interfere with their functions, or inhibit their activity in vivo. Further studies are necessary to uncover the functions of this new class of negative strand gene products.
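The notion of an extended open reading frame bounded by an AUG initiation codon and a stop codon, as described above for the NS segment, can be illustrated with a minimal Python sketch. It scans the three reading frames of one RNA strand, read 5′→3′, and reports every AUG-to-stop stretch. The sequence `toy_genomic_rna`, the function names, and the `min_codons` cutoff are hypothetical and chosen only for illustration; this is not the procedure used in the cited in silico analyses.

```python
# Illustrative sketch only: the sequence is invented, not a real NS segment.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_orfs(rna, min_codons=2):
    """Return (start, end, frame) for each AUG...stop ORF on the given strand.

    start is the 0-based index of the A of AUG, end is the index just past
    the stop codon, and frame is 0, 1, or 2. Only the strand passed in is
    scanned; pass the reverse complement to scan the opposite strand.
    """
    rna = rna.upper().replace("T", "U")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if start is None and codon == "AUG":
                start = i  # first AUG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3  # coding codons, stop excluded
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next AUG in this frame
    return orfs


def longest_orf(rna):
    """Return the longest ORF found by find_orfs, or None if there is none."""
    orfs = find_orfs(rna)
    return max(orfs, key=lambda o: o[1] - o[0]) if orfs else None


toy_genomic_rna = "GGAUGGCUCCAAAAUGAGGCAUGUAACC"  # hypothetical sequence
print(find_orfs(toy_genomic_rna))
print(longest_orf(toy_genomic_rna))
```

Applied to a real genomic-strand sequence of segment 8, the same scan would list candidate positive-sense frames; whether any of them is translated is an experimental question that the sketch does not address.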
The discovery of a new gene in influenza A viruses has raised a number of important questions regarding its origin, functions, and evolutionary variability. One of the essential questions is how the new positive-sense NSP8 gene emerged in a genomic region encoding two negative-sense genes. The appearance of the ambipolar gene suggests the existence of a yet unknown correspondence principle (or reverse determination rule) for the expression of ambipolar genes residing in the same region of an RNA molecule. Such a principle might imply that a pre-existing gene can predetermine the emergence mechanism and the properties of a new ambipolar gene. In the absence of a determination mechanism, one would have to presume that chaotic accumulation of mutations resulted in the appearance of a new functional gene and its further evolutionary selection. However, the probability of such an event is low, considering the ambipolar overlapping of several pre-existing genes, in which changes in one gene would cause changes in the coupled ambipolar genes. In this case, gene variability and selection of mutations should be interconnected in all three viral genes (NS1, NEP, and NSP8).

It is possible that other RNA segments of the influenza A virus genome also contain ambisense genes, as suggested by the presence of extended ORFs in segments PB1, PB2, PA, NP, and M (Fig. 1b). The protein products of these ORFs have not yet been identified in biological systems; however, in silico analysis indicates that the proteins of the positive-sense genes have the properties of transmembrane ion channels [4, 7]. The occurrence of ambisense genes in multiple RNA segments implies that a universal mechanism might exist in human and animal influenza A viruses that allows both positive- and negative-sense genome strategies to be implemented in the target cells.
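The coupling described above, where a change in one gene also changes the overlapping gene of opposite polarity, can be made concrete with a small Python sketch. It translates a toy RNA in one frame of the genomic strand and one frame of the complementary strand, then reports how a single substitution alters each product. The sequence, the frame offsets, and the function names are hypothetical; in particular, the relative phasing of the two frames here is arbitrary and is not meant to reproduce the phasing of the NS1, NEP, and NSP8 frames in the real NS segment.

```python
# Illustrative sketch only: toy sequence and arbitrary frame phasing.
from itertools import product

BASES = "UCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")  # standard genetic code
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("AUCG", "UAGC")


def reverse_complement(rna):
    """Return the complementary strand, written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def translate(rna, offset=0):
    """Translate complete codons starting at `offset`; '*' marks a stop."""
    return "".join(CODON_TABLE[rna[i:i + 3]]
                   for i in range(offset, len(rna) - 2, 3))


def mutation_effects(rna, pos, new_base, genomic_offset=0, comp_offset=0):
    """Return (before, after) translations for both strands for one substitution."""
    mutant = rna[:pos] + new_base + rna[pos + 1:]
    return {
        "genomic": (translate(rna, genomic_offset),
                    translate(mutant, genomic_offset)),
        "complementary": (translate(reverse_complement(rna), comp_offset),
                          translate(reverse_complement(mutant), comp_offset)),
    }


toy_rna = "AUGGCUCAU"  # hypothetical genomic-strand fragment
print(mutation_effects(toy_rna, 5, "C"))  # silent in one frame only
print(mutation_effects(toy_rna, 4, "A"))  # alters both products
```

In this toy phasing, the first substitution is synonymous in the genomic-strand frame but changes the complementary-strand product, while the second changes both, which is the kind of interdependence the text attributes to overlapping ambipolar genes.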
Another important issue is related to the high variability of the NSP proteins in human influenza A viruses. Our data suggest that the variability of NSP is similar to that of the most variable glycoproteins, such as hemagglutinin (HA) and neuraminidase (NA), which are located at the virion surface and represent major target molecules for the antiviral factors of the host adaptive immunity (Table 2). The negative-sense NS1 and NEP genes in the NS segment are less variable [17]. There are two possible reasons that may account for this phenomenon. First, the NSP8 protein could be exposed on the surface of either virions or infected cells (similar to HA and NA) and thus be targeted by the factors of the host innate and adaptive immunity, which promotes its variability. Second, the NSP protein, as a newly emerged viral product, may be at the stage of optimization of its expression level and functions toward its viral or cell partners, which might explain its adaptation to these partners.

Table 1 (fragment). Function of domain in the compared protein***:
protection of proteins from proteasomal degradation via cleaving off ubiquitin;
transmembrane Ca²⁺ ion transport, regulation of apoptosis, autophagy, and carcinogenesis;
ATP binding and hydrolysis in active RNA helicase.

* Positions of amino acid residues in the homology region in NSP8. The total number of amino acid residues in NSP8 is shown in parentheses.
** Positions of amino acid residues in the homology region in the compared protein. The total number of amino acid residues in the compared protein is shown in parentheses.
*** The similarity was calculated based on the paired comparison between protein amino acid sequences with the UniProt software (Fig. 2). The expectation value (E) threshold is a statistical measure of the number of expected matches in a random database; id. and pos., percentage of identical and positive amino acid residues, respectively.
[...] (Table 3), which is in good agreement with the concept of the elevated variability of the NSP8 gene and the impact of factor(s) of positive selection (adaptation) involved in its evolution [18]. Accumulation of mutations in the course of evolution may result in a qualitative leap in the functional activity of the virus, leading to the appearance of a novel subtype of influenza virus with unexpected and unpredictable properties that could pose a threat to both humans and animals. Studying the new gene and deciphering the functions of the newly emerged NSP8 protein will help to neutralize such a threat, either by specific preventive vaccination or by the development of pharmaceutical inhibitors against this protein or its gene.

The evolutionary dynamics of the NS segment from the human influenza virus H1N1pdm09, which caused the 2009 influenza pandemic in Mexico, correlates with the predicted adaptive variability of the virus [19]. This strain was a triple reassortant of human, swine, and avian viruses and spread to humans from a natural reservoir. It carried the avian NS segment, which lacked the full-size NSP gene and contained two stop codons in the NSP gene body. However, after 10 years of evolution in the human population, it lost one of the stop codons, resulting in the sequence extension to 240 bp. If this virus continues to circulate in the human population over the next decade, it may lose the remaining stop codon, so that the full-size NSP will appear and may acquire unexpected and life-threatening properties upon further optimization.

Evolutionary optimization of the NSP8 gene may also affect its regulatory motifs. In particular, over more than fifty years of evolution, human influenza A virus subtype H3N2 has accumulated a significant number of AUG codons in its 5′ start region: from one identified in 1968 to four observed at present [5].
An increased number of AUG codons in the translation initiation region can facilitate recognition of viral RNA by translation initiation factors and ribosomes, resulting in the upregulation of its translation in infected cells. Moreover, an increase in the number of start codons in the viral RNA template may contribute to an increased rate of defective translation of the NSP8 gene, including translation initiation at alternative AUG codons that results in the emergence of defective ribosomal products (DRiPs), which may be involved in the downregulation of the MHC I-dependent antiviral immune response [20]. However, considering that expression in the baculoviral system [11] and translation in vitro [10] resulted in the synthesis of the full-size stable NSP8 protein, defective translation of the NSP8 gene seems unlikely to be the dominant mechanism of its expression.

[Figure caption fragment: "... Table 1 (the numbers correspond to the amino acid positions in the protein)."]

The uniqueness of the architecture of the bipolar genes deserves special attention. The existence of three or more ambisense genes in the same region of a single-stranded RNA molecule is a unique type of gene architecture that ensures the "saving" of genetic information at the RNA level and its efficient use in the synthesis of different protein products, and provides the highest protein diversity per unit of genetic information in the RNA genome. The bipolar gene architecture in the NS segment of the influenza A virus differs from the organization of ambisense genes in the RNA genomes of viruses infecting animals (phleboviruses, tospoviruses, and arenaviruses) and plants (tenuiviruses) [21]. Unlike the aforementioned viruses, in which two ambipolar genes are located in different regions of the RNA molecule and do not overlap, human influenza A viruses contain three overlapping ambipolar genes (the so-called stacking) within the same RNA region (Fig. 1, a and b).
When bipolar genes do not overlap, their architecture is not directly coupled with their variability, and they can change independently of each other.

Although the protein product of the NSP gene has not yet been identified in biological systems, e.g., virus-infected cell lines or animal models, the data obtained support the possibility of expression of positive-sense viral RNA genes. In particular, the NSP8 protein can be synthesized in the in vitro translation system by mammalian ribosomes on the full-size viral RNA template [10]. Specific immune lymphocytes against this protein or its components have been found in infected animals [17, 22, 23]. The properties of NSP8 as an "invisible protein" may be accounted for by its low synthesis and/or accumulation levels, high lability, short half-life, and possible tissue-specific expression of the NSP8 gene in certain cell types containing factors necessary for the regulation of the positive-strand NSP gene expression. Further studies might help to elucidate these issues.

Finally, the discovery of NSP8 might call for a revision of the Orthomyxoviridae classification and the creation of a separate genus of ambipolar influenza viruses (Ambisense Alfainfluenzavirus). However, this issue will remain open until the protein product of the NSP8 gene is unambiguously identified in biological settings. Meanwhile, the fact that the full-size NSP8 gene has been retained for more than a hundred years of evolution in the population of human influenza viruses points to its functional importance in the as yet putative group of ambipolar influenza viruses. It should be noted that no stop codons altering the NSP8 gene have been identified despite the pronounced evolutionary variability of the NSP8 protein, further reinforcing the idea of the biological determination of this gene.
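The statements about in-frame stop codons (their absence from the human NSP8 frame and the loss of one of two stop codons in the H1N1pdm09 NS segment) describe a check that is straightforward to express in code. The Python sketch below lists the stop codons in a chosen reading frame and measures how far the frame stays open, before and after a stop codon is removed by a point substitution. The toy sequence and all function names are hypothetical; no real H1N1pdm09 data are used, and the lengths produced have no relation to the values reported in the text.

```python
# Illustrative sketch only: invented sequence, hypothetical helper names.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def in_frame_stops(rna, start=0):
    """Return 0-based positions of stop codons in the frame opening at `start`."""
    return [i for i in range(start, len(rna) - 2, 3)
            if rna[i:i + 3] in STOP_CODONS]


def open_length(rna, start=0):
    """Return the number of nucleotides readable before the first in-frame stop.

    If the frame contains no stop codon, the length of all complete codons
    from `start` to the end of the sequence is returned.
    """
    stops = in_frame_stops(rna, start)
    end = stops[0] if stops else start + 3 * ((len(rna) - start) // 3)
    return end - start


def substitute(rna, pos, new_base):
    """Return a copy of `rna` with the base at `pos` replaced."""
    return rna[:pos] + new_base + rna[pos + 1:]


toy_frame = "AUGGCUUGACCAUAAGGC"           # two in-frame stops: UGA and UAA
one_stop_lost = substitute(toy_frame, 6, "C")       # UGA -> CGA
both_stops_lost = substitute(one_stop_lost, 12, "C")  # UAA -> CAA

for label, seq in [("two stops", toy_frame),
                   ("one stop lost", one_stop_lost),
                   ("both stops lost", both_stops_lost)]:
    print(label, in_frame_stops(seq), open_length(seq))
```

Running the same check over a set of dated isolates would show when a given stop codon disappears from the frame; the choice of frame start and of isolates is left to the analyst.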
Influenza A virus cell entry, replication, virion assembly and movement
Molecular mechanisms enhancing the proteome of influenza A viruses: an overview of recently discovered proteins
Nucleotide sequence of the influenza A/duck/Alberta/60/76 virus NS RNA: conservation of the NS1/NS2 overlapping gene structure in a divergent influenza virus RNA segment
Segment NS of influenza A virus contains an additional gene NSP in positive-sense orientation
Structural and evolutionary characteristics of HA, NA, NS and M genes of clinical influenza A/H3N2 viruses passaged in human and canine cells
Evidence for a novel gene associated with human influenza A viruses
Computational analysis and mapping of novel open reading frames in influenza A viruses
Uncovering the potential pan-proteomes encoded by genomic strand RNAs of influenza A viruses
Is there a twelfth protein-coding gene in the genome of influenza A? A selection-based approach to the detection of overlapping genes in closely related sequences
Negative-sense virion RNA of segment 8 (NS) of influenza A virus is able to translate in vitro a new viral protein
Integration of influenza A virus gene NSP into baculovirus and its expression in insect cells
Two new genera of leaf-parasitic fungi (Basidiomycetidae: Brachybasidiaceae)
The Q motif: a newly identified motif in DEAD-box helicases may regulate ATP binding and hydrolysis
Cloning and enzymatic analysis of 22 novel human ubiquitin-specific proteases
Modulation of Ca²⁺ signaling by anti-apoptotic B-cell lymphoma 2 proteins at the endoplasmic reticulum-mitochondrial interface
T-type calcium channels in cancer
Cellular immune response in infected mice to NSP protein encoded by the negative strand NS RNA of influenza A virus
Statistical methods for detecting molecular adaptation
The 2009 pandemic influenza virus: where did it come from, where is it now, and where is it going?
Flu DRiPs in MHC class I immunosurveillance
Expression strategies of ambisense viruses
Genome-wide characterization of a viral cytotoxic T lymphocyte epitope repertoire
Influenza A virus negative-strand RNA is translated for CD8⁺ T cell immunosurveillance