key: cord-0941949-07u2kpe5
authors: Muñoz-Chimeno, M.; Forero, J. E.; Echevarría, J. M.; Muñoz-Bellido, J. L.; Vázquez-López, L.; Morago, L.; García-Galera, M. C.; Avellón, A.
title: Full coding hepatitis E virus genotype 3 genome amplification method
date: 2016-04-30
journal: Journal of Virological Methods
DOI: 10.1016/j.jviromet.2016.01.004
sha: 62994c0dd8622eb7c6515873761e4775ad5e7a0d
doc_id: 941949
cord_uid: 07u2kpe5

Abstract

Hepatitis E virus (HEV) genotype 3 produces zoonotic infection associated with the consumption of infected animals. HEV infections can become chronic in immunocompromised (IC) patients. The viral genome has three well-defined open reading frames (ORF1, ORF2 and ORF3) within which various domains and functions have been described. This paper (i) describes a new method of complete sequencing of the HEV coding region through overlapping PCR systems, (ii) establishes a consensus sequence and polymorphic positions (PP) for each domain, and (iii) analyzes the complete coding sequence of an IC patient. With regard to the consensus, a high percentage of PP was observed in the protease (PP=19%) and the X domain (PP=22%) within ORF1, the N-terminal region of the S domain (PP=22%) in ORF2, and the P1 (PP=35%) and P2 (PP=25%) domains in ORF3. In contrast, the ORF1 Y, ORF2 S, ORF2 M and ORF3 D1 domains were conserved in the reference sequences (0.40, 1, 0.70 and 0% of PP, respectively). The sequence from the IC patient had more mutations in the RdRp (D1235G, Q1242R, S1454T, V1480I, I1502V, K1511R, G1373V, E1442D, V1693M), the N-terminal region of the ORF2 S domain (F10L, S26T, G36S, S70P, A105V, I113V), the X domain (T938M, T856V, S898A) and the helicase (S1014N, S975T, Q1133K).

Hepatitis E virus (HEV) (Hepeviridae family) is one of a number of divergent isolates from humans and other animals (Song et al., 2014) whose taxonomic status is unresolved.
This infection is endemic in many parts of the US, Central America and parts of South America, Mediterranean regions of Europe, Africa and the Asia-Pacific region (Perez-Gracia et al., 2013). The clinical course of genotype 3 (G3) infection is usually self-limited in immunocompetent individuals (Rodriguez-Frias et al., 2012). In contrast, chronic and fulminant infections have been described in patients coinfected with human immunodeficiency virus (Dalton et al., 2011), with iatrogenic immunosuppression or with solid organ transplantation (Kamar et al., 2008), as have severe acute infections in pregnant women (Mateos Lindemann et al., 2010). In fact, 60% of HEV infections in immunocompromised patients after solid organ transplantation evolve to chronic infection without antiviral treatment (Fujiwara et al., 2014). HEV infection has been reported to be associated with allogeneic hematopoietic transplantation (Versluis et al., 2013), or with transplantation of solid organs such as the liver (Te et al., 2013), kidney (Passos et al., 2013; Moal et al., 2013) and heart (Koning et al., 2013). Cases of HEV transmission due to transfusions (Price, 2014) have also been noted. A case has recently been described from Germany in which a low HEV G3 viral load was transmitted by platelets to an immunocompromised patient who subsequently developed chronic hepatitis (Huzly et al., 2014).

The HEV genome is an ss+RNA of approximately 7.2 kb, including three partially overlapping open reading frames, from 5' to 3': ORF1, ORF2 and ORF3; these structures are thought to be important for HEV RNA replication. ORF1 encodes a polyprotein 1693 amino acids (AAs) long that has structural and functional motifs responsible for viral replication: methyltransferase, papain-like cysteine protease (PCP), helicase and RNA-dependent RNA polymerase (RdRp) (Ahmad et al., 2011), and a polyproline or hypervariable region (Holla et al., 2013).
ORF2 encodes a 660-AA protein which is the main component of the viral capsid (Zafrullah et al., 1999). The capsid is formed by the interaction of ORF2 protein dimers with the 5' end of the viral genome through its N-terminal region, which is rich in arginine (AA 101; 180 molecules of proORF2 per virus particle) (Xing et al., 2010). This protein has three linear domains: the S domain, which forms the capsid, and the M and P domains, which are involved in the interaction of the virus with the host cell (Perez-Gracia et al., 2014). A region homologous to alphaviruses (46 AAs long) of unknown function has also been described (Yarbough et al., 1991; Purdy et al., 1993; Rodriguez-Frias et al., 2012).

The ORF3 protein probably regulates the host cell environment through its interaction with various intracellular pathways. Firstly, it activates ERK by binding and inhibiting its phosphatase. Prolonged activation of ERK would generate a survival and proliferative cellular signal that can favor viral replication and extend the life of infected cells by attenuating the intrinsic pathway of death. Secondly, the innate immune response is downregulated by reducing acute phase protein expression and increasing secretion of α1-microglobulin with immunosuppressive capacity (Chandra et al., 2008). The ORF3 protein interacts with ORF2. This interaction is dependent upon the phosphorylation of serine 71, and suggests a role for ORF3 protein in regulating the assembly of HEV and the possible location of the virus capsid (Tyagi et al., 2002).

Reports of molecular analysis of HEV in clinical case samples are scarce and focus on specific genome regions (Lhomme et al., 2014b). The aim of this work is to analyze the complete coding genome of HEV genotype 3 in an immunocompromised patient with acute infection.
For this purpose, a new full coding genome amplification method is described, and a comprehensive analysis of polymorphic residues of a GenBank HEV complete sequence set is carried out.

RNA extracts from serum and feces of a patient with acute HEV genotype 3 hepatitis were used as a template for cDNA synthesis (hereafter, the "IC sample"). The patient had not recently traveled outside Spain and was immunocompromised after bone marrow transplantation. Clinical presentation was acute hepatitis with neither persistence nor evidence of any serological (anti-HEV IgM or IgG) response. After other virological causes of hepatitis were ruled out, HEV PCR was performed, with a positive result.

RNA was extracted manually. RNA was extracted from the stools with a QIAamp® Viral RNA Mini Kit (QIAGEN®, Hilden, Germany), which simplifies purification of viral RNA with a fast spin-column. Stool samples were homogenized and resuspended in PBS 1X (1.125 ml). RNA was extracted from serum samples (200 µl) with a High Pure Viral Nucleic Acid kit (Roche Diagnostics, Mannheim, Germany). The method uses chaotropic salt and glass fiber fleece in a high pure spin filter tube.

cDNA synthesis was done with a Transcriptor First Strand cDNA Synthesis Kit (Roche Diagnostics, Mannheim, Germany), adding 10 µl of RNA extract to a final volume of 20 µl of cDNA, following the manufacturer's recommendations.

To amplify the HEV genome coding region, 12 overlapping blocks of nested PCR systems were set up. An additional block was used with previously designed oligonucleotides (Avellón et al., 2015). Eighteen and 24 oligonucleotides were designed for the initial and nested reactions, respectively. Primer sequences and thermal conditions are summarized in Table 1. To design the primers, 70 complete HEV genotype 3 genome sequences were obtained (GenBank database, National Center for Biotechnology Information, Bethesda, MD, USA) and aligned (MEGA 5.0 software).
Primers were designed with the help of Seqman software (DNASTAR, Lasergene Inc., Madison, WI, USA). PCR Master Mix and RNase-free water (Promega, Madison, WI, USA) were used to perform first (from 5 µl of extracts) and nested (from 2.5 µl of first PCR amplification product) PCR reactions. Previously designed sense and antisense primers were used at a concentration of 0.5 µM. Amplification was carried out in a thermocycler (DNA Engine, MJ Research PTC-200) programmed as follows: 2 min at 94 °C (initial denaturation); 40 cycles of 1 min at 94 °C (denaturation), 1 min annealing (temperature of each system described in Table 1) and 1 min at 72 °C (elongation); 5 min at 72 °C (final elongation). PCR products were separated by electrophoresis on 2% agarose gels in Tris-borate-EDTA (TBE 10X) buffer, and stained (Biotium GelRed®, Hayward, CA, USA).

The amplification products were purified with a QIAquick PCR Purification Kit (QIAGEN®, Hilden, Germany). Sense and antisense DNA strands were both sequenced by the Sanger method. Sequence assembly and AA alignment were performed with the Seqman and MegAlign programs (DNASTAR, Lasergene Inc.). The sequence obtained (hereafter, the "IC sequence") was submitted to the NCBI database under accession number KU513561.
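The thermal program given above for the first and nested PCR rounds can be written down as data, which makes the hold time of one round easy to check. The following is only a sketch: the names `Step`, `build_program` and `hold_minutes` are hypothetical, the annealing temperature of 55 °C is a placeholder (the real value for each system is listed in Table 1), and instrument ramp times are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float    # block temperature in degrees Celsius
    minutes: float   # hold time in minutes


def build_program(annealing_temp_c, cycles=40):
    """Return the full list of holds for one PCR round as described in the text."""
    initial = [Step("initial denaturation", 94, 2)]
    cycle = [
        Step("denaturation", 94, 1),
        Step("annealing", annealing_temp_c, 1),  # system-specific, see Table 1
        Step("elongation", 72, 1),
    ]
    final = [Step("final elongation", 72, 5)]
    return initial + cycle * cycles + final


def hold_minutes(program):
    """Sum of programmed hold times, excluding ramping between temperatures."""
    return sum(step.minutes for step in program)


# Placeholder annealing temperature, NOT taken from the paper.
program = build_program(annealing_temp_c=55.0)
total = hold_minutes(program)  # 2 + 40 * (1 + 1 + 1) + 5 minutes
```

Under these assumptions each round holds for 2 + 40 × 3 + 5 = 127 minutes, so a first plus nested reaction takes a little over four hours of programmed time per block before ramping is counted.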
Sixty-two complete HEV genotype 3 sequences from the GenBank database were used as references (hereafter, the "reference sequences"): AB369691, AB369689, AB291951, AB369687, AB291953, AB291957, FJ653660, AB189071, AB437317, AB593690, JN837481, AB290312, FJ956757, AB301710, AB291955, AB291954, EU723516, AB189074, AB443626, AB189072, EU723513, AB189073, EU723515, AB189075, AB291952, AB291956, AB443627, AB443623, AB443624, FJ998008, AB362841, AB630971, AB189070, JQ679013, AB291961, HQ389544, AB362840, HQ709170, AB437316, FJ527832, HQ389543, AB291960, EU360977, JQ953664, AB740232, AB362842, JQ679014, AB362839, AB222184, AB437318, AB291962, AB246676, AB425830, AB248522, EU495148, AB630970, AB425831, AB248521, FJ705359, AB291963, AB222182 and AB222183.

The degree of conservation of viral proteins was determined by calculating the percentage of polymorphic positions (PPs) in each protein. A residue was considered to be in a PP when mutations were found in more than one reference sequence. The PP% was calculated as PP × 100/number of total amino acids in each protein. The degree of conservation was arbitrarily established as: high, PP% ≤ 1 vs. low, PP% ≥ 15.

Table 1. Oligonucleotides used for amplification of the complete HEV coding genome: initial reaction (1st R) and nested reaction (2nd R). Tm: annealing temperature. I: inosine.

The consensus sequence was obtained from the reference sequences as follows: the amino acid consensus sequence was established for each protein according to the most frequent amino acid in each position. The consensus sequence was used to determine the mutation pattern of the IC sample, excluding the polymorphic positions.

The PP% values of each protein are summarized in Tables 2 (ORF1) and 3 (ORF2 and ORF3). With regard to ORF1, the Y domain was highly conserved (PP% = 0.4), while the protease and X domain had a low level of conservation (PP% = 19 and 22, respectively). Protease catalytic site residues were Cys-Tyr and Cys-Cys in 53 and 9 sequences, respectively.
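The conservation figures reported in this section rest on three definitions from the methods: a consensus built from the most frequent amino acid at each position, a polymorphic position (PP) where more than one reference sequence differs from that consensus, and PP% = PP × 100 / protein length. A minimal Python sketch of those definitions follows. It is not the authors' implementation (the analysis was done with Seqman and MegAlign); the function names and the toy sequences are hypothetical, the input is assumed to be already aligned with equal lengths and no gaps, and ties in residue frequency are broken by first occurrence, which the paper does not specify.

```python
from collections import Counter


def consensus(seqs):
    """Most frequent residue at each aligned position (ties: first seen wins)."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def polymorphic_positions(seqs, cons):
    """1-based positions where more than one sequence differs from the consensus."""
    pps = []
    for pos, col in enumerate(zip(*seqs), start=1):
        n_diff = sum(1 for aa in col if aa != cons[pos - 1])
        if n_diff > 1:
            pps.append(pos)
    return pps


def pp_percent(seqs):
    """PP% = number of polymorphic positions x 100 / total residues."""
    cons = consensus(seqs)
    return 100 * len(polymorphic_positions(seqs, cons)) / len(cons)


def substitutions(cons, sample, exclude=()):
    """Substitutions such as 'A4G' (consensus, position, sample), skipping PPs."""
    skip = set(exclude)
    return [
        f"{c}{pos}{s}"
        for pos, (c, s) in enumerate(zip(cons, sample), start=1)
        if c != s and pos not in skip
    ]


# Toy data for illustration only, not real HEV sequences.
refs = ["MKTAV", "MKTAV", "MRTAV", "MRSAV", "MKTAV"]
cons = consensus(refs)                     # "MKTAV"
pps = polymorphic_positions(refs, cons)    # [2]: two references carry R
sample_subs = substitutions(cons, "MRTGV", exclude=pps)  # ["A4G"]
```

In the toy set, position 3 differs in only one reference and so is not counted as polymorphic, while the sample's change at position 2 is dropped because that position is a PP. Positions here are numbered within the slice passed in, so reproducing polyprotein numbering such as D1235G would require offsetting by the start of the domain.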
RdRp-conserved motifs III and IV presented several PPs. In ORF2, the C-terminal region of the S and M domains was highly conserved (PP% = 1 and 0.70, respectively) while the N-terminal region of the S domain had a low level of conservation (PP% = 22). Finally, analysis of ORF3 revealed an extremely high degree of conservation in the D1 domain (PP% = 0) while the other domains (D2, P1 and P2) presented low conservation levels (PP% = 16, 35 and 25, respectively). The polyproline region was the most variable region in the genome. A large number of reference sequences (n = 43) were 110 AAs long, but we found sequences with insertions (n = 8) and deletions (n = 11) in this region. The homologous alphavirus region harbored T16C and T19G mutations in 53% and 89% of reference sequences, respectively.

With regard to the IC sample, the sequences obtained from serum and feces were identical, so the results presented in the tables apply to the sequences obtained from both samples. We found more than one mutation in the X domain, helicase and RdRp of the ORF1 polyprotein (Table 2). The PCP catalytic site was Cys-Tyr, and the RdRp conserved domains I, II and V-VIII were fully conserved, as was the N-terminal region of the S domain of the ORF2 protein (Table 3). The IC sequence showed neither deletions nor insertions in the polyproline region. The homologous alphavirus region had T16C and T19G mutations.

In the nearly 40 years since the discovery of HEV, our knowledge of its genome and life cycle has grown only slowly. Despite its undeniable epidemiological and clinical significance, some technical issues, such as its adaptation to culture in vitro, have presented obstacles to researchers. Since Koonin and colleagues (Koonin et al., 1992) first described its genomic structure in 1992, there have been no significant advances in our understanding of the genome structure; the sequence published by this group is still used as a reference by many researchers.
Thus, our method should contribute to the knowledge of the HEV genome, as it enables direct and sensitive amplification of the complete coding genome from clinical samples, unlike most published methods, which have been standardized for use after HEV has been grown in cell culture (e.g. Tanaka et al., 2007). Our system would allow clinical samples (of serum or feces) to be used to analyze the complete sequence in human HEV infections of particular interest to public health; for instance, comparative genome analysis of chronic and acute infections could be done. Furthermore, it is important to have highly sensitive research tools to investigate HEV transmission by blood products, since this is known to be possible with low viral loads (Huzly et al., 2014). To date, the only complete HEV genome sequences available in databases from Spanish research groups correspond to viruses found in pigs (EU723513, EU723516 and EU723515) (Peralta et al., 2009). There are few reports in which the HEV genome has been analyzed in immunocompromised patients (Lhomme et al., 2014b).

A crucial role in the viral cycle has been proposed for the methyltransferase, PCP and helicase proteins (Holla et al., 2013). Our analysis of the reference sequences shows that methyltransferase, PCP and helicase present numerous polymorphic AA positions. Viral methyltransferase is essential for HEV replication and infectivity, and is required for efficient binding of mRNA to ribosomes as well as for avoiding the innate immune response mediated by interferon (Pichlmair et al., 2006). The heterogeneity of the second AA of the putative PCP catalytic site (Tyr/Cys), the imprecise function and limits of the region, which has been proposed but not confirmed to be a functional protease (Suppiah et al., 2011), and its low level of conservation together highlight the need for functional studies of this protein.
Conversely, conservation of the RdRp I-VIII motifs, including the GDD sequence that binds Mg2+ and is essential for replication, is consistent with its proposed key functional role (Koonin, 1991).

Two described domains are still not functionally characterized. Of these, the Y domain is highly conserved in our analysis, suggesting that an important function is located in this region, although little is known about it. In the case of the X or macro domain, although its function is unclear, it is analogous to domains of other, better-known viral genomes such as those of coronaviruses (Koonin et al., 1992). It may not be essential for replication, given its high tolerance to polymorphic AAs. The macro domain is known to be involved in the inflammation process in animal models, and influences viral pathogenicity by substitution of a conserved AA residue that is responsible for the reduced secretion of inflammatory cytokines. However, this function has not been described in the case of HEV (Lhomme et al., 2014b).

The proline-rich region was confirmed as being the most variable genomic region of the virus. A recent study showed the mutation rate of the proline-rich region to be similar to that of other ORF1 regions, but due to a higher substitution rate and a preference for cytosine in the first and second codon positions, there is a higher frequency of proline residues in this region (Holla et al., 2013). This region seems to be of special interest because of its potential relationship with the infectivity of HEV G3 in various animal species (Purdy et al., 2012), as well as its possible relationship with immune control and viral persistence (Lhomme et al., 2014b). Future studies of cases of chronic infection could provide useful information about the significance of this region.

The region that is homologous to the alphavirus and that contains the ORF1 stop codon and the ORF2 and ORF3 initial codons is highly conserved (Rodriguez-Frias et al., 2012).
Changes were observed in two nucleotides, neither of which was related to the aforementioned codons.

According to our results, the ORF2 S and M domains are conserved at their C-terminal end. The S domain is considered to be a globally conserved region among the different HEV genotypes but has greater divergence in the N-terminal region. The M domain strongly interacts with the S domain and connects to the P domain through a proline-rich region (Xing et al., 2010). In contrast, the ORF2 P domain presents greater variation, possibly in association with its putative function and presumed immune escape, as it has been related to cell entry and antibody neutralization and is a target for some of the vaccines currently under development (Ahmad et al., 2011; Li et al., 2009).

ORF3 partially overlaps with ORF2 but has a different reading frame (Rodriguez-Frias et al., 2012). It has been suggested that ORF3 might be an accessory protein of intracellular expression (Perez-Gracia et al., 2014) and that it probably affects the host response (Holla et al., 2013; Rodriguez-Frias et al., 2012). Its D1 domain is fully conserved in our analysis, in accordance with its function as a possible cell survival signal (Kar-Roy et al., 2004). The D2, P1 and P2 domains, which are related to the inhibition of innate host response, inflammatory response attenuation (Taneja et al., 2009), ORF2 interaction and virus release, harbor many polymorphic AAs.

Table 2 (fragment). Summarized function: mRNA ribosome binding, innate immunity (Holla et al., 2013); unknown; adaptation (Holla et al., 2013) and immune response modulation (Karpe and Meng, 2012); protease (Suppiah et al., 2011), deubiquitination and cellular immunity (Karpe and Meng, 2012); unknown; helicase (Holla et al., 2013); replication (Ahmad et al., 2011). Length, footnote (1): 110 AA in most of the sequences (n = 43); 114 AA (n = 2); 131 AA (n = 3); 172 AA (n = 3); 107 AA (n = 7); 108 AA (n = 3); 109 AA (n = 1).
The previously described PMSP motif of the P1 proline-rich region (Ahmad et al., 2011) occurs as PMS(F/Y) in the reference sequences, and in no case with P80, suggesting that a mistake has been made in assigning residues to this motif. Amino acid S79, whose phosphorylation determines the ORF3-ORF2 interaction (Tyagi et al., 2002), is fully conserved in all of them.

The analysis of the IC sequence indicates that the sequence from feces was fully representative of that from serum, making molecular analysis more feasible when serum samples are scarce. Methyltransferase and the Y domain showed no mutations with respect to the consensus. In contrast, PCP, the X domain, helicase and RdRp had 1, 3, 3 and 9 AA substitutions, respectively. Mutations accumulate predominantly in the intermediate portion of the RdRp, presumably outside the protein's active site, although the previously described III and IV motifs (Koonin, 1991) are affected. Conversely, the 3' region and the GDD functional motif, described as key for replication (Agrawal et al., 2001), are fully conserved. The data provided by Lhomme et al. (2014a) suggest that the proline-rich region has an important role in the persistence of the infection. The studied IC sequence harbors neither insertions nor deletions in this genome region, which may be related to the resolution of the infection. Future research on immunocompromised patients who become persistently infected would provide data on the molecular basis of HEV persistence.

ORF2 and ORF3 domains present no AA substitutions or a single one, with the exception of the N-terminal portion of the S domain, which accumulates six mutations. We hypothesize that these differences in the capsid protein might be related to morphological or functional features, but they must surely represent phylogenetic differences relative to the reference sequences in terms of sub-genotype.
Finally, we found a mutation in the IC sequence of the PXXP motifs of the P2 region associated with virus release (Pawson, 1995). Future directed mutation analysis might investigate the effect of this specific substitution on virus release.

We conclude that the designed tool allows the molecular analysis of complete HEV coding genome sequences from clinical samples. Our findings are consistent with some previously described features of the genome, although the discrepancies highlight the need for further functional studies. The value of this type of analysis will become clear once more samples have been completely sequenced and examined in conjunction with their associated clinical data and with the accumulated results from different types of infection.

(HEV) genome binds specifically to the viral RNA-dependent RNA polymerase
Molecular virology of hepatitis E virus
Molecular biology and pathogenesis of hepatitis E virus
Treatment of chronic hepatitis E in a patient with HIV infection
Chronic hepatitis E: a review of the literature
Molecular virology of hepatitis E virus
Hepatitis E virus-related cirrhosis in kidney- and kidney-pancreas-transplant recipients
The hepatitis E virus open reading frame 3 protein activates ERK through binding and inhibition of the MAPK phosphatase
Clinical implications of chronic hepatitis E virus infection in heart transplant recipients
The phylogeny of RNA-dependent RNA polymerases of positive-strand RNA viruses
Computer-assisted assignment of functional domains in the nonstructural polyprotein of hepatitis E virus: delineation of an additional group of positive-strand RNA plant and animal viruses
Characterization of the polyproline region of the hepatitis E virus in immunocompromised patients
Influence of polyproline region and macro domain genetic heterogeneity on HEV persistence in immunocompromised patients
Dimerization of hepatitis E virus capsid protein E2s domain is essential for virus-host interaction
Hepatitis E virus: a non-enveloped member of the 'alpha-like' RNA virus supergroup?
Infection with hepatitis E virus in kidney transplant recipients in southeastern France
First report and molecular characterization of hepatitis E virus infection in renal transplant recipients in Brazil
Protein modules and signalling networks
Genetic characterization of the complete coding regions of genotype 3 hepatitis E virus isolated from Spanish swine herds
Hepatitis E: current status
Hepatitis E: an emerging disease
RIG-I-mediated antiviral responses to single-stranded RNA bearing 5'-phosphates
An update on hepatitis B, D, and E viruses
The hepatitis E virus polyproline region is involved in viral adaptation
Molecular organization and replication of hepatitis E virus (HEV)
Hepatitis E: molecular virology, epidemiology and pathogenesis
Hepatitis E virus infections in humans and animals
Lack of processing of the expressed ORF1 gene product of hepatitis E virus
Development and evaluation of an efficient cell-culture system for Hepatitis E virus
Plasma and urine biomarkers in acute viral hepatitis E
Hepatitis E virus infection in a liver transplant recipient in the United States: a case report
The phosphorylated form of the ORF3 protein of hepatitis E virus interacts with its non-glycosylated form of the major capsid protein, ORF2
Hepatitis E virus: an underestimated opportunistic pathogen in recipients of allogeneic hematopoietic stem cell transplantation
Structure of hepatitis E virion-sized particle reveals an RNA-dependent viral assembly pathway
Hepatitis E virus: identification of type-common epitopes
Mutational analysis of glycosylation, membrane translocation, and cell surface expression of the hepatitis E virus ORF2 protein