key: cord-1014919-pslg7eul
authors: Patel, Seema; Patel, Snigdha
title: Analysis of Ebola virus polymerase domains to find strain-specific differences and to gain insight on their pathogenicity
date: 2016-07-30
journal: VirusDisease
DOI: 10.1007/s13337-016-0334-8
sha: cdc339a7fc1848e5e3fad95ee783a931aac0e770
doc_id: 1014919
cord_uid: pslg7eul

Ebola virus, a member of the family Filoviridae, has caused immense morbidity and mortality in recent times, especially in West Africa. The infection, characterized by chills, fever, diarrhea, and myalgia, can progress to hemorrhage and death. Hence, a better understanding of its biology is a high priority in order to expedite vaccine development pipelines. In this regard, this study analyzes the domains in the RNA polymerase of fifteen publicly available Ebola isolates belonging to three strains (Zaire, Sudan and Reston). The protein FASTA sequences of the isolates belonging to the Zaire, Sudan and Reston strains were extracted from the UniProt database and submitted to the interactive web tool SMART to obtain the polymerase domain profiles. Subsequent in silico investigation furnished results that can contribute to the understanding of Ebola pathogenesis. The key findings and patterns have been presented, and based on them hypotheses have been formulated for further empirical validation.

A wide range of pathogens, including viruses, bacteria, protozoa, and fungi, afflict mankind. Viruses are particularly challenging to control because of their rapid antigenic variation and immune evasion [1]. Ebola virus, belonging to the family Filoviridae, has been the cause of high mortality in recent times [2, 3]. Another Filoviridae member, Marburg virus, has also been associated with lethal human pathogenesis [4, 5]. A study reports 50-90% fatality following infection with the hemorrhagic strains of these viruses [6].
According to a World Health Organization (WHO) report, out of 5335 cases recorded till September 2014, about 50% (2622 cases) led to death [4]. To contain the transmission and treat the vulnerable community in Africa, the Centers for Disease Control and Prevention (CDC) has deployed health care staff [7]. Because of the enormous public health risk, Ebola virus has been the focus of intense research in recent times. It was first discovered in Africa in 1976 and is now endemic to various countries in the region, including Sudan, Zaire, Uganda, Guinea, Liberia, Sierra Leone, and Congo. In fact, the name Ebola traces its origin to the Ebola river in Congo [8-12]. The virus spreads by contact with an infected person's body fluids such as blood, saliva, urine, and semen [13]. Symptoms of the infection include chills, fever, diarrhea, malaise, and myalgia, which can progress to hemorrhage and death [8]. It is a zoonotic disease with bats as major vectors [14]. Fruit bats from the Pteropodidae family have been validated as reservoirs (with the detection of virus-specific antibody in the bat serum) [15, 16], and insectivorous free-tailed bats from the Molossidae family have been suspected as vectors for this virus [17]. Also, evidence suggests transmission of the virus from chimpanzees and monkeys, and even pigs [8].

As is characteristic of most viruses, Ebola virus has diversified into several strains. The most-studied strains include Zaire, Sudan, Côte d'Ivoire (from Tai Forest reserve), Bundibugyo, and Reston, all of which originated in Africa, except the last one, which evolved in the Philippines [8, 18-20]. Chronologically, the outbreak-associated strains are Zaire (strain Mayinga-76), Sudan (strain Maleo-79), Tai Forest (strain Cote d'Ivoire-94), Zaire (strain Gabon-94), Zaire (strain Kikwit-95), Sudan (strain Uganda-00), and Zaire (diverse lineage, 2014) [21, 22].
Among the existing Ebola strains, Zaire is the most aggressive one, and is linked to most outbreaks [23].

Ebola virus is an enveloped, non-segmented, single-stranded, negative-sense RNA virus with a genome length of about 19,000 bases [8]. The RNA is coated in nucleocapsid, which in turn is covered in a glycoprotein-embedded membrane. The polyprotein comprises seven parts, including leader sequence, nucleoprotein, virion proteins (VP35 and VP40), glycoprotein, virion proteins (VP30 and VP24), and RNA-dependent RNA polymerase [8]. These components of the polyprotein are mostly conserved in terms of their amino acid length: nucleoprotein (739aa), VP35 (340aa), VP40 (326aa), glycoprotein (676-677aa), VP30 (288aa), VP24 (251aa), and polymerase (2212aa). The polyprotein has a 32aa-long conserved coiled coil region (LVSVTQHLAHLRAEIRELTNDYNQQRQSRTQT) at 2082-2113aa [24]. VP24 and VP35 act as transcription activators [25]. The former perturbs interferon signaling and the latter is an interferon antagonist; together they are capable of blocking production of interferons via STAT1 inhibition [25, 26]. VP40 is the matrix protein, which mediates virus-like particle budding [27]. Glycoprotein is the virulence factor that can be liberated or anchored to the membrane [6]. These conjugated proteins are secreted into the host extracellular space in diverse truncated isoforms [28]. Full-length glycoproteins measure 150-170 kDa, and they are inserted into the viral membrane through transcriptional editing [29]. These trimeric proteins with O-linked oligomannose glycans adhere to host cells and mediate fusion with the host membrane [6]. Attachment to the endothelial cells via Niemann-Pick C1 receptors (C-type lectin membrane proteins) is followed by replication of the virus [30].
Antigen-presenting cells (APCs) like macrophages and dendritic cells are targeted by the virus, which creates a barrage of cytokines such as interferon (IFN-α), interleukins (IL-2, IL-10), and tumor necrosis factor-α (TNF-α) [31]. Also, excessive apoptosis of T lymphocytes (T helper and T cytotoxic cells) and natural killer cells (NKC) has been reported [23]. In the advanced form of the infection, the complement cascade is activated, which clots blood and causes endothelial leakage, multi-organ failure, hypotension, and respiratory collapse [32]. Thus, antigenic subversion, characterized by immune suppression and inflammation, is described as a potent pathogenesis mechanism of this virus [8, 28], more or less akin to other deadly viral pathogens like dengue and SARS (severe acute respiratory syndrome) [32].

Excessive fluid loss, leading to hyponatremia, hypokalemia, hypocalcemia, hypomagnesemia, hypoalbuminemia, and hypoxemia (abnormally low oxygen level in blood), is characteristic of Ebola fever, which if untreated can cause shock and hemorrhage [33]. So, 'fluid replacement therapy' for replenishing the depleted electrolytes is a major support in averting the adverse effects [34]. In serious cases, vasoactive agents, hemodialysis and mechanical ventilation are recommended to prevent respiratory and circulatory collapse [35]. There are no Ebola-specific therapeutics yet [36]; however, several promising candidates are under intense trial. Monoclonal antibodies (MAbs) have been validated to target glycoproteins on the virus membrane. Though the MAb-glycoprotein interaction is still enigmatic, it has been revealed that MAbs bind to epitopes in the glycoprotein base, glycan cap, or mucin-like domain [37]. In this regard, a combination of MAbs termed ZMapp has shown considerable therapeutic promise [38, 39]. It can mitigate viremia and related abnormalities up to 5 days post-infection [38].
Favipiravir (T-705), an anti-influenza drug, has shown efficacy against this virus [40]. Ribavirin is another drug effective against many RNA viruses such as hepatitis C virus (HCV), Lassa virus, and respiratory syncytial virus (RSV) [41]. Studies have found a synergistic effect of the above two drugs in the management of hemorrhagic Ebola fever [42]. A synthetic adenosine analogue, BCX4430, is capable of inhibiting viral RNA polymerase function, as demonstrated in animal models [43]. Also, small interfering RNAs (siRNAs) are being tested to target the virus [12]. In this regard, phosphorodiamidate morpholino oligomers (PMO), a type of synthetic antisense molecule that blocks the mRNA coding for VP24 proteins, have shown promise [44]. Convalescent plasma (plasma from Ebola survivors) is under evaluation as a possible therapeutic [45]. To fine-tune the emerging drugs and to develop novel therapeutics, a keen knowledge of the protein domain configuration of Ebola virus is paramount.

This investigation used polymerase protein FASTA sequences of Ebola virus available in the public database UniProt (http://www.uniprot.org/uniprot/) [46]. Care was taken to retrieve sequences belonging to different strains of Ebola, i.e. Zaire, Sudan and Reston. For the protein domain information of the polymerase sequences, the public platform SMART (Simple Modular Architecture Research Tool) [24] was used. Using HMMer (for alignment) and BLAST (for bit score), SMART identifies and annotates domains, assigning them to families and illustrating their topologies [24]. Subsequently, the domain profiles in the polymerase sequences and their distribution patterns were analyzed using scripts developed in Bash. The scripts were constructed using commands such as awk, sort, grep and comm, together with while loops. The scripts included ebola_protein_domains.sh, ebola_data_manipulations.sh and ebola_protein_common.sh.
The script ebola_protein_domains.sh sorts the polymerase domains of each isolate alphabetically, counts the total number of domains for each isolate and then compares the domain profiles of each pair of isolates. The pair-wise comparison was meant to find domains unique to an isolate. The script ebola_data_manipulations.sh uses the output of ebola_protein_domains.sh as input and finds the domains common to each pair of isolates. The script ebola_protein_common.sh uses each isolate's polymerase domain list and searches for pathogenically critical domains like YARHG, WH1, RICTOR_M, Pro-kuma_activ, IENR1, DDHD, DALR_2, WSN, VWC, Telomerase_RBD, RasGAP, PA2c, MIT, YqgFc, TLC, STI1, RUN, RL11, RAP, R3H, LamG, HALZ, B41, HOLI, PLCYc, Hr1, H4, GGDEF, LPD_N, LON etc. On executing the scripts, the generated output files were ebola_data, ebola_data_analysis and ebola_domain_consensus. Relevant and interesting findings were extracted from these result files. The analysis covered domains common to all, shared among some, and unique to some polymerase sequences; strain-specific signatures and anomalies; and the relevance of the domains to pathogenesis. Based on the data, clusters were formed and tabulated. Also, hypotheses were formulated and insights were discussed, which are likely to be of relevance in better management of Ebola infection.

The 15 Ebola isolates were A0A0A7LUV3, A0A068J465, A0A0B5EB22, A0A0D5W8U2, A0A0E3TN89, A0A0F7IMH5, A0A0G2Y8I7, A0A0G2YD12, A0A068J9B1, Q5XX01, Q6V1Q2, Q8JPX5, Q91DD4, Q05318, and X5H5B6. The SMART-predicted number of domains in the polymerase ranged from 54 to 70 (some of them overlapping), of which the minimum was found in Q5XX01 (a Sudan Ebola virus) and the maximum in Q91DD4 (a Reston Ebola virus). All the Zaire isolates contained 61-69 domains. In total, 158 unique domains were observed in Ebola virus (though some of them overlap, owing to limitations of homology-based predictions).
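The set logic behind this kind of analysis (per-isolate domain counts, pair-wise unique and shared domains, and the pool of distinct domains) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' Bash scripts: the function names are hypothetical, and the isolate labels and domain lists below are made-up toy data that do not reproduce any result of the study.

```python
from itertools import combinations

# Toy input (hypothetical): isolate label -> list of SMART domain names.
# Labels and domain assignments are invented for illustration only.
toy_profiles = {
    "isolate_A": ["WH2", "TBC", "MIT", "BAG", "B41"],
    "isolate_B": ["WH2", "TBC", "MIT", "HisKA"],
    "isolate_C": ["WH2", "TBC", "HisKA", "VWC", "B41", "DDHD"],
}


def pairwise_comparison(profiles):
    """For every pair of isolates, list the domains unique to each and shared by both."""
    sets = {iso: set(domains) for iso, domains in profiles.items()}
    result = {}
    for first, second in combinations(sorted(sets), 2):
        result[(first, second)] = {
            "only_first": sorted(sets[first] - sets[second]),
            "only_second": sorted(sets[second] - sets[first]),
            "common": sorted(sets[first] & sets[second]),
        }
    return result


def summarize(profiles):
    """Count domains per isolate and collect the distinct and the shared (core) domains."""
    sets = {iso: set(domains) for iso, domains in profiles.items()}
    counts = {iso: len(domains) for iso, domains in profiles.items()}
    return {
        "counts": counts,
        "fewest": min(counts, key=counts.get),
        "most": max(counts, key=counts.get),
        "all_unique": sorted(set().union(*sets.values())),
        "core": sorted(set.intersection(*sets.values())),
    }


pairs = pairwise_comparison(toy_profiles)
summary = summarize(toy_profiles)

for pair, info in pairs.items():
    print(pair, info)
print(summary)
```

Sets are used here because the questions asked (unique to one isolate, common to a pair, present in all) are set differences and intersections; the per-isolate count is taken from the original list so that repeated domain hits would still be counted.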
This information (the isolates and their domain counts) has been presented in Table 1. SMART annotations of some crucial domains have been presented within the parentheses [24]. The core domains are WH2 (WASP-Homology 2 is an actin-binding motif), TBC (GTPase activator proteins), SNc (Staphylococcal nuclease homologues), SMI1_KNR4 (yeast cell wall assembly regulator SMI1 and the cell

The domains present in all Zaire strain isolates, but missing in some Reston and all Sudan strain isolate iso- . Further information about these domains can be obtained at http://smart.embl-heidelberg.de/browse.shtml [24]. The following domains are lacking in all Sudan and Reston isolates, and are also missing in one Zaire isolate. The domains include Y1_Tnp (transposase IS200 like

This virus is highly contagious and it has shown the potential to spread as an epidemic. Our understanding of this virus is still nascent and vaccine development is yet to succeed. In this scenario, precaution is the best strategy, which can be achieved by educating the vulnerable groups in the virus-endemic regions. Also, limiting interaction with wildlife vectors like primates and bats is required to avert outbreaks. Meanwhile, research should be continued to unravel pathogenesis mechanisms and factors. This study has contributed to this objective, and its key inferences have been discussed here.

Domain architectures are decisive in the catalytic functions of proteins, including their pathogenicity roles [47]. Results of this study indicate that despite the similar component structures in Ebola virus, the domain distribution varies immensely and might be the cause of the variable virulence of different strains. Some critical findings have been analyzed and interpreted here. DDHD, a domain with four conserved amino acid residues forming a metal binding site, is a conserved domain.
It is lacking in a Reston strain, suggesting the loss as one of the likely reasons for the loss of pathogenesis in this strain. Studies in other pathogens have shown that this domain has conserved aspartate and histidine residues, modification of which leads to loss of phospholipase activity and membrane trafficking [48]. The BAG domain plays a role as a co-chaperone for Hsp70 chaperones, supporting proper protein folding along with quality control and degradation pathways [50]. The role of this domain in regulating the heat shock protein quality check pathways can be correlated to the pathogenesis of the isolates harboring it.

Four Zaire isolates show anomalous behavior: A0A0A7LUV3 (Liberia-14), A0A0F7IMH5 (Liberia-14), Q6V1Q2 (Kikwit-95), and Q05318 (Mayinga-76). The last two Zaire isolates have similar features (a BAG domain and a shifted B41 domain), which suggest their phylogenetic proximity. Also, these two strains have been linked to large outbreaks. This leads to the hypothesis that the BAG domain might be their advantage. Isolate Q05318 (Mayinga-76) also has a mannose-specific lectin (B_lectin) and a chitin-binding domain (ChtBD3), which has been associated with host pathogenesis. ChtBD3 is present in isolate A0A0F7IMH5 (Liberia-14) as well. The ChtBD3 domain is present in serotype 3 of dengue virus, a deadly Flavivirus [51]. As the Mayinga-76 strain was associated with the very first outbreak, it can be hypothesized that this lectin and chitin-binding domain in the ancestral strain led to human infection, and that the virus evolved over time to lose it and diversify into other strains.

By contrast, the Sudan isolate Q5XX01 has the fewest domains (54), and it lacks otherwise well-conserved domains like VWC, YARHG, WH1, RICTOR_M, Pro-kuma_activ, IENR1, and B41, among others. Reston isolates lack otherwise well-conserved domains like B41, DDHD, Y1_Tnp, HOX, HOLI, PLCYc, Hr1, H4, GGDEF and LPD_N.
VWC, a von Willebrand factor C domain, is known to be involved in many developmental and pathological conditions via platelet activation [52]. However, knowledge of the role of this domain in infectious diseases is lacking, which obscures many critical links in pathogenesis. Table 3 contains the pertinent data.

There is considerable domain variation in this virus, even within isolates of the same strain. Some regions of the polymerase protein are conserved, while others are variable. Comparison of the two Reston isolates showed that, up to the CPSase_L_D3 domain at 1312-1361aa (i.e. 55-57 domains), the polymerase is conserved in both. Domain HisKA is present in the Sudan isolate at 316-375aa and in the Reston isolates at 2076-2138aa, while lacking in the Zaire strains. HisKA is a crucial sensor kinase in pathogens like bacteria [53], yet its absence in the aggressive Zaire strain seems enigmatic and ought to be investigated. Domain MIT, involved in microtubule manipulation, is present in all Zaire isolates (at 2128-2199aa) and in the Sudan isolate (at 2125-2196aa), while missing from the Reston isolates. This might be another likely reason that Reston isolates cannot infect humans.

Based on the findings, some investigation-worthy hypotheses have been made. The virus protein domain profiles and their functions suggest that the pathogenesis mechanism is not much different from that of other lethal viruses such as dengue. In this regard, drug repurposing to control Ebola virus seems pragmatic [54]. A limitation of this work is that most of the analyzed isolates are from the Zaire strain, and only a few are from the Sudan and Reston strains. Also, the presence or absence of only a few domains has been discussed here, though based on the results, a literature search can yield other relevant clues. The work carried out here can also be replicated with more Ebola virus polymerase and other protein sequences to garner further insights on pathogenicity determinants and strain-specific features.
This study furnished critical information regarding the polymerase protein domain diversity within the Ebola virus and related it to the variable virulence characteristics of its strains. The comparative analysis illuminated many proteomic features of the lethal virus. It is clear that domain organization dictates the virulence profile of different strains. Analyzing more isolates will eliminate inadvertent bias in interpretations. Presently, Ebola might be restricted to certain parts of the world, but its case fatality rate, which can reach 90%, is among the highest of all pathogens. In this regard, the work presented here is crucial in expanding our understanding of this Filovirus.

[1] Viral escape mechanisms-escapology taught by viruses
[2] "Filoviruses": a real pandemic threat?
[3] Ebola: a holistic approach is required to achieve effective management and control
[4] Ebola virus disease: a review on epidemiology, symptoms, treatment and pathogenesis
[5] Marburg virus infection in nonhuman primates: Therapeutic treatment by lipid-encapsulated siRNA
[6] Ebolavirus glycoprotein structure and mechanism of entry
[7] Ebola in West Africa-CDC's role in epidemic detection, control, and prevention
[8] Ebola haemorrhagic fever
[9] Ebola viral disease outbreak-West Africa
[10] Ebola transmission linked to a single traditional funeral ceremony
[11] Ebola virus disease outbreak in West Africa
[12] Treatment of ebola virus disease
[13] Ebola virus persistence in semen ex vivo
[14] Bats: important reservoir hosts of emerging viruses
[15] Mapping the zoonotic niche of Ebola virus disease in
[16] Ebolavirus and other filoviruses
[17] Investigating the zoonotic origin of the West African Ebola epidemic
[18] Molecular architecture of the nucleoprotein C-terminal domain from the Ebola and Marburg viruses
[19] Ebola hemorrhagic fever associated with novel virus strain, Uganda
[20] Outbreak of ebola virus disease in Guinea: where ecology meets economy
[21] Phylogenetic analysis of Guinea 2014 EBOV Ebolavirus outbreak
[22] Ebola outbreak in Western Africa 2014: what is going on with Ebola virus?
[23] Human fatal zaire Ebola virus infection is associated with an aberrant innate immunity and with massive lymphocyte apoptosis
[24] SMART: identification and annotation of domains from signalling and extracellular protein sequences
[25] Different temporal effects of Ebola virus VP35 and VP24 proteins on global gene expression in human dendritic cells
[26] Filovirus replication and transcription
[27] Host IQGAP1 and Ebola virus VP40 interactions facilitate virus-like particle egress
[28] Antigenic subversion: a novel mechanism of host immune evasion by Ebola virus
[29] Ebola virus pathogenesis: implications for vaccines and therapies
[30] Ebola viral glycoprotein bound to its endosomal receptor Niemann-Pick C1
[31] Biomarkers for understanding Ebola virus disease
[32] Complement and viral pathogenesis
[33] Clinical management of Ebola virus disease in the United States and Europe
[34] Saint André-von Arnim A. Clinical presentation and management of severe Ebola virus disease
[35] Ebola virus disease: an update for anesthesiologists and intensivists
[36] State-of-the-art workshops on medical countermeasures potentially available for human use following accidental exposures to Ebola virus
[37] Mechanism of binding to Ebola virus glycoprotein by the ZMapp, ZMAb, and MB-003 cocktail antibodies
[38] Reversion of advanced Ebola virus disease in nonhuman primates with ZMapp
[39] Progress of vaccine and drug development for Ebola preparedness
[40] Successful treatment of advanced Ebola virus infection with T-705 (favipiravir) in a small animal model
[41] Implications of high RNA virus mutation rates: lethal mutagenesis and the antiviral drug ribavirin
[42] Low-dose ribavirin potentiates the antiviral activity of favipiravir against hemorrhagic fever viruses
[43] Protection against filovirus diseases by a novel broad-spectrum nucleoside analogue BCX4430
[44] A single phosphorodiamidate morpholino oligomer targeting VP24 protects rhesus monkeys against lethal Ebola virus infection
[45] The use of Ebola convalescent plasma to treat Ebola virus disease in resource-constrained settings: a perspective from the field
[46] The UniProt Consortium. The universal protein resource (UniProt)
[47] Evolution of domain architectures and catalytic functions of enzymes in metabolic systems
[48] Roles of SAM and DDHD domains in mammalian intracellular phospholipase A1 KIAA0725p
[49] Getting a GRASP on CASP: properties and role of the cytohesin-associated scaffolding protein in immunity
[50] Hsp110/Grp170, HspBP1/Sil1 and BAG domain proteins: nucleotide exchange factors for Hsp70 molecular chaperones
[51] Global spread of dengue virus types: mapping the 70 year history
[53] Genetic and biochemical dissection of a HisKA domain identifies residues required exclusively for kinase and phosphatase activities
[54] Drug repurposing to target Ebola virus replication and virulence using structural systems pharmacology