key: cord-009614-lbjesv8y authors: Durmuş Tekir, Saliha D.; Ülgen, Kutlu Ö. title: Systems biology of pathogen‐host interaction: Networks of protein‐protein interaction within pathogens and pathogen‐human interactions in the post‐genomic era date: 2012-11-29 journal: Biotechnol J DOI: 10.1002/biot.201200110 sha: doc_id: 9614 cord_uid: lbjesv8y
Infectious diseases comprise some of the leading causes of death and disability worldwide. Interactions between pathogen and host proteins underlie the process of infection. Improved understanding of pathogen‐host molecular interactions will increase our knowledge of the mechanisms involved in infection, and allow novel therapeutic solutions to be devised. Complete genome sequences for a number of pathogenic microorganisms, as well as the human host, have led to the revelation of their protein‐protein interaction (PPI) networks. In this post‐genomic era, pathogen‐host interactions (PHIs) operating during infection can also be mapped. Detailed systematic analyses of PPI and PHI data together are required for a complete understanding of pathogenesis of infections. Here we review the striking results recently obtained during the construction and investigation of these networks. Emphasis is placed on studies producing large‐scale interaction data by high‐throughput experimental techniques.
Despite immense technological advances in medicine, pathogenic organisms remain the source of much human morbidity and mortality. HIV/AIDS, acute lower respiratory tract infections, hemorrhagic fever, diarrheal diseases, tuberculosis and malaria are particularly notorious for high mortality rates [1] [2] [3]. The continuous emergence of new diseases and drug-resistant pathogens has heightened the global burden of infectious diseases in the 21st century [1, 4].
To tackle such biological threats, an improved understanding of pathogenic microorganisms and their interactions with host organisms is needed, since pathogen-host molecular interactions have crucial roles in initiating, sustaining, or preventing infection. Pathogenic microorganisms communicate with human cells through interactions with human proteins both on the surface of the cell and within the interior of the cell. These interactions allow the microorganisms to enter the host cell and manipulate cellular mechanisms in order to use the host cell's capabilities to their own advantage, resulting in infection in the host organism. Detailed knowledge of pathogen-host protein interactions may enable us to comprehend the mechanisms of infection and to identify better strategies to prevent or cure infection [5, 6]. However, the identification of new drug and vaccine targets for infectious diseases is only possible when the molecular machinery within individual pathogenic and host organisms is understood. For instance, anti-infection therapeutics should target essential genes in the pathogens which have no homology with human genes [7]. The very first genome sequence was published in 1977: the DNA sequence of the genome of a virus, bacteriophage phiX174 [8]. Following the sequencing of the bacterial pathogen Haemophilus influenzae in 1995 [9] and the human genome in 2000 [10], sequence data for prokaryotic and eukaryotic genomes have appeared at an accelerated rate. Today, genomic data for most of the pathogen and host organisms are available [11]. These data are used to study individual genes and corresponding proteins as well as to identify intra- and interspecies connections between proteins. In the light of these advances, the initial steps towards complete understanding of infection mechanisms through protein interactions have recently been published.
In this review, efforts toward the systematic determination and analysis of protein interaction networks underlying infection pathogenesis are summarized (mainly in chronological order) to present the current picture of research on infectious diseases. From a classical perspective, a protein is a functional unit that specifies a small, but discrete, part of the cellular physiology of an organism. In the post-genomic era, a protein is seen to function as an element within a network of its interactions, and its role should be evaluated within this network together with its interacting partners [12]. Advances in genomics and proteomics have been followed by the first large-scale efforts to identify functional networks of interacting proteins using the two-hybrid method [13] [14] [15], pull-down assays [16, 17], and protein chips [18]. To increase our understanding of the mechanisms of infection, protein-protein interaction (PPI) networks of pathogenic organisms should be determined in order to capture their functional and structural organizations. Pathogenic PPI maps reveal biological pathways and processes, allowing prediction of protein functions and discovery of new drug and vaccine targets. The first genome-wide protein interaction networks were determined for viruses [19] [20] [21]. The first large-scale bacterial networks [22] [23] [24] followed successes in eukaryote mapping [15] [25] [26] [27]. Today, the genome-wide PPI maps for a number of pathogens and hosts are available in public databases: BIND [28], BioGrid [29], DIP [30], HPRD [31], IntAct [32], MINT [33], MIPS [34], Reactome [35] and STRING [36]. Primarily due to their small genome size, whole genome PPI maps were first constructed for viruses. The first interaction map of a whole proteome was determined for Escherichia coli bacteriophage T7, mapping 25 interactions among viral proteins [19].
Subsequently, genome-wide analyses of important human pathogens, hepatitis C virus [20, 37], vaccinia virus [21], herpesviruses [38, 39], and SARS coronavirus [40, 41] were performed through intraviral PPI maps. Hepatitis C virus (HCV), a flaviviridae family member causing severe liver disease, is a positive-sense single-stranded RNA virus. It encodes only a single polyprotein which is co- or post-translationally processed into at least 10 viral proteins [42]. A controlled two-hybrid strategy based on a random genomic HCV library screen was used by Flajolet et al. [20], resulting in the identification of known and novel PPIs. Interactions among structural and non-structural proteins were revealed in the study, leading to the conclusion that almost all of the viral proteins encoded by the genome function in the HCV life-cycle, as in the cases of other members of the flaviviridae [43]. The roles of these functional interactions were discussed within the framework of the constructed genome-wide interaction map. Interacting domains of the viral polyprotein were also identified to shed light on the development of anti-viral agents [20]. Another genome-wide PPI map of HCV was then generated for the viral non-structural proteins [37]. Vaccinia virus, well-known as a smallpox vaccine and also the source of potential recombinant vaccines against cancer and infectious diseases, is a member of the poxviridae family. It is a large, double-stranded DNA virus. Poxviruses replicate themselves in the cytoplasm of the host cells without depending on the host's transcriptional machinery. The large genome of vaccinia virus can potentially express more than 200 proteins [44, 45]. McCraith et al. [21] performed a comprehensive two-hybrid analysis of full-length vaccinia virus proteins and detected 37 PPIs (including 28 novel interactions) among both characterized and uncharacterized proteins.
Many of the PPIs mapped involved one partner which was known to function in a specific process, coupled with another of unknown function, allowing functions to be assigned to previously unannotated proteins within DNA replication, transcription, virion structure, or host evasion. Another double-stranded DNA virus family is herpesviridae whose members encode 70-170 proteins. Herpesviruses cause human diseases such as Kaposi sarcoma, B-cell lymphomas, chickenpox, shingles, and nasopharyngeal carcinoma [46] [47] [48] . The genome-wide intraviral protein interaction maps for three members of this family, Kaposi sarcoma-associated herpesvirus (KSHV), varicella-zoster virus (VZV), and Epstein-Barr virus (EBV) were generated by two-hybrid and analyzed comprehensively to reveal viral network properties [38, 39] . In the work of Uetz et al. [38] , 123 PPIs for KSHV and 173 PPIs for VZV were identified, the largest dataset published to date, allowing the construction of the first viral networks. Topological network analyses of these interactome maps indicated that the viral networks appear as a single, highly coupled module ( Fig. 1 ) with relatively many hubs and few peripheral nodes [38] in contrast to scale-free cellular networks with well-separated functional modules [49, 50] . Just after this study was published, Calderwood et al. [39] reported the detection of 43 PPIs among EBV proteins. The construction of a PPI map for EBV by merging these interactions with already published ones resulted in a network of 52 proteins with 60 interactions. This large-scale network allowed the prediction of functions of uncharacterized proteins, further defining viral mechanisms. In these consecutive studies [38, 39] , core proteins common to all herpesviruses and noncore ones specific to each strain were investigated thoroughly. 
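The contrast drawn above between viral networks (many hubs, few peripheral nodes) and scale-free cellular networks rests on the degree distribution of each network. The following is a minimal sketch of how such a distribution, and the fraction of highly connected nodes, could be computed from a list of binary interactions. The edge list, the function names, and the degree cutoff for calling a node a hub are illustrative assumptions and are not taken from the reviewed studies.

```python
from collections import Counter, defaultdict


def degree_distribution(edges):
    """Return {degree: fraction of nodes with that degree} for an undirected edge list."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # self-interactions do not contribute to the degree here
        neighbors[a].add(b)
        neighbors[b].add(a)
    counts = Counter(len(partners) for partners in neighbors.values())
    total = len(neighbors)
    return {k: counts[k] / total for k in sorted(counts)}


def hub_fraction(edges, min_degree):
    """Fraction of nodes whose degree is at least min_degree (cutoff is an assumption)."""
    dist = degree_distribution(edges)
    return sum(frac for k, frac in dist.items() if k >= min_degree)


# Toy, made-up interaction list (protein names are placeholders).
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"),
             ("B", "C"), ("B", "D"), ("C", "D"), ("D", "E")]

toy_dist = degree_distribution(toy_edges)
toy_hubs = hub_fraction(toy_edges, min_degree=3)
```

In a scale-free network the fractions fall off roughly as a power law of the degree, so most nodes sit at low degree; a network dominated by highly connected nodes, as in this toy example, would instead show most of its mass at the high-degree end.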
The severe acute respiratory syndrome coronavirus (SARS-CoV) is a positive-sense single-stranded RNA virus belonging to coronaviridae, the family of the largest RNA viruses known. Its genome encodes 14 open reading frames expressing up to 30 structural and non-structural proteins that have roles in viral replication, assembly, and other functions for viral amplification in host cells [40, 41, 51]. For a genome-wide analysis of PPIs of SARS-CoV, interactions between all SARS-CoV proteins were determined by two-hybrid in two studies [40, 41], producing 65 and 40 interactions, respectively. Intraviral PPIs were analyzed to elucidate the functions of the proteins as well as to identify the essential proteins in viral replication. von Brunn et al. [40] compared the intraviral network topology of SARS-CoV with a previously defined viral network [38] and cellular networks [52] [53] [54], concluding that the SARS-CoV network shares similarities with the KSHV network [38]. Insights gained into molecular mechanisms and topological network properties provided by the genome-wide analyses of intraviral PPI maps (Table 1) may be used as a basis for further characterization of the functions and mechanisms of viral proteins, especially for other members of the same virus families. After genome-wide PPI maps had been successfully built for viruses, similar two-hybrid methodology was applied to construct PPI networks for the larger, more complex genomes of pathogenic bacteria. The first prokaryotic PPI map was built for Helicobacter pylori [22]. Other large-scale prokaryotic networks eventually emerged for Campylobacter jejuni [55], Treponema pallidum [56], Mycobacterium tuberculosis [57], and Bacillus subtilis [58]. Genome-scale analyses of interacting proteins that assemble into protein complexes were performed for E. coli [23, 24] and Mycoplasma pneumoniae [59].
The first large-scale intrabacterial PPI map was constructed for the human gastric pathogen and gram-negative bacterium H. pylori, identifying 1280 interactions that connect 46.6% of the proteome, from two-hybrid screens of 261 bacterial proteins [22]. The comparison of these H. pylori PPIs with previously described interactions between orthologous E. coli proteins resulted in prediction of protein functions within biological pathways such as chemotaxis and urease activity, essential for H. pylori pathogenicity. In this study, the interacting domains of H. pylori proteins were also identified and used in protein function predictions. Interacting domains may serve in mapping new functional domains, providing crucial information for antibacterial drug design studies. The gram-negative bacterium E. coli, the main cause of urinary tract infections and a model bacterial system, is one of the best characterized and earliest studied organisms [60] [61] [62]. However, no large-scale analysis of protein complexes in E. coli was performed until the studies of Butland et al. [23] and Arifuzzaman et al. [24]. First, 716 binary interactions involving 83 essential and 152 nonessential proteins were identified by pull-down assay using tandem affinity purification-mass spectrometry, targeting 1000 ORFs (about one-quarter of the E. coli genome) [23]. A small fraction (15%) of these interactions were already available in DIP, BIND, and STRING. Ten newly described E. coli PPIs were found to be orthologous to the interactions reported for H. pylori [22]. The novel interactions were analyzed for functional annotations of uncharacterized proteins, allocating them within ribosome function, RNA processing, RNA binding, and so on. The graph theoretical analysis of the PPI map of E. coli revealed scale-free behavior and a high correlation between connectivity and the degree of conservation. The genome-wide PPI map of E. coli K-12 strain with 11 511 interactions among 2667 proteins was then constructed by a similar method [24].
The comprehensive analysis of this large-scale network also validated the scale-free nature and the connectivity-conservation correlation found previously [23]. Arifuzzaman et al. [24] identified 107 functional units which have roles in metabolic pathways, transcriptional and translational machinery, recombination and flagella assembly. Analysis of PPIs based on this functional unit categorization provided further functional annotations. The gram-negative, food-borne pathogen C. jejuni is the major cause of gastroenteritis. The proteome-level analysis revealed 11 687 interactions involving 80% of 1654 C. jejuni proteins [55], the most comprehensive bacterial PPI map determined by two-hybrid. A scale-free network was obtained after removing interactions with low confidence scores. This PPI map of C. jejuni was used to identify evolutionarily conserved sub-networks through comparison with protein networks of H. pylori [22], E. coli [23] and Saccharomyces cerevisiae in DIP. Further analyses of the identified conserved sub-networks allowed the prediction of new C. jejuni interactions using orthologous interactions. This comparative analysis also enabled the identification of essential C. jejuni genes based on their orthology to essential genes in other organisms. These comprehensive interactome data were next used to predict protein roles and to map functional pathways such as chemotaxis. The causative agent of syphilis, T. pallidum, has one of the smallest genomes known in extracellular bacteria, encoding 1039 proteins [63]. The global PPI network of T. pallidum, involving 3649 interactions connecting 726 bacterial proteins, was identified by two-hybrid [56]. The high-confidence subset connects 576 proteins by 991 interactions. In that study, an integrated network of DNA-metabolism related processes was constructed and 18 proteins were functionally annotated within this network.
Additionally, various orthologous interactions were predicted for completely sequenced genomes, allowing the description of phylogenetically conserved interaction patterns. The human pathogen M. pneumoniae, which causes atypical pneumonia, also has one of the smallest genomes among self-replicating organisms, with 689 protein-encoding genes, making it a good model organism to study proteome organization in prokaryotes [64]. A proteome-wide analysis was performed by tandem affinity purification-mass spectrometry, identifying 62 homo-multimeric and 116 hetero-multimeric protein complexes [59]. About a third of the hetero-multimeric complexes found were observed to interact with proteins forming 35 larger, multiprotein complexes, implying a higher level of proteome organization and protein multifunctionality, and allowing functional annotations of assemblies as well as prediction of biological roles of individual proteins within the complexes. M. tuberculosis causes millions of deaths each year with tuberculosis infection [65]. After computational efforts to construct large-scale PPI maps of M. tuberculosis [66, 67], its genome-wide network was identified experimentally by two-hybrid [57]. This global network is composed of 8042 interactions among 2907 proteins which represent 74.1% of the whole proteome. The topological properties of the undirected network of these interactions were calculated and compared with those of the previously defined prokaryotic PPI networks [22-24, 55, 56]. Similar scale-free behavior following a power-law distribution was observed. However, the networks obtained by pull-down assay [23, 24] differ in clustering coefficient values from the networks obtained by two-hybrid analysis [22, 55, 56]. Moreover, Wang et al. [57] performed a cross-species network comparison analysis of M. tuberculosis interactions with the available large-scale PPI data [22-24, 55, 56] and identified conserved sub-networks.
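The prediction of interactions through orthology, mentioned above for C. jejuni and T. pallidum, can be illustrated with a short sketch: an interaction observed between two proteins in one species is transferred to another species when both partners have orthologs there. The function name, the data layout, and the identifiers below are hypothetical; the reviewed studies may have applied additional filters (for example, confidence scores) that are not shown here.

```python
def predict_interologs(source_ppis, orthologs):
    """Transfer interactions from a source species to a target species.

    source_ppis: iterable of (protein_a, protein_b) pairs in the source species.
    orthologs:   dict mapping a source protein to a set of target-species orthologs.
    Returns a set of frozensets, each holding one predicted target-species pair.
    """
    predicted = set()
    for a, b in source_ppis:
        # A pair is transferred only if both partners have at least one ortholog.
        for target_a in orthologs.get(a, ()):
            for target_b in orthologs.get(b, ()):
                predicted.add(frozenset((target_a, target_b)))
    return predicted


# Toy, made-up data: "s*" are source-species proteins, "t*" target-species proteins.
toy_source_ppis = [("s1", "s2"), ("s2", "s3")]
toy_orthologs = {"s1": {"t1"}, "s2": {"t2", "t2b"}}  # s3 has no known ortholog

toy_predictions = predict_interologs(toy_source_ppis, toy_orthologs)
```

Here the s1-s2 interaction yields two predicted pairs because s2 has two orthologs, while the s2-s3 interaction yields none because s3 has no ortholog in the target species.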
Additionally, the highly connected critical proteins and mechanisms of the protein secretion pathways which have roles in its pathogenesis were revealed. A large-scale PPI network was recently constructed for the gram-positive bacterium B. subtilis (which is rarely pathogenic) by two-hybrid [58]. This network of 793 interactions involves 287 bacterial proteins. Due to its role as a model organism, many studies were performed to characterize the biological functions of its PPIs in cellular processes [68] [69] [70]. However, many processes remained uncharacterized. Hence Marchadier et al. [58] performed a comprehensive analysis with the integration of transcriptomic data focusing on cell division, cell responses to stresses, the bacterial cytoskeleton, DNA replication and chromosome maintenance. These sequential efforts to construct large-scale PPI networks for prokaryotes (Table 1) constitute the first comprehensive description of the intraspecies mechanisms of the bacterial pathogens. The protozoan pathogen Plasmodium falciparum causes malaria, which results in the deaths of nearly a million people each year [71]. A comprehensive protein interaction map of this pathogen was generated by two-hybrid, identifying a highly interconnected, scale-free network of 2823 interactions within 1267 proteins (~25% of the predicted P. falciparum proteins) [72]. In this network, 33% of the interactions are between two uncharacterized proteins whereas 49% of the interactions include one such protein. Bioinformatic analysis of this network yielded functional annotations of the proteins within the following processes: chromatin modification, transcription, messenger RNA stability, ubiquitination, and invasion of host cells. More detailed studies of PPIs within P. falciparum are required in order to unravel its pathogenesis mechanisms thoroughly. Despite the increasing rate of identification of genome-wide PPI networks, they remain unconstructed for most pathogens.
In the light of accelerating advances in genomics, proteomics, and interactomics, large-scale maps for many more organisms are expected to be built in the near future. Increasing numbers of PPI networks will allow the comparison of networks across diverse organisms, resulting in generalized conclusions about pathogenic molecular mechanisms. The first examples of such comparative studies have been highlighted in Sections 2.1 and 2.2 above. Integration of several high-throughput interaction datasets to generate more detailed networks is also possible, as indicated by recent examples for the E. coli system [73, 74]. The frequency of such integrated networks is expected to increase, owing to the large number of diverse data sets. These will be invaluable in defining whole proteomic maps of the pathogens. One of the most striking results of bioinformatic analyses on the constructed PPI maps is the identification of essential proteins functioning within pathogens. These proteins should be examined thoroughly to test their potential as novel therapeutic targets. The exploration of genome-wide PPI maps of the pathogens permits the assignment of unannotated proteins to biological pathways with function prediction. The proteins annotated to the host invasion processes may provide a launching point for pathogen-host interaction studies. Biochemical interactions of pathogens with their hosts are necessary to invade the host organism. These connections between pathogens and hosts include interactions between proteins, nucleotide sequences, and small ligands [75, 76]. However, the protein interactions of pathogen-host systems have been identified as the most important, and therefore the most studied, type of pathogen-host interactions (PHIs) [76, 77]. Since these interspecies crosstalks determine the pathogenesis, focusing on the whole PHI system, instead of investigating a pathogen or host individually, may allow us to capture critical mechanisms (i.e., strategies used by pathogens and host immune responses) during infection that cannot be provided by traditional methods.
Due to a lack of sufficient experimental PHI data until recent years, many computational PHI prediction methods have been developed [78] [79] [80] [81] [82] [83] [84]. These studies focused mainly on interactions of P. falciparum and human immunodeficiency virus (HIV), as these are some of the most threatening pathogens to humans. Very recently, experiments have been carried out to determine the first large-scale molecular interactions between human and viruses [39, 85, 86] and bacteria [87, 88]. As a result of an increase in data available for pathogen-host systems, PHI-specific databases have been introduced such as PHI-base [89], VirusMINT [90], VirHostNet [91], PATRIC [92], and PHISTO [93]. Although these advances in data archiving are promising, most data relevant to PHI are still buried in the biomedical literature. A few efforts have been made to obtain hidden PHIs from the literature by text mining [94] [95] [96]. As in the case of intraspecies pathogen PPIs, large-scale PHI data were generated for viral systems before bacterial systems (Table 2). The first examples are for commonly observed human pathogens, EBV [39], HCV [85] and influenza A virus (H1N1 and H3N2) [86], and then recently for HIV [97]. In Calderwood et al. [39], protein interactions between the herpesvirus EBV and human were mapped by two-hybrid in conjunction with EBV intraviral PPI mapping, providing 173 PHIs between 40 EBV proteins and 112 human proteins. A systematic analysis of these interactome maps of PPIs and PHIs enabled hypotheses of the roles of EBV proteins in pathogenesis to be generated. Furthermore, intraspecies protein interaction data for human were integrated from databases (BIND, DIP, HPRD, MIPS) and from the literature [52, 53] to analyze the organization of the human proteins targeted by EBV within human molecular machinery.
It was found that EBV proteins tend to target human proteins which are highly connected (hubs) and central to many paths (bottlenecks) in the human PPI network. On the other hand, the degree distribution of the EBV-human protein interaction network could not be fitted to any model because of its incompleteness (Fig. 2). Attempts to analyze incomplete maps of PPIs and PHIs are still able to supply a partial understanding of mechanisms underlying infection. A similar thorough analysis was performed earlier with herpesviral protein networks of KSHV and VZV and their interaction with the human proteome [38]. In that study, protein interactions between herpesviruses and human were predicted using the interacting orthologs of both proteins in other organisms [54]. Combined virus-human networks were constructed by starting with the viral networks, adding their human protein targets, and then adding the cellular interactions among the targeted human proteins. The topological analyses of the combined herpesvirus-human networks revealed properties distinct from both viral and human interactomes, providing insights into the impact of the two organisms on each other [38]. A proteome-wide PHI map for the flavivirus HCV was constructed by two-hybrid and then extended by literature mining of previously found interactions between HCV and human [85]. A map of 481 interactions between 11 HCV proteins and 421 human proteins was generated (314 PHIs by two-hybrid). Novel interactions made up 65% of this PHI network. The integrated human network of 44 223 PPIs among 9520 proteins [98] was used to evaluate the interplay between HCV and human. Behavior very similar to that of EBV [39] was observed for HCV in terms of attacking hub and bottleneck proteins in the human network. To assess the human pathways targeted by HCV, KEGG functional annotation pathways [99] were used. Four pathways were detected to be enriched in HCV-targeted human proteins.
Three of them, the insulin, TGF-β and Jak/STAT pathways, were already associated with HCV clinical syndromes. The last enriched pathway, focal adhesion, is a novel observation as a human pathway affected during HCV infection [85]. Influenza A virus is a negative-sense single-stranded RNA virus of the orthomyxoviridae family. It is the source of all flu pandemics and infects multiple species. For the H1N1 A/PR/8/34 strain of influenza virus, 31 intraviral PPIs among 10 viral proteins and 135 PHIs between 10 viral and 87 human proteins, most of which are expressed in primary human bronchial cells, were detected by two-hybrid [86]. Some of the PHIs identified had been published previously [100]. The topology of the constructed intraviral network revealed a highly interconnected nature, as observed previously for other viral networks [38, 101]. In the case of the influenza A-human interaction network, important properties regarding the connectivity of proteins were observed. First, viral proteins interact with a significant number of human proteins, reflecting the multifunctionality of the small number of proteins encoded in RNA viruses. Second, each of 24 human proteins connects with two or more viral proteins, forming virus-human multiprotein complexes. Additionally, it was observed that viral proteins generally target human proteins which are highly connected within their own network, as was the case in the herpesvirus-human system [39]. In Shapira et al. [86], another PHI network was identified for a second strain of influenza virus, H3N2 A/Udorn/72, by the same experimental approach. This PHI network consists of 81 interactions between 10 viral and 66 human proteins, reflecting a similar nature to the network for the H1N1 strain-human system. This confirms the conserved functions of influenza virus proteins across strains.
Besides direct physical interactions between viral and human proteins, host responses in bronchial cells to influenza infection were identified by expression profiling, generating a regulatory map of interactions between influenza proteins and their human targets. Comprehensive analysis of the physical and regulatory maps of the PHI system elucidated human mechanisms involved in infection. For example, NF-κB, mitogen-activated protein kinase, apoptosis, and Wnt signaling pathways are regulated through transcriptional and/or physical interactions during influenza A infection. One of the most dangerous human pathogens, HIV, belongs to the positive-sense single-stranded RNA virus family retroviridae. HIV, which causes acquired immunodeficiency syndrome, has been extensively studied since its first observation near the end of the 20th century [102] [103] [104] [105]. Similar to other RNA viruses, HIV has a small genome and depends largely on human cellular machinery to be replicated. Identifying the physical contacts between HIV and human proteins during HIV replication is critically important for a full understanding of HIV infection. Because HIV is one of the most studied pathogens, many PHI data for HIV-1 are available in VirusMINT and PHISTO. The current PHI data have been produced mainly by small-scale experiments [106] [107] [108]. Very recently, a global PHI network was generated for HIV-human protein complexes by affinity tagging and purification mass spectrometry, producing 497 PHIs between 16 HIV-1 proteins and 435 human proteins [97]. It was observed that HIV-targeted human proteins are highly conserved across primates. The novel interactions identified in that study require further work to detail their biological significance in terms of HIV infection. Besides whole proteins, domains of the interacting proteins were investigated and the enriched domain types in targeted human proteins were indicated to facilitate future structural modeling studies of the HIV-human system.
The first large-scale interaction networks between viruses and humans [39, 85, 86, 97] provide crucial clues about viral infections, verifying the critical importance of PHI analyses in infection research. Until very recent years, PHI data were scarce for bacterial systems because of a lack of large-scale experiments. The first extensive bacterial PHI networks were identified for the important human pathogens Bacillus anthracis, Francisella tularensis, and Yersinia pestis [87]; then another high-throughput experimental study generating PHI data for Y. pestis was reported [88]. The gram-positive bacterium B. anthracis and the gram-negative bacteria Y. pestis and F. tularensis are respiratory pathogens causing anthrax, bubonic plague, and acute pneumonic disease, respectively. Using a two-hybrid assay, large-scale interaction data were generated between these bacteria and human, producing 3073 PHIs between 943 B. anthracis proteins and 1748 human proteins, 4059 PHIs between 1218 Y. pestis proteins and 2108 human proteins, and 1383 PHIs between 349 F. tularensis proteins and 999 human proteins [87]. The first conclusion of computational analyses of these comprehensive bacteria-human networks, in combination with the integrated human PPI network from databases BIND, DIP, HPRD, IntAct, MINT, MIPS, and Reactome, was that bacterial proteins tend to target hubs and bottlenecks in the human network. Secondly, the roles of human proteins targeted by these bacteria were investigated using their gene ontology annotations [109]. The tendency of all three pathogens to target human proteins involved in immune responses was observed, as previously reported [110] [111] [112]. Besides being effectors of immune signaling, the bacteria-targeted human proteins also have crucial roles in apoptosis [87]. Thirdly, the conserved protein interaction modules of the three PHI networks were computed [113, 114] for a more systematic comparative analysis.
Conserved modules revealed common attacks by the bacterial pathogens on the same human pathways. Subsequently, another PHI map was generated for the plague-causing Y. pestis by a different two-hybrid strategy, choosing only potential virulence factors as bait proteins [88]. This yielded 204 PHIs between 66 Y. pestis proteins and 109 human proteins, and 23 previously reported PHIs were then integrated to construct a comprehensive network between Y. pestis and human. A graph theoretic analysis confirmed that Y. pestis preferentially targets hub and bottleneck proteins in the human PPI network, as concluded previously for viruses [39, 86] and bacteria [87]. Signaling pathways crucial for the human immune system were found to be enriched in human proteins targeted by Y. pestis. These pathways include mitogen-activated protein kinase signaling and Toll-like receptor signaling, and also pathways functioning in focal adhesion, regulation of cytoskeleton, and leukocyte transendothelial migration. Finally, Y. pestis-targeted human proteins were compared with those targeted by viruses whose PHI networks were identified previously. Sixteen of the 109 Y. pestis-targeted human proteins are included in PHI networks of EBV [39] and HCV [85], indicating the common infection strategies of both viruses and bacteria. The recently determined first large-scale PHI networks of bacteria-human systems [87, 88] contribute largely to the understanding of bacterial infection mechanisms with immune evasion. As PHI data available for various pathogens increase, a need arises to analyze comprehensive PHI data for all pathogen types together in order to draw a generalized picture. Although infection mechanisms of individual pathogens have been studied through intraspecies pathogenic PPI maps and interspecies PHI maps, a general overview of infection mechanisms was missing until analyses of PHI data from different infection agents were attempted [6, 93].
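The pathway enrichment results reported above (KEGG pathways enriched in HCV- or Y. pestis-targeted human proteins) depend on a statistical test of over-representation. The source does not state which test each study used; one common choice is the one-sided hypergeometric test, sketched below as an assumption. The function name and all numbers in the example are made up for illustration.

```python
from math import comb


def hypergeom_enrichment_p(population, pathway_size, targeted, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    population:   number of annotated human proteins considered (background)
    pathway_size: number of background proteins annotated to the pathway
    targeted:     number of background proteins targeted by the pathogen
    overlap:      number of targeted proteins that belong to the pathway
    """
    upper = min(pathway_size, targeted)
    total = comb(population, targeted)
    # Sum the probabilities of observing `overlap` or more pathway members
    # among the targeted proteins when targets are drawn at random.
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, targeted - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy, made-up numbers: 20 background proteins, 5 in the pathway,
# 4 targeted by the pathogen, 3 of which are pathway members.
toy_p = hypergeom_enrichment_p(population=20, pathway_size=5, targeted=4, overlap=3)
```

In practice the test is repeated for every pathway, so the resulting p-values would normally be corrected for multiple testing before a pathway is called enriched.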
In the absence of large-scale PHI networks for bacterial, protozoan and fungal systems, Dyer et al. [6] performed the first global analysis of 10 477 protein interactions between human and 190 pathogen strains (viruses, bacteria, and protozoa), based on the properties of the 1233 targeted human proteins. The available PHI data were not diverse: 98.3% of the 10 477 PHIs belonged to virus-human systems, with 77.9% of the interaction data drawn from HIV-human interaction systems. The importance of the pathogen-targeted proteins was evaluated within the intraspecies human PPI network of 75 457 interactions. These PHI and PPI data were integrated from the public databases MINT, IntAct, DIP, HPRD, Reactome, BIND and MIPS. Firstly, targeting hub and bottleneck proteins was concluded to be a global behavior of all pathogens, as reported previously for individual pathogen strains [39, 86–88]. Gene ontology [109] functions enriched in the human proteins targeted by different pathogens revealed common infection mechanisms. Attacking human transcription factors and key proteins that control the cell cycle, regulate apoptosis, and transport genetic material across the nuclear membrane was found to be among the common viral strategies. Despite their scarcity (174 interactions in the dataset), bacterial PHI data allowed the identification of specific human proteins that function in the host immune response (via Toll-like receptors and the I-κB kinase/NF-κB signaling cascade) as targets of bacterial infection strategies [6]. Recently, we performed another study with comprehensive PHI data to explore common and specific infection strategies of viruses and bacteria [93]. A significant amount of bacterial PHI data, constituting 36.5% of all data, was available thanks to Dyer et al. [87]. We analyzed 23 435 interactions between 3419 proteins of viral, bacterial, protozoan and fungal pathogens (257 strains in total) and 5210 human proteins, obtained from PHISTO (www.phisto.org).
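To make the composition of the two datasets easier to compare, the percentages quoted above can be converted into approximate interaction counts. This is a back-of-the-envelope sketch: the inputs are the figures given in the text, the outputs are rounded estimates rather than reported values, and the 77.9% HIV share is assumed to refer to the full set of 10 477 PHIs.

```python
# Figures quoted in the text for the dataset of Dyer et al. [6].
total_phis = 10_477
virus_share = 0.983
hiv_share = 0.779        # assumption: a share of all 10 477 PHIs
bacterial_phis = 174

virus_phis = round(total_phis * virus_share)      # approximate count
hiv_phis = round(total_phis * hiv_share)          # approximate count
non_viral_phis = total_phis - virus_phis          # leaves room for the 174 bacterial PHIs
bacterial_share = 100 * bacterial_phis / total_phis

# Figures quoted in the text for the later study [93].
second_total = 23_435
second_bacterial = round(second_total * 0.365)    # approximate count

# Sum of the three two-hybrid networks reported in [87].
two_hybrid_total = 3073 + 4059 + 1383

print("virus-human PHIs in [6]:", virus_phis)
print("HIV-human PHIs in [6]:", hiv_phis)
print("non-viral PHIs in [6]:", non_viral_phis)
print(f"bacterial share in [6]: {bacterial_share:.2f}%")
print("bacterial PHIs in [93]:", second_bacterial)
print("PHIs in the three networks of [87]:", two_hybrid_total)
```

Under these assumptions, the estimated bacterial count in the later study is close to the combined size of the three two-hybrid networks in [87], which is consistent with the statement that those networks supplied most of the bacterial data.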
To generate the intraspecies human protein network, 194 006 PPIs were integrated from BioGRID, DIP, IntAct, MINT and Reactome. The significant amount of bacterial and viral PHI data allowed us to focus on comparisons between their specific infection mechanisms. Firstly, attacking hub and bottleneck proteins in the human PPI network was verified as a common infection strategy of both bacteria and viruses. Furthermore, viruses were observed to target human proteins of much higher connectivity and centrality than bacteria. Secondly, gene ontology enrichment analysis of the targeted human proteins verified that bacteria and viruses use specific mechanisms to manipulate human immune defense mechanisms and cellular processes, respectively (as reported by Dyer et al. [6], although on the basis of fewer PHI data). A first investigation of the human proteins targeted by both bacteria and viruses revealed that attacking human metabolic processes is a common strategy used by both pathogen types during infection [93]. Global analysis of PHI data provides insights into the strategies adopted by bacteria and viruses to subvert human cellular processes and the immune system during infection. However, large-scale PHI networks for pathogens other than bacteria and viruses are still undetermined, leaving their pathogenesis mechanisms relatively uncharacterized. Research on infectious diseases through PHIs has accelerated within the post-genomic era (Fig. 3). However, large-scale PHI networks have been studied infrequently. Efforts to identify and analyze large-scale PHIs for diverse pathogen types are expected to parallel the acceleration of biotechnology and bioinformatics research. The increasing amount of available data will allow more complete datasets to be compiled, making it possible to characterize the topological properties of PHI networks.
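One topological property that is often examined is the degree distribution. The sketch below counts, for each human protein in a PHI list, how many pathogen proteins target it, and then estimates a power-law exponent from a least-squares line in log-log space. This is a deliberately crude illustration on synthetic data: the interaction list and function names are hypothetical, and maximum-likelihood fitting is generally preferred for real networks.

```python
from collections import Counter
from math import log

def degree_distribution(interactions):
    """Map degree k to the number of human proteins targeted by exactly k pathogen proteins."""
    partners = {}
    for pathogen_protein, human_protein in interactions:
        partners.setdefault(human_protein, set()).add(pathogen_protein)
    return Counter(len(p) for p in partners.values())

def loglog_slope(distribution):
    """Least-squares slope of log(count) against log(degree)."""
    if len(distribution) < 2:
        raise ValueError("need at least two distinct degrees to fit a line")
    xs = [log(k) for k in distribution]
    ys = [log(distribution[k]) for k in distribution]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic PHI list (hypothetical names): 16 human proteins with one pathogen
# partner, 4 with two partners, and 1 with four partners.
interactions = []
for k, count in [(1, 16), (2, 4), (4, 1)]:
    for i in range(count):
        for j in range(k):
            interactions.append((f"P{j}", f"H{k}_{i}"))

distribution = degree_distribution(interactions)
slope = loglog_slope(distribution)
print("degree distribution:", dict(distribution))
print("estimated exponent:", round(-slope, 3))
```

With only a handful of distinct degrees, as in a sparse network, such a fit is poorly constrained, which illustrates why data scarcity limits this kind of analysis.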
The first attempt to fit the degree distribution of the EBV-human interaction network failed because of data scarcity [39]. On the other hand, bioinformatic analyses of the pathogen-targeted human proteins succeeded in unraveling some infection strategies, such as targeting human hubs and bottlenecks, subverting cellular processes to the pathogens' own advantage, and evading immune defenses [6, 39, 85, 87, 88, 93]. The large amount of data expected to be generated for PHI systems will enable us to capture the details of infection processes, potentially leading to the development of new and more efficient therapeutics. Conventional treatments for infectious diseases often aim to kill pathogens by targeting their essential proteins. This approach unfortunately forces the pathogens to evolve for survival and consequently selects for resistant strains (especially in the case of RNA viruses, which have high mutation rates). To fight drug-resistant pathogens, novel alternative therapeutics are emerging that target host proteins required by pathogens to replicate and persist within the host organism. If these host factors are indispensable for pathogens but not essential for host cells, silencing them may inhibit pathogenic activity, allowing them to serve as therapeutic targets [4, 115]. In the light of PHI studies, human factors required by viral and bacterial pathogens have been determined in recent years for HIV [115–119], HCV [120], West Nile virus [121], influenza virus [122, 123], and M. tuberculosis [124]. Despite the efforts reviewed here, the use of systems biology approaches to investigate PHIs is still relatively undeveloped. The availability of new PHI network data, together with further topological and functional analyses of pathogen-host systems, is expected to shed more light on infection mechanisms and novel therapeutic targets for infectious diseases in the near future.

We particularly thank Dr.
Tunahan Çakır for critical reading of the manuscript and for his contributions to Figure 3. Financial support was provided by the Research Funds of Boğaziçi University through project 5554D. The doctoral scholarship for Saliha Durmuş Tekir, sponsored by TÜBİTAK, is gratefully acknowledged.

The authors declare no conflict of interest.

Figure 3. The number of scientific publications including PHI-related terms in PubMed in the post-genomic era. The searched PHI-related terms were: "pathogen host interactions", "host pathogen interactions", "pathogen host interaction", "host pathogen interaction", "pathogen-host interactions", "pathogen-host interaction", "host-pathogen interactions", "host-pathogen interaction".

Infectious diseases: considerations for the 21st century
Host-pathogen systems biology
Filoviruses are ancient and integrated into mammalian genomes
New strategies to fight infectious diseases - arms race on a microscale
Signaling during pathogen infection
The landscape of human proteins interacting with viruses and other pathogens
Antibacterial drug discovery and structure-based design
Nucleotide sequence of bacteriophage ϕX174 DNA
Whole-genome random sequencing and assembly of Haemophilus influenzae Rd
The sequence of the human genome
Protein function in the post-genomic era
A novel genetic system to detect protein-protein interactions
Interaction mating reveals binary and ternary connections between Drosophila cell cycle regulators
Toward a functional analysis of the yeast genome through exhaustive two-hybrid screens
Functional organization of the yeast proteome by systematic analysis of protein complexes
Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry
Global analysis of protein activities using proteome chips
A protein linkage map of Escherichia coli bacteriophage T7
A genomic approach of the hepatitis C virus generates a protein interaction map
Genome-wide analysis of vaccinia virus protein-protein interactions
The protein-protein interaction map of Helicobacter pylori
Interaction network containing conserved and essential protein complexes in Escherichia coli
Large-scale identification of protein-protein interaction of Escherichia coli K-12
Toward a protein-protein interaction map of the budding yeast: A comprehensive system to examine two-hybrid interactions in all possible combinations between the yeast proteins
A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae
Protein interaction mapping in C. elegans using proteins involved in vulval development
The biomolecular interaction network database and related tools: 2005 update
The BioGRID interaction database: 2011 update
The database of interacting proteins: 2004 update
Human protein reference database
The IntAct molecular interaction database in 2012
MINT, the molecular interaction database: 2012 update
MPact: the MIPS protein interaction resource on yeast
Reactome: a database of reactions, pathways and biological processes
The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored
Protein-protein interactions between Hepatitis C virus nonstructural proteins
Herpesviral protein networks and their interaction with the human proteome
Epstein-Barr virus and virus human protein interaction maps
Analysis of intraviral protein-protein interactions of the SARS coronavirus ORFeome
Genome-wide analysis of protein-protein interactions and involvement of viral proteins in SARS-CoV replication
Hepatitis C virus: structure, protein products and processing of the polyprotein precursor
Flaviviridae: The viruses and their replication
The complete DNA sequence of vaccinia virus
Genetically engineered poxviruses for recombinant gene expression, vaccination, and safety
The complete DNA sequence of varicella-zoster virus
Identification of herpesvirus-like DNA sequences in AIDS-associated Kaposi's sarcoma
The genome of Epstein-Barr virus type 2 strain AG876
Emergence of scaling in random networks
Specificity and stability in topology of protein networks
Unique and conserved features of genome and proteome of SARS-coronavirus, an early split-off from the coronavirus group 2 lineage
Towards a proteome-scale map of the human protein-protein interaction network
A human protein-protein interaction network: a resource for annotating the proteome
A first-draft human protein-interaction map
A proteome-wide protein interaction map for Campylobacter jejuni
The binary protein interactome of Treponema pallidum - the syphilis spirochete
A global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv
An expanded protein-protein interaction network in Bacillus subtilis reveals a group of hubs: Exploration by an integrative approach
Proteome organization in a genome-reduced bacterium
Primary structure of the succinyl-CoA synthetase of Escherichia coli
One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products
New partners of acyl carrier protein detected in Escherichia coli by tandem affinity purification
Complete genome sequence of Treponema pallidum, the syphilis spirochete
Complete sequence analysis of the genome of the bacterium Mycoplasma pneumoniae
Epidemiology, strategy, financing
Mycobacterium tuberculosis interactome analysis unravels potential pathways to drug resistance
Uncovering new signaling proteins and potential drug targets through the interactome analysis of Mycobacterium tuberculosis
An expanded view of bacterial DNA replication
DNA polymerase I acts in translesion synthesis mediated by the Y-polymerases in Bacillus subtilis
Cell-cycle-dependent spatial sequestration of the DnaA replication initiator protein in Bacillus subtilis
Chemical genetics of Plasmodium falciparum
A protein interaction network of the malaria parasite Plasmodium falciparum
Inferring genome-wide functional linkages in E. coli by combining improved genome context methods: comparison with high-throughput experimental data
Global functional atlas of Escherichia coli encompassing previously uncharacterized proteins
The battle of two genomes: genetics of bacterial host/pathogen interactions in mice
Structural microbiology at the pathogen-host interface
Mining host-pathogen interactions
Host pathogen protein interactions predicted by comparative modeling
Computational prediction of host-pathogen protein-protein interactions
A data integration approach to predict host-pathogen protein-protein interactions: application to recognize protein interactions between human and a malarial parasite
Ortholog-based protein-protein interaction prediction and its application to interspecies interactions
Prediction of interactions between HIV-1 and human proteins by information integration
Prediction of HIV-1 virus-host protein interactions using virus and host sequence motifs
Structural similarity-based predictions of protein interactions between HIV-1 and Homo sapiens
Hepatitis C virus infection protein network
A physical and regulatory map of host-influenza interactions reveals pathways in H1N1 infection
The human-bacterial pathogen protein interaction networks of Bacillus anthracis, Francisella tularensis, and Yersinia pestis
Insight into bacterial virulence mechanisms against host immune response via the Yersinia pestis-human protein-protein interaction network
PHI-base update: additions to the pathogen host interaction database
VirusMINT: a viral protein interaction database
VirHostNet: a knowledge base for the management and the analysis of proteome-wide virus-host interaction networks
PATRIC: the comprehensive bacterial bioinformatics resource with a focus on human pathogenic species
Infection strategies of bacterial and viral pathogens through pathogen-human protein-protein interactions
Document classification for mining host pathogen protein-protein interactions
Text mining for discovery of host-pathogen-interactions
Literature mining of host-pathogen interactions: comparing feature-based supervised learning and language-based approaches
Global landscape of HIV-human protein complexes
Analysis of the human protein interactome and comparison with yeast, worm and fly interaction datasets
Gene annotation and pathway mapping in KEGG
The multifunctional NS1 protein of influenza A viruses
Connecting viral with cellular interactomes
Gay compromise syndrome
How does HIV cause AIDS?
HIV-1 Nef impairs MHC class II antigen presentation and surface expression
HIV-1 TAR miRNA protects against apoptosis by altering cellular gene expression
Anchorage of HIV on permissive cells leads to coaggregation of viral particles with surface nucleolin at membrane raft microdomains
Human immunodeficiency virus type 1 Vpr interacts with antiapoptotic mitochondrial protein HAX-1
HIV-1 envelope triggers polyclonal Ig class switch recombination through a CD40-independent mechanism involving BAFF and C-type lectin receptors
Gene Ontology: tool for the unification of biology
Francisella tularensis induces cytopathogenicity and apoptosis in murine macrophages via a mechanism that requires intracellular bacterial multiplication
Macrophage apoptosis by anthrax lethal factor through p38 MAP kinase inhibition
Inhibition of MAPK and NF-κB pathways is necessary for rapid apoptosis in macrophages infected with Yersinia
Graemlin: General and robust alignment of multiple large interaction networks
Conserved patterns of protein interaction in multiple species
Network-based prediction and analysis of HIV dependency factors
Identification of host proteins required for HIV infection through a functional genomic screen
Global analysis of host-pathogen interactions that regulate early-stage HIV-1 replication
Genome-scale RNAi screen for host factors required for HIV replication
Host cell factors in HIV replication: meta-analysis of genome-wide studies
A genome-wide genetic screen for host factors required for hepatitis C virus propagation
RNA interference screen for human genes associated with West Nile virus infection
Genome-wide RNAi screen identifies human host factors crucial for influenza virus replication
Human host factors required for influenza virus replication
Genome-wide analysis of the host intracellular network that regulates survival of Mycobacterium tuberculosis