key: cord-0969190-rzc2juqo authors: Saha, Sovan; Kumar Halder, Anup; Sekhar Bandyopadhyay, Soumyendu; Chatterjee, Piyali; Nasipuri, Mita; Basu, Subhadip title: Computational Modeling of Human-nCoV Protein-Protein Interaction Network date: 2021-12-10 journal: Methods DOI: 10.1016/j.ymeth.2021.12.003 sha: ab33443736f19ef7aee3949bc6d76f614461e5d5 doc_id: 969190 cord_uid: rzc2juqo Novel coronavirus (SARS-CoV2) replicates the host cell's genome by interacting with the host proteins. Due to this fact, the identification of virus and host protein-protein interactions could be beneficial in understanding the disease transmission behavior of the virus as well as in potential COVID-19 drug identification. International Committee on Taxonomy of Viruses (ICTV) has declared that nCoV is highly genetically similar to the SARS-CoV epidemic in 2003 (∼ 89% similarity). With this hypothesis, the present work focuses on developing a computational model for the nCoV-Human protein interaction network, using the experimentally validated SARS-CoV-Human protein interactions. Initially, level-1 and level-2 human spreader proteins are identified in the SARS-CoV-Human interaction network, using Susceptible-Infected-Susceptible (SIS) model. These proteins are considered potential human targets for nCoV bait proteins. A gene-ontology-based fuzzy affinity function has been used to construct the nCoV-Human protein interaction network at a ∼99.98% specificity threshold. This also identifies 37 level-1 human spreaders for COVID-19 in the human protein-interaction network. 2474 level-2 human spreaders are subsequently identified using the SIS model. The derived host-pathogen interaction network is finally validated using six potential FDA-listed drugs for COVID-19 with significant overlap between the known drug target proteins and the identified spreader proteins. COVID-19 evolved in the Chinese city of Wuhan (Hubei province) [1] . 
The first human case of nCoV infection was observed on 31st December 2019 [2]. The virus soon spread to almost all nations within a brief period [3]. The World Health Organization (WHO) observed that the massive outbreak of nCoV was mainly due to community spreading and declared a global health emergency on 30th January 2020 [4]. After assessment, WHO estimated its fatality rate to be 4% [5], which urged researchers worldwide to work together to discover an appropriate treatment for this pandemic [6, 7]. Coronaviruses belong to the family Coronaviridae. They infect birds and mammals as well as human beings. Though the common symptoms of coronavirus infection are common cold, cough, etc., it can be accompanied by severe acute or chronic respiratory disease and multiple organ failure leading to death. Severe Acute Respiratory Syndrome (SARS) and Middle East Respiratory Syndrome (MERS) were the two major outbreaks, in 2003 and 2012, before SARS-CoV2. SARS originated in Southern China. Its fatality rate was within 14%-15% [8], and 774 people lost their lives among 8804 affected cases. MERS began in Saudi Arabia, where 858 persons among 2494 infected cases died. It thus had a much higher fatality rate of 34.4% [9] than SARS. All three epidemic-causing viruses, SARS, MERS, and SARS-CoV2, belong to the genus Betacoronavirus in the family Coronaviridae. SARS-CoV2 comprises both structural and non-structural proteins. Of the two, structural proteins such as the envelope (E) protein, membrane (M) protein, nucleocapsid (N) protein, and the spike (S) protein play a significant role in transmitting the disease by binding with the receptors after entering the human body [10].
So, there is an urgent need to understand and analyze the mechanism of disease transmission of this new virus. In this research work, Protein-Protein Interaction Networks (PPIN) are the most significant attribute to study the disease propagation mechanism from SARS-CoV2 to humans. It plays a crucial role in identifying essential proteins [11] [12] [13] [14] [15] responsible for various diseases. They are also significant in identifying protein functions [16] [17] [18] [19] . According to Lotem et al. [20] , though human PPIN is constantly expanding, very little information is available about the human PPIN, which gets generated in disease conditions. With the enhancement in the availability of the human PPIN data, the primary focus of research has been shifted from the basic understanding of the PPIN to the study of the PPIN underlying various kind of human diseases [21] . According to the work of Ideker et al. [22] , PPIN study mainly falls under four categories: 1) Identification of human disease genes based on network analysis, 2) Implication of additional genes involved in the disease by using protein networks, 3) Identification of protein subnetworks involved in diseases and 4) Classification of case-control studies based on protein PPIN. Host-pathogen PPINs are significant for understanding the mechanism of transmission of infection, which is essential for developing new and more effective therapeutics, including rational drug design. Progression of infection and disease results due to the interaction of proteins in between pathogen and host. Pathogen plays an active role in spreading infection. Pathogen and host PPIN permit pathogenic microorganisms to utilize host capabilities by manipulating the host mechanisms to abscond from the host's immune responses [23] [24] [25] . Detection of target proteins through the analysis of pathogen and host PPIN is the central point of research [13, 26, 27] . 
Topologically significant proteins with a high degree of interactions are generally found to be important drug targets. However, proteins with fewer interactions, or that are not topologically prominent, may still be involved in the mechanism of infection because of their relevance to some biological pathway. Moreover, clinically validated Human-nCoV protein interaction data is limited in the current literature. This has motivated us to develop a new computational model for the nCoV-Human PPI network. We have subsequently validated the proteins involved in the host-pathogen interactions with respect to potential Food and Drug Administration (FDA) drugs for COVID-19 treatment. Key aspects of this research work are highlighted below: It has been reported that SARS-CoV has ∼ 89% [28, 29] genetic similarity with nCoV. The SARS-CoV-Human protein-protein interaction network has also been studied widely and is available in the literature [30] [31] [32]. Recently, we developed a computational model to identify potential spreader proteins in a Human-SARS-CoV interaction network using the SIS model [14]. In addition, sequence information of 29 nCoV proteins has been released [33]. Gene ontological (GO) information (Biological Process (BP), Molecular Function (MF), Cellular Component (CC)) is available for 14 of the nCoV proteins [33, 34]. We have recently developed a method to predict interaction affinity between proteins from the available GO graph [35]. The interaction affinity between nCoV proteins and potential human target/bait proteins, which are susceptible to SARS-CoV infection, has been assessed. Fuzzy affinity thresholding is applied to detect the high-quality nCoV-Human PPIN. The selected human proteins are considered level-1 human spreader nodes of nCoV. Level-2 spreader nodes in the nCoV-Human PPIN are detected using the spreadability index and validated by the SIS model [14, 36].
Our developed model is validated for the target proteins of the potential FDA drugs for COVID-19 treatment [37] . All the related terminologies referred in this work like Ego Network [38] (please see Fig S1 in supplementary document),spreadability index [14] , Node weight [39] , Edge ratio [38] , Neighborhood density [38] , Betweenness Centrality (BC) [40] , Closeness centrality (CC) [41] , Degree centrality (DC) [42] and Local average centrality (LAC) [43] are discussed in the supplementary document. Our developed computational model for nCoV-Human PPIN consists of two crucial methodologies 1) identification of spreader nodes by spreadability index along with the validation of SIS model and 2) Fuzzy PPI model. First, the SIS model identifies spreader nodes of SARS-CoV proteins (candidate set of nCoV interactors). Then, the Fuzzy PPI model is applied to extract the nCoV-Human interactions, and finally, nCoV spreaders are identified using the SIS model. In nCoV-Human PPIN, the former acts as a pathogen/bait while the host, the human, acts as 'Prey'. The transmission of infection starts when a pathogen enters a host body and infects its protein, affecting its directly or indirectly connected neighbor proteins. Considering this method of transmission, PPIN of humans and SARS-CoV are considered to detect spreader nodes. Spreader nodes are those nodes/proteins which transmit the disease fast among their neighbors. But not all the nodes in a PPIN are spreaders. So, proper detection of spreader nodes is crucial. Thus, spreader nodes are identified by the spreadability index, which measures the transmission capability of a node/protein. Furthermore, the compactness of PPIN and its transferal capability are evaluated using centrality analysis. Nodes with high centrality values are usually considered spreader nodes or the most critical node in a network. 
The spreadability index [14] is a centrality-based measure that combines three major topological neighborhood-based features of a network: 1) node weight [39], 2) edge ratio [38], and 3) neighborhood density [38]. Nodes having a high spreadability index are considered spreader nodes. The spreader nodes thus identified are also validated by the SIS model [36]. The SIS model is used to cast the SARS-CoV and SARS-CoV2 outbreaks as a disease model in which proteins are described by their current infection status. A protein can be in one of two states, and it cycles from S to I and back to S: 1) S: Susceptible, which means that every protein is initially not yet infected but at risk of being infected by the disease; 2) I: Infected, which means that the disease has already infected the protein; and 3) a return to S: Susceptible, which means that a protein becomes susceptible again after recovering from the infected state. This model is used to compute the overall infection capability of a node after a certain number of iterations. The sum of the infection capabilities of the top selected spreader nodes computed by this model is then compared against the sum obtained for the top critical nodes selected by other existing centrality measures such as Betweenness Centrality (BC) [40], Closeness Centrality (CC) [41], Degree Centrality (DC) [42], and Local Average Centrality (LAC) [43] (please see the supplementary document for more details). Our proposed method for selecting spreader nodes in SARS-CoV PPIN [14] has performed better than these existing state-of-the-art measures (BC [40], CC [41], DC [42], and LAC [43]). The comparison and results are included in our recently published work [14] (please see Fig. S2 and Table S1 - Table S5 in the supplementary document for more details). The complete source code is available online here.
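As a rough illustration of the ideas in this section, the sketch below counts the edges leaving a node's ego network and the edges among its neighbors (the two quantities behind the edge ratio), and runs a simple discrete-time SIS simulation that records how often each node is infected. It is not the authors' released code: the function names, the toy graph, the infection probability `beta`, the recovery probability `gamma`, and the choice to exclude edges touching the ego node from the neighbor-interconnection count are all assumptions made for this example.

```python
import random

def ego_edge_counts(adj, v):
    """Return (e_out, e_in) for node v in an undirected graph.

    e_in  : edges among the neighbors of v (assumption: edges touching v are excluded).
    e_out : edges from the ego network (v plus its neighbors) to nodes outside it.
    """
    neighbors = adj[v]
    ego = neighbors | {v}
    e_in = sum(1 for a in neighbors for b in adj[a] if b in neighbors and a < b)
    e_out = sum(1 for a in neighbors for b in adj[a] if b not in ego)
    return e_out, e_in

def sis_infection_counts(adj, seeds, beta, gamma, steps, rng):
    """Discrete-time SIS: count in how many rounds each node is infected.

    beta  : probability that an infected node infects a susceptible neighbor.
    gamma : probability that an infected node recovers and becomes susceptible again.
    """
    infected = set(seeds)
    counts = {node: 0 for node in adj}
    for _ in range(steps):
        newly_infected = set()
        for node in sorted(infected):
            for nb in sorted(adj[node]):
                if nb not in infected and rng.random() < beta:
                    newly_infected.add(nb)
        recovered = {node for node in sorted(infected) if rng.random() < gamma}
        infected = (infected - recovered) | newly_infected
        for node in infected:
            counts[node] += 1
    return counts

# Hypothetical toy network (not the synthetic PPIN of Fig. 1).
TOY_ADJ = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4, 5}, 4: {3, 6}, 5: {3}, 6: {4}}

if __name__ == "__main__":
    print(ego_edge_counts(TOY_ADJ, 3))
    print(sis_infection_counts(TOY_ADJ, {3}, 0.3, 0.1, 50, random.Random(0)))
```

Summing the returned counts over a set of top-ranked seed nodes gives one possible proxy for the "infection capability" that is compared across ranking methods; the exact quantity used in [14] may be defined differently.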
A synthetic PPIN is considered in Fig. 1 to demonstrate the entire methodology of the spreadability index (see Table 1). In addition, a comparison of the spreadability index of our proposed model with one of the other methods, LAC (computed using CytoNCA [44]), is given in Table 2. Two ego-network quantities are used here: the total number of edges outgoing from the ego network of a node [14] (for details please see the supplementary document), and the total number of interconnections among the neighbors of that node [38]. For node 3, the former is 6 while the latter is 3, which highlights that node 3 has the highest transmission ability from its ego network to the outside when compared to other nodes. Node 3 also has the highest spreadability index, but LAC fails to rank node 3 in the first position. The same can be observed for some other nodes in the synthetic network. In addition, the SIS validation result shows that the top-ranked spreader nodes selected by the proposed model have the highest infection capability compared to the other ranked nodes. The binding affinity between any two interacting proteins can be estimated by combining the semantic similarity scores of the GO terms associated with the proteins [26, 35, 45, 46, 47]. A greater number of semantically similar GO annotations between any protein pair indicates higher interaction affinity. The fuzzy PPI model is a hybrid approach [35] that utilizes both the topological [48] features of the GO graph and the information content [47, 49, 50] of the GO terms. GO is organized in three independent directed acyclic graphs (DAGs): molecular function (MF), biological process (BP), and cellular component (CC) [34]. The nodes in each GO graph represent GO terms, and the edges represent different hierarchical relationships. In this work, the two most essential relations, 'is_a' and 'part_of', have been used for GO relations [51].
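The GO graph structure described above can be illustrated with a small sketch that keeps only 'is_a' and 'part_of' edges, collects the ancestors of a term, and computes a shortest path length between two terms. The term identifiers, the toy DAG, the function names, and the decision to treat edges as undirected when measuring path length are assumptions for illustration only; they are not taken from the GO Consortium files or from [35].

```python
from collections import deque

ALLOWED_RELATIONS = {"is_a", "part_of"}

# Hypothetical toy ontology: child -> list of (relation, parent).
TOY_PARENTS = {
    "T:root": [],
    "T:a": [("is_a", "T:root")],
    "T:b": [("part_of", "T:root")],
    "T:c": [("is_a", "T:a"), ("part_of", "T:b")],
    "T:d": [("is_a", "T:c"), ("regulates", "T:b")],  # 'regulates' is ignored
}

def ancestors(term, parents):
    """All terms reachable from `term` by following allowed relations upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for relation, parent in parents.get(current, []):
            if relation in ALLOWED_RELATIONS and parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def shortest_path_length(source, target, parents):
    """Breadth-first shortest path over allowed edges, treated as undirected."""
    adjacency = {}
    for child, links in parents.items():
        adjacency.setdefault(child, set())
        for relation, parent in links:
            if relation in ALLOWED_RELATIONS:
                adjacency[child].add(parent)
                adjacency.setdefault(parent, set()).add(child)
    queue = deque([(source, 0)])
    visited = {source}
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nb in adjacency.get(node, ()):
            if nb not in visited:
                visited.add(nb)
                queue.append((nb, dist + 1))
    return None  # the two terms are not connected

if __name__ == "__main__":
    print(sorted(ancestors("T:d", TOY_PARENTS)))
    print(shortest_path_length("T:d", "T:root", TOY_PARENTS))
```

Because the 'regulates' edge is filtered out, the distance from `T:d` to `T:b` goes through `T:c`, which shows how the choice of relations changes the topology seen by a path-based similarity measure.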
The semantic similarity between any two proteins is estimated by considering the similarities between all pairs of their annotating gene ontology (GO) terms belonging to a particular ontological graph. The similarity of a GO term pair is determined by considering specific topological properties (shortest path length) of the GO graph and the average information content (IC) [52] of the disjunctive common ancestors (DCA) [45, 46] of the GO terms, as proposed in [35]. Fuzzy PPI first relies on a fuzzy clustering of the GO graph, where the selection of a GO term as a cluster center is based on the level of association of that GO term in the GO graph. The cluster centers are selected based on the proportion measure of GO terms. The proportion measure for any GO term (equation 1) is computed from the ascendants and descendants of that term and the total number of GO terms in the ontology. A higher value of the proportion measure signifies higher coverage of ascendants and descendants associated with the specific node. Finally, the GO terms for which this proportion measure is above a predefined threshold are selected as cluster centers. In this work, the cluster centers are chosen based on the threshold values suggested in [26, 35]. After selecting the cluster centers, the degree of membership of a GO term to each of the selected cluster centers is calculated using its shortest path length to the corresponding cluster center. The membership of the GO term to a cluster decreases with an increase in its shortest path length to the cluster center. The membership function (equation 2) is parameterized by its center and its width, and takes as input the shortest path length from the GO term to the cluster center. The difference in membership values between the two GO terms of a pair, for each cluster center, is computed to find the weight parameter (equation 3). This weight value determines how different two GO terms can be with respect to the cluster centers.
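For orientation, one possible reading of the proportion measure, the membership function, and the weight parameter is written out below. These forms are assumptions that are merely consistent with the verbal definitions in this section (a coverage ratio, a membership that decays with shortest path length, and a weight driven by membership differences); the exact expressions used in [35] may differ. The symbols $An(t)$, $Dn(t)$, $N_O$, $c$, $k$, $x$, and $\mathcal{C}$ are introduced here for illustration.

```latex
% Assumed form of the proportion measure of a GO term t in ontology O:
% An(t), Dn(t) are the ascendants and descendants of t, N_O the number of terms in O.
PM(t) = \frac{|An(t)| + |Dn(t)|}{N_O}

% Assumed Gaussian membership function with center c and width k,
% where x is the shortest path length from the GO term to the cluster center.
\mu(x) = \exp\!\left(-\frac{(x - c)^2}{2k^2}\right)

% Assumed weight for a GO pair (t_i, t_j): one minus the largest membership
% difference over the set of cluster centers \mathcal{C}.
W(t_i, t_j) = 1 - \max_{q \in \mathcal{C}} \left| \mu_q(t_i) - \mu_q(t_j) \right|
```

Under this reading, two GO terms that sit at similar distances from every cluster center receive a weight close to 1, while terms that belong strongly to different clusters receive a lower weight.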
Next, the shared information content (SIC) is computed using the average information content (IC) [52] of the DCA of the GO term pair for all three GO graphs. The SIC (equation 4) is defined as the average IC over the disjunctive common ancestors of the two GO terms. The semantic similarity (SS) between the two GO terms of a pair is then defined as in [35] (equation 5). The semantic similarity of a protein pair for each GO type (CC, MF, and BP) is estimated as the maximum similarity over all possible GO pairs drawn from the annotations of the two proteins for that type of GO. The interaction affinity of a protein pair is defined as the average of the CC, MF, and BP-based semantic similarities. This work uses the available ontological information to calculate the fuzzy interaction affinity score between the protein pairs of SARS-CoV2 and spreader human proteins (please see Fig. 3). Here, the level-1 and level-2 human spreader proteins of SARS-CoV are employed as the primary targets of the proposed fuzzy PPI model for interaction affinity computation. A bipartite relation of GO pairs is first generated from each pair of proteins for each type of GO annotation (CC, MF, and BP) independently (Fig. 3A). To reduce the computational overhead and time, semantic similarity scores are precomputed between all GO pairs belonging to a particular GO type using equation 5 [35]. The semantic similarity is computed by exploring the topological properties of the GO subgraphs. For each type of GO subgraph, a different set of cluster center nodes (GO terms) is identified based on the proportion measure (equation 1), which relies on the annotation score and the GO relationship graph hierarchy. The GO semantic similarity is estimated with a distance-based measure between the target GO pair by exploring the membership scores (equations 2 and 3) and the values of equation 4 with respect to the cluster centers of each GO subgraph (Fig. 3B).
For each GO type, the max of all possible scores of the bipartite links in a particular GO subgraph is considered the final semantic score of that type of GO. Similarly, all three different scores are evaluated and averaged to find the interaction affinity for the annotated protein pair. Then, the fuzzy score of interaction affinity is computed by normalizing the interaction affinity using max-min normalization. Finally, with high specificity threshold (please see Fig. 6 ), high-quality interactions (78 interactions involving 37 human level-1 spreaders) are extracted for human-SARSCoV2. SARS-CoV-Human PPIN serves as a baseline for our model. The potential level-1 and level-2 human spreaders of SARS-CoV become the possible candidate set for selecting level-1 human spreaders of SARS-CoV2. Various datasets have been curated for this purpose which has been outlined below: The dataset [53, 54] consists of all possible interactions between human proteins experimentally documented in humans. Human proteins are represented as nodes, while edges represent the physical interactions between proteins. It is a collection of 21557 nodes and includes 342353 edges/interactions. The dataset [30] consists of interactions between SARS-CoV proteins. It contains 7 unique proteins and the involvement of 17 interacting edges. Only the densely connected proteins are considered rather than the isolated ones since the former play a more active role in the transmission of infection than the latter. The dataset [30] comprises 118 interactions between SARS-CoV and humans. It is used to fetch the level-1 human interactions of SARS-CoV. This data is collected from the pre-released dataset of available SARS-CoV2 protein from UniProtKB [33] [55] , which includes 14 reviewed SARS-CoV2 proteins. GO graph types (CC, MF, and BP) are collected from GO Consortium [34, 51] . In addition, the protein to GOannotation map is retrieved from the UniProtKB database. 
Six potential FDA drugs: Lopinavir [56] , Ritonavir [57] , Azithromycin [58] , Remdesivir [59] [60] [61] , Favipiravir [62, 63] , and Darunavir [64] have been identified from the DrugBank [65] published white paper [37] which have been used for validation in our proposed model. Our developed computational model of nCoV-Human PPIN contains high-quality interactions (HQI) and proteins identified by Fuzzy affinity thresholding and spreadability index validated by the SIS model. The sources of input and the generated results always play a crucial role in any computational model, which is also true for our proposed model. SARS-CoV-Human PPIN (up to level-2) is formed by the combination of SARS-CoV-Human and Human-Human PPIN datasets. SARS-CoV-Human dataset generates the direct level-1 human interactions of SARS-CoV, while the human-human PPIN dataset is used to fetch the corresponding level-2 human interactions. Potential spreader nodes are identified using the spreadability index validated by the SIS model [14] . The entire process of the detection of spreader nodes in SARS-CoV-Human PPIN is depicted in four steps in Fig. 2 (used only for description): 1) Spreader nodes (6 spreaders) in SARS-CoV PPIN are detected by spreadability index. 2) Corresponding level-1 human proteins of the spreader nodes in SARS-CoV PPIN are identified. 3) Spreader nodes (24 spreaders) in level-1 human proteins of the spreader nodes in SARS-CoV PPIN are detected. 4) The same process is repeated, and spreader nodes (9 spreaders) in level-2 human proteins of the spreader nodes in SARS-CoV PPIN are identified. The selected spreader nodes in SARS-COV-Human PPIN are highlighted in additional Table A1, Table A2 , and Table A3 . 
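The level-wise expansion described above (viral spreaders, then their direct human interactors, then the human neighbors of those interactors) can be sketched as a simple one-hop lookup over two edge lists. The protein identifiers and the function name are hypothetical, and the spreadability-index filtering applied at each level in the actual pipeline is omitted here.

```python
def level1_level2(virus_host_edges, human_edges, viral_spreaders):
    """Return (level1, level2) human protein sets for the given viral spreaders.

    level1: human proteins directly linked to a viral spreader.
    level2: human neighbors of level-1 proteins that are not themselves level-1.
    """
    level1 = {host for virus, host in virus_host_edges if virus in viral_spreaders}

    # Build an undirected adjacency map for the human-human PPIN.
    adjacency = {}
    for a, b in human_edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    level2 = set()
    for protein in level1:
        level2 |= adjacency.get(protein, set())
    level2 -= level1
    return level1, level2

# Hypothetical identifiers, for illustration only.
VIRUS_HOST = [("V1", "H1"), ("V1", "H2"), ("V2", "H3")]
HUMAN_HUMAN = [("H1", "H4"), ("H2", "H4"), ("H2", "H1"), ("H3", "H5"), ("H4", "H6")]

if __name__ == "__main__":
    print(level1_level2(VIRUS_HOST, HUMAN_HUMAN, {"V1"}))
```

In the workflow described in the text, a ranking step would sit between the two hops, so that only level-1 proteins selected as spreaders are expanded to level 2.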
The network view of SARS-CoV-Human PPIN at each level and various selected thresholds of spreadability index are also available online (SARS-CoV human spreaders link: L-1, human spreaders at the high threshold of spreadability index link: L1 & L2:high, and human spreaders at low threshold link: L1 & L2:low). Red-coloured nodes represent SARS-CoV proteins, while blue-colored nodes are the selected spreader nodes in it. Deep green colored nodes represent level-1 human connected proteins with SARS-CoV proteins, while yellow-coloured nodes represent the selected human spreaders. Light green colored nodes represent level-2 human spreaders of SARS-CoV. The fuzzy PPI model finds the interaction affinity between the SARS-CoV2 and Human proteins (L1 and L2 spreader of SARS-CoV) using ontological gene information. All GO pair-wise interaction affinities are assessed from three independent GO-relationship graphs CC, MF, and BP. The fuzzy interaction affinity of a protein pair is computed from all three pair-wise scores of all GO-pair affinities. C) Heatmap representation of Fuzzy PPI score. D) Network representation of Human and SARS-CoV2 proteins with 0.2 onward thresholds of Fuzzy PPI score at high specificity. Finally, high-quality interactions are extracted to retrieve the potential human prey for SARS-CoV2 at the 0.4 threshold. The GO information can be helpful to infer the binding affinity of any pair of interacting proteins using three different types of GO hierarchical relationship graphs (CC, MF, and BP) [34] . The fuzzy PPI model has been applied to find the interaction affinity between the SARS-CoV2 and Human proteins using GO-based information (please see Fig. 3 and section 2.2 for details). To identify the interactors of SARS-CoV2 on humans using the Fuzzy PPI model, a set of candidate proteins are selected, which are identified as the L1 and L2 spreader nodes of SARS-CoV using the SIS model (as depicted in Fig. 2) . 
The fuzzy PPI model is constructed from the ontological relationship graphs by evaluating the affinity between all possible GO pairs annotated from any target protein pair. Finally, the fuzzy score of interaction affinity of a protein pair is computed from these GO pair-wise interaction affinities and mapped into the range [0,1]. We have used experimentally validated human protein interactions (physical only) from publicly available interaction databases, such as HIPPIE [66], STRING [67], BioGRID [68], DIP [69], and HuRI [70] for positive data, and Negatome 2.0 [71] and Trabuco et al. [72] for negative data. The positive interactions are also filtered by removing the edges that are common to both the positive and negative interaction sets. For each database, gold standard data is curated by using the scoring scheme provided by the respective database. The selection criteria are described in Table S6 in the supplementary document. With this benchmarking data set, the Fuzzy PPI model has been assessed with different fuzzy scoring cut-off values. The performance of this assessment is reported in Table S7 in the supplementary document. In any classification task, specificity signifies the ability to identify a negative sample correctly, so a high specificity corresponds to few false positives. In order to identify high-quality positive interactions, we used the specificity metric. With the increasing value of specificity, the number of false-positive (FP) interactions has shown a sharp fall as depicted in the following table. At thresholds ≥0.2 and ≥0.4, the FP is 0.0048% and 0.0001% of total negative interactions, respectively. Thus, the specificity threshold is set at ≥0.4. The heatmap representation of fuzzy interaction affinities (with a score ≥ 0.2, for very high specificity ∼ 99%) is shown in additional Fig. A1 and Table A4. The high-quality interactions (HQI) are retrieved at threshold 0.4 (∼ 99.98% specificity), which results in a total of 78 interactions between SARS-CoV2 and humans (37 human level-1 spreaders).
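The scoring and thresholding steps can be sketched as follows: take the maximum GO-pair similarity per ontology, average over CC, MF, and BP, rescale all pair scores to [0,1] by min-max normalization, and measure specificity on a set of negative pairs at a chosen cut-off. The GO-pair similarity table, the term names, and all scores below are made-up placeholders; in the actual model the GO-pair similarities come from the fuzzy semantic similarity of [35].

```python
GO_TYPES = ("CC", "MF", "BP")

def protein_affinity(go_a, go_b, term_sim):
    """Average over CC, MF, BP of the best GO-pair similarity between two proteins.

    go_a, go_b : dict mapping GO type -> list of annotated GO terms.
    term_sim   : dict mapping frozenset({term1, term2}) -> similarity in [0, 1].
    Missing pairs or missing annotations contribute 0.0 (an assumption of this sketch).
    """
    per_type = []
    for go_type in GO_TYPES:
        scores = [
            term_sim.get(frozenset((s, t)), 0.0)
            for s in go_a.get(go_type, [])
            for t in go_b.get(go_type, [])
        ]
        per_type.append(max(scores, default=0.0))
    return sum(per_type) / len(GO_TYPES)

def min_max_normalize(scores):
    """Rescale a dict of raw affinities to [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 0.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}

def specificity(negative_scores, threshold):
    """True-negative rate: fraction of known negatives scored below the threshold."""
    false_pos = sum(1 for s in negative_scores if s >= threshold)
    return (len(negative_scores) - false_pos) / len(negative_scores)

# Placeholder data, for illustration only.
TERM_SIM = {
    frozenset(("c1", "c2")): 0.8,
    frozenset(("m1", "m2")): 0.5,
    frozenset(("b1", "b2")): 0.2,
    frozenset(("b1", "b3")): 0.6,
}
GO_VIRAL = {"CC": ["c1"], "MF": ["m1"], "BP": ["b1"]}
GO_HUMAN = {"CC": ["c2"], "MF": ["m2"], "BP": ["b2", "b3"]}

if __name__ == "__main__":
    print(protein_affinity(GO_VIRAL, GO_HUMAN, TERM_SIM))
    print(specificity([0.05, 0.1, 0.3, 0.45], 0.4))
```

Raising the threshold can only remove false positives from the negative set, so specificity is non-decreasing in the cut-off, which is the trade-off exploited when a high cut-off such as 0.4 is chosen.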
The interaction networks predicted from the Fuzzy-PPI model are shown in Fig. 4. Human proteins present in the high-quality interactions of the nCoV-Human PPIN, fetched by applying the fuzzy affinity threshold, are considered level-1 spreaders. From these 37 level-1 spreaders, the corresponding level-2 human interactions are obtained using the human-human PPIN dataset. The spreadability index is then computed for these level-2 human proteins to identify level-2 human spreader nodes. The SIS model also verifies the selection. The selected spreader nodes in the SARS-CoV2-Human PPIN (2474 level-2 human spreaders under the high threshold) are highlighted in additional Table A4, Table A5, and Table A6. In addition, the computational model of nCoV-Human PPIN under a high threshold has been highlighted online here. It highlights the human level-1 (marked in yellow) and level-2 spreader nodes (marked in green). The network view of the SARS-CoV2-Human PPIN at each level and at various selected thresholds is also available online (SARS-CoV2 Level-1 human spreaders, Level-1 & Level-2:high spreaders at the high threshold of spreadability index and Level-1 & Level-2:low human spreaders at a low threshold of spreadability index). After assessment of all potential drugs mentioned in the DrugBank [65] white paper [37], six drugs, Lopinavir [56], Ritonavir [57], Azithromycin [58], Remdesivir [59] [60] [61], Favipiravir [62, 63], and Darunavir [64], are identified as showing expected results to some extent in the clinical trials conducted for SARS-CoV2. All approved human protein targets for each of the six drugs are fetched from the advanced search section [73] of DrugBank [65, 74]. When searched in our proposed model of nCoV-Human PPIN, these targets are found to play an active role as spreader nodes.
This reveals that the selected spreader nodes are of biological importance in transmitting infection in a network, which makes them the protein drug targets of the potential FDA drugs for COVID-19. The target protein hits in our nCoV-Human PPIN for each of the six potential FDA drugs are highlighted in Fig. 5. It can be observed that there are 3 target protein hits for Ritonavir, 2 target protein hits for each of Lopinavir, Darunavir, and Azithromycin, and 1 target protein hit for each of Remdesivir and Favipiravir. Out of these protein targets, ACE2 is the most important one since it is considered one of the crucial human receptors through which nCoV transmits infection deep inside the human cell [75] [76] [77]. Based on this validation, further research, including a drug repurposing study, a docking study, and a COVID-19 symptoms-based analysis, is conducted in our next research work [78], which helps us to identify a possible potential drug for COVID-19 named Fostamatinib [79] [80] [81]. Clinical studies involving Fostamatinib are also in progress [82, 83]. Though this research is at an initial stage, it supports our findings to some extent. In any host-pathogen interaction network, the identification of spreader nodes is crucial for disease prognosis. However, not every protein in an interaction network has an intense disease-spreading capability. In this work, we have used the SARS-CoV-Human PPIN and identified the spreader nodes at both level-1 and level-2 using the SIS model. These spreader nodes are considered for computing the protein interaction affinity score to unmask the level-1 human spreaders of nCoV. In addition, GO annotations have also been considered along with PPIN properties to make this model more effective and significant. With the gradual progress of the work, it has been observed that the selected human spreader nodes, identified by our proposed model, emerge as the potential protein targets of the FDA-approved drugs for COVID-19.
The primary hypotheses of the work may be listed as follows: 1) There is a genetic overlap of ∼ 89% [84] between SARS-CoV and SARS-CoV2, which also leads to a significant overlap in spreader proteins between the human-SARS-CoV and human-SARS-CoV2 protein-interaction networks. 2) The Fuzzy PPI approach can assess protein interaction affinities at very high specificity with respect to benchmark datasets, as shown in Fig. 6. High specificity signifies a very low false-positive rate at a given threshold. Thus, at a 0.4 threshold (∼ 99.9% specificity), the proposed model evaluates high-quality positive interactions in the Human-nCoV PPIN. Finally, we propose that the developed computational model effectively identifies Human-nCoV PPIs with high specificity. The nCoV-Human interactions are inferred from another pandemic initiator, SARS-CoV, which is highly genetically similar to nCoV. We also assess the spreadability index of the human spreader proteins (up to level-2), validated through the SIS model. Due to high network density in human interaction networks, the number of proteins increases with the transition from one level to another. So, our proposed model can also identify human spreader proteins in level-2 by using the spreadability index validated by the SIS model. Our proposed method has identified ACE2 and TMPRSS2 as interactors of SARS-CoV2 proteins, which are essential for entry into the human host. SARS-CoV2 interacts with the SARS-CoV entry receptor ACE2, as SARS-CoV2 preserves those amino acid residues of SARS-CoV that are essential for ACE2 binding [85]. However, the binding strength of SARS-CoV2 with ACE2 is 10 to 20 times greater than that of the SARS-CoV-ACE2 attachment [86]. This is because several changes occur in the receptor-binding domains (RBDs) of the SARS-CoV2 spike protein [87]. In addition, the cellular serine protease TMPRSS2 primes SARS-CoV2 for host entry, and a serine protease inhibitor blocks SARS-CoV2 infection of lung cells [85, 88].
Thus, TMPRSS2 activity is essential for viral spread and pathogenesis in the infected host [85, 89]. In a recent study [90], Gordon et al. identified 332 high-confidence SARS-CoV2-human protein-protein interactions, working from the sequence analysis of SARS-CoV2 isolates. They cloned, tagged, and expressed 26 of the 29 SARS-CoV2 proteins in human cells and identified the human proteins that were physically associated with each using affinity-purification mass spectrometry (AP-MS). However, while comparing their seminal work with ours, we found that the SARS-CoV2 protein sequences used by Gordon et al. do not map directly to the available UniProt accession ids. In our case, we have worked only on the UniProt-listed SARS-CoV2 proteins and applied a mathematical model of binding affinity assessment on a subset of UniProt-listed reviewed human proteins. Therefore, direct comparison and validation with respect to Gordon et al. was not possible, primarily because of the unavailability of a direct mapping of SARS-CoV2 proteins to corresponding UniProt accession ids. However, an attempt has been made to map UniProt ids to the SARS-CoV2 proteins of Gordon et al. using the COVID-19 UniProtKB reference data [55] (please see Table S8 in the supplementary document). It is clear from Table S8 in the supplementary document that although UniProt ids are available for some of them, GO annotations for most of them are missing. Another interesting observation is that the entries marked in green have also been taken into consideration in this research work. It should be noted here that the current work depends heavily on the underlying GO network of the host-pathogen PPIN. As evident from Table S8, GO annotations are often missing in the new protein list. Therefore, we are working on a new strategy for the computational prediction of GO annotations for the set of proteins [16] [17] [18] [19] in Gordon's list as well as for new mutant variants.
A key highlight of our study is that the target proteins of potential FDA-listed drugs for COVID-19 overlap with the spreader nodes of the proposed in silico nCoV-Human protein interaction network. Specifically, the target proteins of six drugs mentioned in the DrugBank white paper [37], namely Lopinavir [56], Ritonavir [57], Azithromycin [58], Remdesivir [59] [60] [61], Favipiravir [62, 63], and Darunavir [64], overlap with the spreader nodes of the model (see Fig. 5). Although clinical trials for COVID-19 vaccines are still under way, among the six repurposed drugs, Remdesivir [91] and Favipiravir [92] are found to be the most promising and effective. Our proposed model successfully identified their protein targets R1AB SARS2, TLR9, ACE2, CYP3A4, and ABCB1 as spreader nodes. This assessment indicates that these spreader nodes are biologically relevant to disease propagation. It also motivated a drug repurposing study on the generated SARS-CoV2-human PPIN in our subsequent research work [78], which highlights that the drug Fostamatinib/R406 might be one of the potential drugs for SARS-CoV2.
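The overlap check described above amounts to a set intersection between each drug's known targets and the spreader nodes of the network. The following is a minimal sketch under stated assumptions: the drug names, protein ids, and the function `target_spreader_overlap` are placeholders invented for illustration and do not reproduce the DrugBank target lists or the spreader set of this study.

```python
def target_spreader_overlap(drug_targets, spreaders):
    """Map each drug to the subset of its targets that are spreader nodes."""
    return {drug: targets & spreaders for drug, targets in drug_targets.items()}


# Placeholder data: drug -> set of target protein ids (hypothetical).
drug_targets = {
    "drug_A": {"P1", "P2"},
    "drug_B": {"P2", "P3"},
    "drug_C": {"P4"},
}
# Placeholder set of spreader nodes from the interaction network.
spreaders = {"P2", "P3", "P9"}

overlap = target_spreader_overlap(drug_targets, spreaders)

# Drugs with at least one target among the spreaders.
drugs_hit = sorted(drug for drug, hits in overlap.items() if hits)
# All spreader nodes that are targeted by at least one drug.
covered_targets = set().union(*overlap.values())

print("drugs with a spreader target:", drugs_hit)
print("spreaders targeted by some drug:", sorted(covered_targets))
```

With real data, the same two summaries (which drugs hit a spreader, and which spreaders are hit) are what a figure such as Fig. 5 would visualize.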
A novel coronavirus outbreak of global health concern
World-Health-Organization Coronavirus disease (COVID-19) outbreak
?CDC_AA_refVal=https%3A%2F%2Fwww.cdc.gov%2Fcoronavirus%2F2019-ncov%2Flocations-confirmed-cases.html
the-second-meeting-of-the-international-health-regulations-(2005)-emergency-committeeregarding-the-outbreak
statement-on-the-meeting-of-the-international-health-regulations-(2005)-emergencycommittee-regarding-the-outbreak
Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China
Data sharing and outbreaks: best practice exemplified
Likelihood of survival of coronavirus disease
WHO | Middle East respiratory syndrome coronavirus (MERS-CoV)
Emerging coronaviruses: Genome structure, replication, and pathogenesis
A novel essential protein identification method based on PPI networks and gene expression data
Method for Identifying Essential Proteins by Key Features of Proteins in a
Analysis of protein targets in pathogen-host interaction in infectious diseases: a case study on Plasmodium falciparum and Homo sapiens interaction network
Detection of spreader nodes and ranking of interacting edges in Human-SARS-CoV protein interaction network
Target Protein Function Prediction by Identification of Essential Proteins in Protein-Protein Interaction Network
Funpred 3.0: Improved protein function prediction using protein interaction network
Protein function prediction from dynamic protein interaction network using gene expression data
Protein function prediction from protein-protein interaction network using gene ontology based neighborhood analysis and physico-chemical features
Modified FPred-Apriori: improving function prediction of target proteins from essential neighbours by finding their association with relevant functional groups using Apriori algorithm
Human protein interaction networks across tissues and diseases
Protein interactions and disease: computational approaches to uncover the etiology of diseases
Protein networks in disease
Computational prediction of host-pathogen protein-protein interactions
A survey on Ebola genome and current trends in computational research on the Ebola virus
Supervised learning and prediction of physical interactions between human and HIV proteins, Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases
Review of computational methods for virus-host protein interaction prediction: a case study on novel Ebola-human interactions
JUPPI: A multilevel feature based method for PPI prediction and a refined strategy for performance assessment
China releases genetic data on new coronavirus, now deadly | CIDRAP
Genomic characterization of the 2019 novel human-pathogenic coronavirus isolated from a patient with atypical pneumonia after visiting Wuhan
The SARS-Coronavirus-host interactome: Identification of cyclophilins as target for pan-Coronavirus inhibitors
Analysis of Intraviral Protein-Protein Interactions of the SARS Coronavirus ORFeome
Human Coronavirus: Host-Pathogen Interaction
UniProt: The universal protein knowledgebase
Consortium, others, The Gene Ontology (GO) database and informatics resource
Assessment of Semantic Similarity between Proteins Using Information Content and Topological Properties of the Gene Ontology Graph
The mathematical theory of infectious diseases and its applications
COVID-19 : Finding the Right Fit Identifying Potential Treatments Using a Data-Driven Approach
Identifying influential spreaders based on edge ratio and neighborhood diversity measures in complex networks
Detecting overlapping protein complexes in PPI networks based on robustness
The rush in a directed graph
The centrality index of a graph
Lethality and centrality in protein networks
A local average connectivity-based method for identifying essential proteins from the network level
CytoNCA: A cytoscape plugin for centrality analysis and evaluation of protein interaction networks
Semantic similarity over the gene ontology: family correlation and selecting disjunctive ancestors
Measuring semantic similarity between Gene Ontology terms
Using information content to evaluate semantic similarity in a taxonomy
An improved method for scoring protein-protein interactions using semantic similarity within the gene ontology
An Information-Theoretic Definition of Similarity
Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy, Proceedings of the 10th Research on Computational Linguistics International Conference
tool for the unification of biology
A mathematical theory of communication
Large-Scale Analysis of Disease Pathways in the Human Interactome
BioSNAP: Network datasets: Human protein-protein interaction network
Coronavirus puts drug repurposing on the fast track
Hydroxychloroquine and azithromycin as a treatment of COVID-19: results of an open-label nonrandomized clinical trial
Prophylactic and therapeutic remdesivir (GS-5734) treatment in the rhesus macaque model of MERS-CoV infection
China approves antiviral favilavir to treat coronavirus - UPI.com
Taiwan synthesizes anti-viral drug favilavir for COVID-19 patients - Focus Taiwan
Efficacy and Safety of Darunavir and Cobicistat for Treatment of COVID-19 - Full Text View - ClinicalTrials.gov
DrugBank: a knowledgebase for drugs, drug actions and drug targets
HIPPIE: Integrating protein interaction networks with experiment based quality scores
Christian v. Mering, STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets
The BioGRID interaction database: 2019 update
DIP: the database of interacting proteins
A reference map of the human binary protein interactome
Negatome 2.0: a database of noninteracting proteins derived by literature mining, manual annotation and protein structure analysis
Negative protein-protein interaction datasets derived from large-scale two-hybrid experiments
Advanced Search - DrugBank
Interaction between RAAS inhibitors and ACE2 in the context of COVID-19
ACE-2 is shown to be the entry receptor for SARS-CoV-2: R&D Systems
COVID-19 and Angiotensin-Converting Enzyme Inhibitors and Angiotensin Receptor Blockers: What Is the Evidence?
Drug repurposing for COVID-19 using computational screening: Is Fostamatinib/R406 a potential candidate?
Drug Approval Package: TAVALISSE (fostamatinib disodium hexahydrate)
FDA approves fostamatinib tablets for ITP | FDA
Positive Topline Data Shows Fostamatinib Meets Primary Endpoint of Safety in Phase 2 Clinical Trial in Hospitalized Patients with COVID-19
Multi-Center Phase 3 Study to Evaluate the Efficacy and Safety of Fostamatinib in COVID-19 Subjects
Insights into SARS-CoV-2 genome, structure, evolution, pathogenesis and therapies: Structural genomics approach
SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor
Role of Structural and Non-Structural Proteins and Therapeutic Targets of SARS-CoV-2 for COVID-19
Structural basis of receptor recognition by SARS-CoV-2
Efficient activation of the severe acute respiratory syndrome coronavirus spike protein by the transmembrane protease TMPRSS2
A pneumonia outbreak associated with a new coronavirus of probable bat origin
A SARS-CoV-2 protein interaction map reveals targets for drug repurposing
Trial shows Covid-19 patients recover with Gilead's remdesivir
Clinical Trial of Favipiravir Tablets Combine With Chloroquine Phosphate in the Treatment of Novel Coronavirus Pneumonia - Full Text View - ClinicalTrials.gov

The authors are thankful to the CMATER research laboratory of the Computer Science and Engineering Department, Jadavpur University, India, for providing infrastructure facilities during the work. The authors also acknowledge Prof. Jacek Sroka (Institute of Informatics, University of Warsaw) for his contribution toward the development of the fuzzy PPI methods. This work is partially supported by the CMATER research laboratory of the Computer Science and Engineering Department, Jadavpur University, India, and by the Department of Biotechnology project (BT/PR16356/BID/7/596/2016), Ministry of Science and Technology, Government of India.