key: cord-0903687-8ljk1w3i authors: Feng, Yufei; Cheng, Xiaoning; Wu, Shuilong; Mani Saravanan, Konda; Liu, Wenxin title: Hybrid drug-screening strategy identifies potential SARS-CoV-2 cell-entry inhibitors targeting human transmembrane serine protease date: 2022-05-11 journal: Struct Chem DOI: 10.1007/s11224-022-01960-w sha: 60604ada4a83a00bca9d993de2f9cfb32d3366e3 doc_id: 903687 cord_uid: 8ljk1w3i The spread of coronavirus infectious disease (COVID-19) is associated with the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which has risked public health more than any other infectious disease. Researchers around the globe use multiple approaches to identify an effective approved drug (drug repurposing) that treats viral infections. Most of the drug repurposing approaches target spike protein or main protease. Here we use transmembrane serine protease 2 (TMPRSS2) as a target that can prevent the virus entry into the cell by interacting with the surface receptors. By hypothesizing that the TMPRSS2 binders may help prevent the virus entry into the cell, we performed a systematic drug screening over the current approved drug database. Furthermore, we screened the Enamine REAL fragments dataset against the TMPRSS2 and presented nine potential drug-like compounds that give us clues about which kinds of groups the pocket prefers to bind, aiding future structure-based drug design for COVID-19. Also, we employ molecular dynamics simulations, binding free energy calculations, and well-tempered metadynamics to validate the obtained candidate drug and fragment list. Our results suggested three potential FDA-approved drugs against human TMPRSS2 as a target. These findings may pave the way for more drugs to be exposed to TMPRSS2, and testing the efficacy of these drugs with biochemical experiments will help improve COVID-19 treatment. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s11224-022-01960-w. 
In December 2019, an outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection with a high mortality rate occurred in Hubei Province, People's Republic of China [1] [2] [3]. The World Health Organization (WHO) declared the resulting disease a global pandemic on March 11, 2020, because of the high rate of transmission and the exponential spread of infections across six continents and over a hundred countries [4]. According to WHO, the disease had caused ~6.17 million deaths and ~495 million confirmed cases across ~211 countries as of April 7, 2022 [5]. Compared with two other acute human coronaviruses, namely Middle East respiratory syndrome coronavirus (MERS-CoV) and severe acute respiratory syndrome coronavirus (SARS-CoV), SARS-CoV-2 shows a high transmission level and is very contagious [6] [7] [8]. Despite the severe lack of SARS-CoV-2-specific medicines, many promising therapeutic targets are being investigated by researchers around the globe, and many carefully planned clinical experiments are being performed in a systematic way [9]. During this pandemic, researchers have sought to identify preclinical or clinical drugs that target three essential proteins: main protease, spike protein, and RNA-dependent RNA polymerase (RdRp) [10]. The spike protein in cell entry, the main protease in proteolytic activation, and RdRp in transcription are three distinct targets that play essential roles in SARS-CoV-2 replication [11]. Our previous work used a hybrid drug screening strategy targeting RdRp and found that pralatrexate and azithromycin efficiently prevent SARS-CoV-2 replication in vitro [12]. Simultaneously, many studies in the literature revealed that SARS-CoV-2 recognizes angiotensin-converting enzyme 2 (ACE2) as the cell entry receptor, acting in tandem with transmembrane serine protease 2 (TMPRSS2) on the cell membrane surface [13] [14] [15].
A 492-amino acid protein that anchors to the plasma membrane is encoded by the TMPRSS2 gene [16], and, when this project began, the protein's crystal structure was still unresolved. Recent research revealed that invading the host cell requires spike protein priming, which is carried out by TMPRSS2 produced by the host cell [13]. Matsuyama et al. show that SARS-CoV-2 infection is enhanced by TMPRSS2 [17]. This invasion into the cell could be stopped by spike protein neutralizing antibodies and TMPRSS2 inhibitors like camostat mesylate [13]. Given the importance of TMPRSS2 as a potential target, we believe that serine protease inhibitors can aid in the development of novel strategies to prevent SARS-CoV-2 viral entry and pathogenesis [18] [19] [20].
New advancements in computational drug repurposing are enabled by the growing number of FDA (Food and Drug Administration)-approved small-molecule drugs in public repositories [21]. One example is the ZINC database, which contains a vast number of different molecules as well as links to vendors from which such molecules can be purchased [22]. Another is the popular DrugBank, a database of more than 10,000 drugs [22]. Databases of approved drugs are the most crucial aspect of drug discovery since they constitute the foundation for drug repurposing approaches and are reviewed in the literature [23]. Drug repurposing strategies have become a popular way to test approved drugs for effectiveness in other indications [21, 24, 25]. In this method, drugs already approved for a set of diseases are considered safe for human prescription. The remaining task is to establish their efficacy against the disease under study [26, 27]. In addition, the Enamine REAL fragments database (https://enamine.net) contains 38.2 million different compounds that reflect the REAL drug-like space (substances that meet the "rule of 5" and Veber criteria: MW ≤ 500, SlogP ≤ 5, HBA ≤ 10, HBD ≤ 5, rotatable bonds ≤ 10, and TPSA ≤ 140) and are free of PAINS and hazardous compounds. Drug repurposing methods use a variety of methodologies to identify the hidden therapeutic potential of existing drugs [28]. Molecular modeling and data mining methodologies have recently been presented as effective drug repurposing strategies and are discussed in our published reports [12, 29, 30, 31, 32]. We recently presented a hybrid virtual screening technique based on deep machine learning, molecular docking, and molecular dynamics simulation for discovering prospective RdRp-targeting therapeutic candidates from 1906 market-available pharmaceuticals [12, 33, 34, 35, 36].
This computational study analyzes FDA-approved drugs and other compounds that may block TMPRSS2-mediated spike protein priming during host cell entry. Three computational approaches are combined to enhance the accuracy and efficiency of the prediction [37] [38] [39]: (i) virtual screening over the current approved drugs database and fragments database to analyze their prospects for repurposing [40] [41] [42], (ii) machine learning [43], and (iii) molecular dynamics and well-tempered metadynamics [44].
The target used in the present study is transmembrane serine protease 2 (TMPRSS2). It was chosen because SARS-CoV-2 infection depends on host cell molecules such as ACE2 and TMPRSS2, and entry can be blocked by a clinically proven protease inhibitor [13]. The amino acid sequence of TMPRSS2 is reported in the UniProt database (UniProt accession: O15393) and is made up of 492 amino acid residues. As per UniProt, the target protein comprises a catalytic (1-255) and non-catalytic domain (256-492).
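The drug-likeness criteria quoted above for the Enamine REAL space can be written as a simple predicate. The following is a minimal sketch, assuming each listed number is an upper bound (for example, MW at most 500) and that the descriptors have already been computed by some cheminformatics tool; the `Descriptors` class and the function name are hypothetical and are not part of any Enamine software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    mw: float             # molecular weight
    slogp: float          # calculated logP
    hba: int              # hydrogen-bond acceptors
    hbd: int              # hydrogen-bond donors
    rotatable_bonds: int
    tpsa: float           # topological polar surface area


def passes_druglike_filter(d: Descriptors) -> bool:
    """Return True if all rule-of-5 and Veber bounds quoted in the text hold.

    Assumption: every quoted value is an inclusive upper bound.
    """
    return (
        d.mw <= 500
        and d.slogp <= 5
        and d.hba <= 10
        and d.hbd <= 5
        and d.rotatable_bonds <= 10
        and d.tpsa <= 140
    )
```

A compound failing any single bound is rejected, which matches the conjunctive reading of the criteria list.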
A BLAST (Basic Local Alignment Search Tool) similarity search of the UniProt sequence against the Protein Data Bank (PDB) found around 35% sequence identity between TMPRSS2 and templates of known structure [45]. I-TASSER, an iterative threading assembly refinement program, generated the atomic coordinates of the 3D structural model of TMPRSS2 [46]. It uses secondary-structure enhanced profile-profile threading alignment and iterative structure assembly simulations [47]. Plasma kallikrein A (PDB: 1ZOM) was the template with the highest overall sequence identity. Although the crystal structure of transmembrane protease serine 2 (PDB identifier: 7MEQ) is available now, no PDB structure was available when we started this project. Since our predicted structure is highly similar to the experimental one, we used the model for further research. Moreover, the ligand-binding site predicted by I-TASSER and its COFACTOR method are reliable for our screening procedure. The stereochemical quality of the model was evaluated on the Ramachandran map using ProSA and Rampage [48, 49]. Furthermore, PROCHECK [50], ERRAT [51], and QMEAN [52] tests were done to evaluate the quality of the predicted models. The final optimized structural model was used for further investigation.
The virtual screening library was the TargetMol Approved Drug Library, consisting of 2356 compounds (https://www.targetmol.com/compound-library/Approved-Drugs-Library). These 2356 compounds are drugs approved by the Food and Drug Administration (FDA), the European Medicines Agency (EMA), or the China Food and Drug Administration (CFDA), or included in the US Pharmacopeia (USP) Dictionary, the British Pharmacopoeia (BP), the European Pharmacopoeia (EP), the Japanese Pharmacopoeia (JP), or the Chinese Pharmacopoeia (CP) Dictionary.
All approved drugs in the database have well-known bioactivities, safety, and bioavailability, and these drugs are structurally diverse, medicinally active, and cell permeable.
The dense fully connected neural network (DFCNN), a deep learning-based technique for predicting protein-drug interactions, was previously reported in our paper and was used in this research for primary drug screening [12]. Our recently published articles describe the development of the deep learning model [12, 30]. The DFCNN model takes its training data from the PDBbind database [53], where crystallized high-resolution protein-ligand complexes serve as positives and cross-docked complexes serve as negatives. DFCNN uses a concatenated molecular vector of protein pocket and ligand as its input representation. The molecular vector is created by Mol2vec [54], a natural language processing model inspired by the word2vec model. The DFCNN model has two advantages over other methods: it does not rely on docking simulation results (and returns no docking pose), and it incorporates nonbinding decoys in the training dataset. Screening is exceptionally quick because of this independence from docking, and the addition of nonbinding decoys during training makes the model robust in real-world circumstances. Since the model does not rely on the protein-drug complex conformation, it is around 100,000 times faster than Autodock Vina in estimating the protein-ligand binding probability (range 0-1). This model is used to predict protein-drug binding because the approach has been shown to have greater accuracy and efficiency and is well suited for use in an emerging disease outbreak. The DFCNN model's source code can be found at https://github.com/haiping1010/DeepBindRG.
For large-scale drug screening, DeepBindBC, an efficient and accurate deep machine learning-based model, is employed. Structural information on protein-drug complexes provided by Autodock Vina is used as its input [55].
DeepBindBC can attain higher accuracy because it combines physical-chemical and spatial parameters of the protein-ligand interface. The PDBbind database was used to train the DeepBindBC ResNet model. The protein-ligand interface parameters in DeepBindBC are transformed into a figure-like representation [53, 56]. Since DeepBindBC uses the top docking poses generated by Autodock Vina and DFCNN needs only molecular vector information, the two approaches are complementary; DeepBindBC can achieve higher accuracy but takes significantly longer than DFCNN because it requires Autodock Vina's protein-drug complex structure as input. We use the DFCNN score, DeepBindBC score, and Autodock Vina score to rank the top binding protein-ligand complexes. The source code of the DeepBindBC model is available at https://github.com/haiping1010/DeepBindBC.
We screened the Enamine REAL fragments dataset against the target. The dataset was obtained online (https://enamine.net/compound-collections/real-compounds/real-compound-libraries) and contains 15,635,761 compound fragments. We first performed the DFCNN-based screening, selected compound fragments with a DFCNN score larger than 0.8, and then docked the selected compounds to the protein target using the procedure described above. After docking, we chose ligands with a docking score ≤ −7.4 kcal/mol.
Force field-based molecular dynamics (MD) simulations were used for additional screening of the protein-ligand complexes with the top scores. The initial protein-drug complexes were generated by Autodock Vina docking, and the ligand was modified using PyMOL [57] to place it in the correct protonation state. The MD simulations were performed using the AMBER-99SB force field in Gromacs [36]. ACPYPE was used to generate the ligand topology and the partial charges of the ligand [58].
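The fragment-screening cascade described above (a fast DFCNN pass, followed by docking only for the survivors) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: `dfcnn_score` and `vina_dock` are hypothetical callables standing in for the real models, and only the two cutoffs (DFCNN score larger than 0.8, docking score at most −7.4 kcal/mol) come from the text.

```python
from typing import Callable, Iterable, List, Tuple


def screen_fragments(
    fragments: Iterable[str],
    dfcnn_score: Callable[[str], float],  # hypothetical: binding probability in [0, 1]
    vina_dock: Callable[[str], float],    # hypothetical: docking score in kcal/mol
    dfcnn_cutoff: float = 0.8,
    vina_cutoff: float = -7.4,
) -> List[Tuple[str, float, float]]:
    """Two-stage cascade: cheap scoring first, docking only for survivors."""
    hits = []
    for frag in fragments:
        prob = dfcnn_score(frag)
        if prob <= dfcnn_cutoff:
            continue  # rejected before the expensive docking step
        energy = vina_dock(frag)
        if energy <= vina_cutoff:
            hits.append((frag, prob, energy))
    # Most negative (most favorable) docking score first.
    hits.sort(key=lambda hit: hit[2])
    return hits
```

Ordering the stages this way means the slow docking step runs only on the fraction of the library that passes the fast filter, which is the practical motivation for placing DFCNN first.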
First, we made a dodecahedron box and placed the structural complex in the center, then filled the box with TIP3P water molecules [59]. A minimum distance of 1 nm was fixed between the protein and the box edge. Counter ions were added with the Gromacs tools to neutralize the total charge. For non-bonded van der Waals interactions, a cutoff of 14 Å was used. The LINCS algorithm constrained covalent bonds involving hydrogen atoms [60]. We used a 0.001-ns step size for energy minimization, a 100-ps simulation with the isothermal-isovolumetric ensemble (NVT), and a 10-ns simulation with the isothermal-isobaric ensemble (NPT) for water equilibration. A 100-ns NPT production run was then completed (step size 2 fs). With a fixed temperature of 300 K and a pressure of 1 atm, the Parrinello-Rahman barostat and the modified Berendsen thermostat were applied for simulation. Gromacs tools were used to calculate the trajectory's root mean square deviation (RMSD) and hydrogen bond number. The top-ranked drug compounds were simulated in this way, and the candidate fragments obtained from screening were subjected to the same molecular dynamics simulations. To reduce the simulation time without affecting the accuracy, we considered a single domain (amino acids 250 to 492) for MD simulation, which should cover most of the amino acids involved in protein-ligand interaction.
Metadynamics simulations were employed to compute the binding free energy and thus estimate whether a protein-ligand pair will bind. The coordination number of the protein-ligand interface reflects the number of atom contacts, and a higher number implies that the protein-ligand pair is in a binding state. The coordination number C between two groups of atoms, A and B, is as follows:
$$C = \sum_{i \in A} \sum_{j \in B} s_{ij}$$
and
$$s_{ij} = \frac{1 - \left( \frac{r_{ij} - d_0}{r_0} \right)^{n}}{1 - \left( \frac{r_{ij} - d_0}{r_0} \right)^{m}}$$
During simulation, n was 6, m was 12, d_0 was 0 nm, and r_0 was 0.5 nm. d_0 and r_0 are parameters of the switching function s_{ij}. The distance between atoms i and j is given by r_{ij}. The above function measures the degree of contact between two groups of atoms.
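To make the collective variable concrete, the coordination number can be evaluated directly from coordinates. The sketch below assumes the rational switching function s(r) = [1 − ((r − d_0)/r_0)^n] / [1 − ((r − d_0)/r_0)^m] with the parameters quoted above (n = 6, m = 12, d_0 = 0 nm, r_0 = 0.5 nm); the function names are illustrative, and this is not the PLUMED implementation itself.

```python
import math
from typing import Sequence

Point = Sequence[float]


def switching(r: float, r0: float = 0.5, d0: float = 0.0,
              n: int = 6, m: int = 12) -> float:
    """Rational switching function; r, r0, and d0 in nm."""
    x = (r - d0) / r0
    if x <= 0.0:
        return 1.0      # at or inside d0 counts as full contact
    if math.isclose(x, 1.0, rel_tol=1e-9):
        return n / m    # analytic limit of the 0/0 form at r - d0 = r0
    return (1.0 - x ** n) / (1.0 - x ** m)


def coordination_number(group_a: Sequence[Point], group_b: Sequence[Point],
                        **params: float) -> float:
    """Sum the switching function over all atom pairs (i in A, j in B)."""
    return sum(
        switching(math.dist(a, b), **params)
        for a in group_a
        for b in group_b
    )
```

With n = 6 and m = 12 the function reduces to 1 / (1 + (r / r_0)^6), so a pair at exactly 0.5 nm contributes 0.5 and distant pairs contribute almost nothing.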
The community-developed PLUgin for MolEcular Dynamics (PLUMED) was used to execute a 100-ns metadynamics simulation for each protein-ligand system [61]. During the metadynamics simulation, Gaussian hills with a height of 0.3 kJ/mol and a SIGMA (Gaussian width) of 5 were deposited every 1 ps. PLUMED was used to construct the free energy landscapes of the metadynamics simulations, which were then visualized using Gnuplot. The detailed procedures for conducting simulations were similar to our recently published work [12].
The molecular mechanics/Poisson-Boltzmann surface area (MM-PBSA) technique was used to calculate the binding free energy of TMPRSS2-drug and TMPRSS2-fragment complex structures from molecular dynamics trajectories [62]. The MM-PBSA free energy was computed for the top three TMPRSS2-drug complexes and nine TMPRSS2-fragment complexes. The MM-PBSA method in the g_mmpbsa tool was used to compute the binding free energy using the last 20 ns of the trajectory (20 frames from each nanosecond) of the 40 ns typical NPT MD simulation [63].
The theoretical structural model built using I-TASSER has been used in this study. The COFACTOR algorithm within I-TASSER was used to extract the ligand from the template structure (PDB ID: 1ZOM) using structural comparison and protein-protein networks [46, 64]. The model's quality was validated using standard procedures, where around 98% of amino acid residues fall inside the allowed region of the Ramachandran map. The model is also ranked according to its lowest DOPE (Discrete Optimized Protein Energy) score of −42,586.89. The binding pocket of the target is defined by extracting amino acid residues within 1 nm of the ligand. The protein molecule and drugs were prepared in an appropriate format to perform Autodock Vina docking, DFCNN-based screening, and DeepBindBC-based screening.
The results suggest that, by observing protein-ligand interactions from diverse perspectives, these three techniques (Autodock Vina, DFCNN, and DeepBindBC) complement one another. We eliminated drugs that were poorly predicted by any of the three approaches. Table 1 shows the 5 drugs with an Autodock Vina score of more than −7.0 kcal/mol, a DFCNN score of more than 0.9, and a DeepBindBC score of more than 0.8. We considered the top three protein-drug complexes that have an Autodock Vina score greater than or equal to 8 for further studies. Our drug screening against TMPRSS2 as a target resulted in some top FDA-approved drugs. ABT-199 (Autodock Vina score: 9.9, DFCNN: 0.97, DeepBindBC: 0.99) is a top-in-class small molecular inhibitor approved by FDA that inhibits the B-cell lymphoma-2 (BCL-2) gene and is used to treat chronic lymphoid leukemia [65]. Furthermore, the drug is found to be effective against other leukemia types [66]. Interestingly, we found an antifungal drug, anidulafungin, which is used to treat Candida infections, among the top hits. The drug carfilzomib is known to treat small lung cancer. Interestingly, it is reported that carfilzomib is a promising proteasome inhibitor for the treatment of relapsed and refractory multiple myeloma [67]. The three-dimensional representation of the ligand binding to the pocket of TMPRSS2 is presented in Fig. 1. The figure shows that the ligand binds to the target protein's non-catalytic domain (256-492). The Enamine REAL fragments screening results were tabulated, and each method was found to give different compounds as top hits (Table 2; compounds with a DFCNN score greater than 0.9). The top docking poses of the nine fragments binding with TMPRSS2 are presented in Supplementary Fig. 1. The two-dimensional chemical interactions of protein and ligands are shown in Fig. 2. The drug ABT-199 was found to form more interactions with the amino acid residues of TMPRSS2. The drug carfilzomib forms 15 contacts with amino acid residues of the protein.
The figure shows van der Waals interactions and conventional hydrogen bonds in light and dark green, respectively. The glycine and tryptophan residues at different positions of TMPRSS2 form van der Waals interactions with the drugs upon complex formation. Since van der Waals interactions are highly distance dependent and occur when adjacent atoms form close contact, the dominance of these interactions between protein and drug indicates strong binding [68]. ABT-199 and anidulafungin form amide-pi stacked interactions with TMPRSS2, shown in dark pink. The tryptophan at position 461 is the critical residue involved in amide-pi stacked interactions. The cysteine residues at positions 297 and 437 form alkyl interactions with ABT-199, carfilzomib, and anidulafungin.
Fig. 4 The number of hydrogen bonds formed between TMPRSS2 and top three drugs (TMPRSS2-ABT-199 (A), TMPRSS2-anidulafungin (B), and TMPRSS2-carfilzomib (C)) during the simulation time
We used atomistic molecular dynamics simulations of the TMPRSS2-drug complexes to further validate the top three selected drug complexes and investigate their interactions and stability. The protein-drug complex stability is validated by computing the root mean square deviation (RMSD) over the simulation time. We performed 100-ns atomistic MD simulations with a fixed temperature of 300 K and a pressure of 1 atm. ABT-199 and carfilzomib are found to be stable, as shown by the small RMSD fluctuations (less than 1 nm) of the ligand bound to the protein, presented in Fig. 3. Snapshots of the protein-drug binding poses after molecular dynamics simulations are presented in Supplementary Fig. 2. The drug anidulafungin was relatively stable over most of the simulation time but fluctuated more after 80 ns (greater than 1.5 nm). Overall, ABT-199 was stable over most of the simulation time, with minimum RMSD values during the 100-ns simulation.
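The RMSD used above as a stability measure can be computed per frame as the root of the mean squared displacement from a reference pose. The following is a minimal sketch, assuming the frames have already been aligned on the protein so that no additional superposition is needed; in the study itself, Gromacs tools performed this calculation, and the function name here is illustrative.

```python
import math
from typing import Sequence

Point = Sequence[float]


def rmsd(coords: Sequence[Point], reference: Sequence[Point]) -> float:
    """Root mean square deviation between two equally sized coordinate sets.

    Assumption: both sets are in the same frame of reference (pre-aligned),
    and the result has the same length unit as the inputs.
    """
    if len(coords) != len(reference) or not coords:
        raise ValueError("coordinate sets must be non-empty and equally sized")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords, reference))
    return math.sqrt(squared / len(coords))
```

Applying this function to every frame of a ligand trajectory gives a curve of the kind plotted against simulation time in an RMSD figure.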
A larger RMSD value can indicate that the ligand moved away from the docked pose to reach a reliable binding conformation during the simulation; as long as the fluctuation in RMSD values is low, the complex still shows stable binding. Hydrogen bonding, hydrophobic, and electrostatic interactions typically stabilize protein-ligand complexes. The hydrogen bond numbers were calculated using Gromacs tools from the trajectory files. A distance cutoff (≤ 3.5 Å) and a hydrogen-donor-acceptor angle cutoff (30°) were used to define hydrogen bond formation. Figure 4 depicts the formation of hydrogen bonds between protein and drug over a 100-ns simulation time. Among the three top TMPRSS2-drug complexes, ABT-199 and anidulafungin form more hydrogen bonds (> 5) with the protein than the other drug. Carfilzomib forms fewer hydrogen bonds to stabilize the complex (average of > 3). As a result, the RMSD estimates and hydrogen bond numbers between protein and drug suggest that ABT-199 and anidulafungin form more stabilizing interactions than the other complex.
Fig. 5 The root mean square deviation of the top nine protein-fragment complexes for 100-ns molecular dynamics simulations. The panels A-H represent the RMSD values of protein with nine fragments presented in Table 2
The top nine Enamine REAL fragments' RMSD and hydrogen bond numbers are shown in Figs. 5 and 6, respectively. The potential real drug-like compounds (Z2712431854, Z2873112960, Z2364211566, Z3047170929, and Z2482433032) are found to be relatively stable (RMSD less than 1 nm) based on the RMSD values obtained during MD simulations. All other compounds have high RMSD values, as evident from Fig. 5A, C, F, and G. The hydrogen bond numbers between target and compounds, presented in Fig. 6, support this finding. Snapshots of the TMPRSS2-fragment binding poses after molecular dynamics simulations are presented in Supplementary Fig. 3.
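The geometric hydrogen bond definition mentioned above can be expressed as a small predicate. This sketch assumes the common reading of the two cutoffs: a donor-acceptor distance of at most 0.35 nm (3.5 Å) and a hydrogen-donor-acceptor angle of at most 30°. The function name and argument layout are illustrative and do not reproduce the Gromacs code.

```python
import math
from typing import Sequence

Point = Sequence[float]


def is_hydrogen_bond(donor: Point, hydrogen: Point, acceptor: Point,
                     max_dist_nm: float = 0.35,
                     max_angle_deg: float = 30.0) -> bool:
    """Check the distance and angle criteria for one donor-H...acceptor triplet."""
    d_da = math.dist(donor, acceptor)
    d_dh = math.dist(donor, hydrogen)
    if d_da == 0.0 or d_dh == 0.0 or d_da > max_dist_nm:
        return False
    # Angle at the donor between the donor->hydrogen and donor->acceptor vectors.
    dh = [h - d for h, d in zip(hydrogen, donor)]
    da = [a - d for a, d in zip(acceptor, donor)]
    cos_angle = sum(u * v for u, v in zip(dh, da)) / (d_dh * d_da)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    return math.degrees(math.acos(cos_angle)) <= max_angle_deg
```

Counting the triplets that satisfy this test in each frame yields a hydrogen bond number per frame, which is the quantity tracked over the simulation time.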
However, the Enamine REAL fragments may not be very strong binders, which is understandable because smaller compounds usually bind weakly. Still, the preferred fragments can give us a clue about which kinds of groups the pocket prefers to bind, hence helping future structure-based drug design. Since Enamine REAL fragments are relatively small, it is possible to modify or combine different fragments to achieve stronger binding.
An efficient technique to support virtual screening results is to apply metadynamics approaches to compute the binding free energy landscape of the protein and drug complex [44]. The trajectories obtained from the molecular dynamics simulations were used to compute the binding free energies. In the metadynamics approach, the number of atom contacts at the interface of the protein-drug complex was used as a collective variable (CV). This coordination number reflects the number of atom contacts at the protein-drug interface, and a higher number implies that the protein-ligand interface is in a binding state. Figure 7 shows the binding free energy vs. coordination number (CV: collective variable) of the top three TMPRSS2-drug complexes from metadynamics simulations. If the lowest energy basin falls close to a coordination number of 0, the compound prefers the unbound state, while if the free energy basin falls at a relatively large coordination number value, the compound prefers to bind. Figure 7 shows that the lowest energy basins for all three complexes (ABT-199, anidulafungin, and carfilzomib) are at a relatively large coordination number value, especially for anidulafungin. This indicates that the compounds bind strongly to the protein. The lowest energy poses of the protein-drug complexes for ABT-199, anidulafungin, and carfilzomib had more interactions in the interface region, as indicated by high coordination numbers.
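The reading of the free energy landscape described above (a basin near zero coordination means unbinding, a basin at a large coordination number means binding) amounts to locating the minimum of a one-dimensional profile. The sketch below assumes the profile is available as two parallel lists; the `unbound_cutoff` parameter is a hypothetical threshold that must be chosen by the user, since the text gives no numerical boundary between the two states.

```python
from typing import Sequence, Tuple


def lowest_basin(cv_grid: Sequence[float],
                 free_energy: Sequence[float]) -> Tuple[float, float]:
    """Return (CV value, free energy) at the global minimum of the profile."""
    if len(cv_grid) != len(free_energy) or not cv_grid:
        raise ValueError("profile must be non-empty with matching lengths")
    idx = min(range(len(free_energy)), key=lambda k: free_energy[k])
    return cv_grid[idx], free_energy[idx]


def prefers_bound_state(cv_grid: Sequence[float],
                        free_energy: Sequence[float],
                        unbound_cutoff: float) -> bool:
    """True if the lowest basin lies above the (hypothetical) unbound cutoff."""
    cv_min, _ = lowest_basin(cv_grid, free_energy)
    return cv_min > unbound_cutoff
```

This only inspects the global minimum; a fuller analysis would also compare basin depths and check that the profile has converged.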
The most commonly used approach to support molecular docking results is computing the binding free energy of protein-ligand complexes using molecular mechanics-Poisson-Boltzmann surface area continuum solvation (MM-PBSA). The MM-PBSA method was used to calculate the energy terms of the protein-ligand complexes, and the MD trajectories were used to predict the binding energies. Table 3 shows that the binding free energy of the protein with ABT-199 (−145.096 kJ/mol) and carfilzomib (−115.071 kJ/mol) is more favorable (more negative) than that of the protein-anidulafungin complex (−10.899 kJ/mol). Compared to the TMPRSS2-drug complexes, the TMPRSS2-Z2482433032 fragment complex has a comparatively low free energy of binding (−104.142 kJ/mol). Tables 3 and 4 also include stabilizing energy terms such as van der Waals, electrostatic, and solvent accessible surface area (SASA). The van der Waals, electrostatic, and SASA free energies of the TMPRSS2-drug complexes are all higher. According to our findings, TMPRSS2 complexed with the drugs has favorable predicted binding free energies, as indicated by the MM-PBSA calculations.
Fig. 7 Free energy landscape of top three TMPRSS2-drug complexes (TMPRSS2-ABT-199, TMPRSS2-anidulafungin, and TMPRSS2-carfilzomib), computed using well-tempered metadynamics approaches
SARS-CoV-2 is an enveloped, single-stranded RNA betacoronavirus and is highly pathogenic. The 16 non-structural proteins (NSP1-16) and four primary structural proteins (spike, membrane, envelope, and nucleocapsid) are coded by the SARS-CoV-2 genome [69]. The non-structural proteins are required for SARS-CoV-2 replication, whereas the structural proteins are involved in viral assembly and transmission. SARS-CoV-2 infection begins with viral entry mediated by the spike glycoprotein's interaction with the host ACE2 receptor and cleavage of the spike protein by TMPRSS2 before fusion with the host cell membrane [70, 71].
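For reference, the MM-PBSA binding free energy reported in Tables 3 and 4 is conventionally assembled from per-species averages over trajectory frames. The decomposition below is the standard textbook form, written here as an assumption about how the tabulated terms combine; the polar solvation term and the omission of the entropic contribution are not stated in the text and reflect common practice with tools of this kind.

```latex
\begin{align}
\Delta G_{\mathrm{bind}} &= \langle G_{\mathrm{complex}} \rangle
  - \left( \langle G_{\mathrm{protein}} \rangle + \langle G_{\mathrm{ligand}} \rangle \right) \\
G_{x} &= E_{\mathrm{MM}} + G_{\mathrm{solv}} \\
E_{\mathrm{MM}} &= E_{\mathrm{bonded}} + E_{\mathrm{vdW}} + E_{\mathrm{elec}} \\
G_{\mathrm{solv}} &= G_{\mathrm{polar}} + G_{\mathrm{SASA}}
\end{align}
```

Under this form, a more negative \Delta G_{\mathrm{bind}} corresponds to more favorable predicted binding, which is the sense in which the tabulated energies are compared.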
We chose the TMPRSS2 protein as a therapeutic target to inhibit viral entry because of its crucial role in viral pathogenesis. We recently published several papers recommending possible small-molecule compounds against the SARS-CoV-2 main protease and RdRp proteins using our novel pipeline, which includes deep learning-based drug screening [12, 72]. The resulting top three drugs and top nine drug-like compounds were used for further analysis. The top three TMPRSS2-ligand complexes were examined by MD simulations for 100 ns to further validate the drug-screening results. During the simulation, the structure of the docking complexes altered. However, the ligands stayed mainly within the TMPRSS2-binding region. Multiple van der Waals and hydrogen bonding interactions help to stabilize the TMPRSS2-ligand complexes. MD simulations revealed that the compounds are relatively stable, based on a careful examination of hydrogen bonding during the simulation time. The well-tempered metadynamics and MM-PBSA free energies further support the results of drug screening.
Deep learning-based large-scale drug screening and MD simulations suggest that the drugs ABT-199, anidulafungin, and carfilzomib are the most promising candidates for blocking viral entry into cells. In addition, this research provides nine potential drug-like compounds from the database that can facilitate drug discovery. Validating the recommended drugs through experimental investigations should be the next stage in moving the research forward.
The online version contains supplementary material available at https://doi.org/10.1007/s11224-022-01960-w.
Author contribution All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Yufei Feng, Xiaoning Cheng, Shuilong Wu, and Wenxin Liu. The first draft of the manuscript was written by Konda Mani Saravanan and all authors commented on previous versions of the manuscript.
All authors read and approved the final manuscript.
Data availability All relevant data are presented in paper and its supplementary material.
Ethics approval Not applicable.
Consent for publication Not applicable.
The authors declare no competing interests.
Preliminary estimation of the basic reproduction number of novel coronavirus (2019-nCoV) in China, from 2019 to 2020: a data-driven analysis in the early phase of the outbreak
WHO Declares COVID-19 a pandemic
Estimating the COVID-19 infection rate: anatomy of an inference problem
Middle East respiratory syndrome: emergence of a Pathogenic Human Coronavirus
Severe acute respiratory syndrome coronavirus-like virus in Chinese horseshoe bats
SARS and MERS: recent insights into emerging coronaviruses
Human coronaviruses and therapeutic drug discovery
A perspective on potential target proteins of COVID-19: comparison with SARS-CoV for designing new small molecules
Precision therapeutic targets for COVID-19
A novel virtual screening procedure identifies Pralatrexate as inhibitor of SARS-CoV-2 RdRp and it reduces viral replication in vitro
SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor
Characterization of the receptor-binding domain (RBD) of 2019 novel coronavirus: implication for development of RBD protein as a viral attachment inhibitor and vaccine
Research and development on therapeutic agents and vaccines for COVID-19 and related human coronavirus diseases
TMPRSS2: a potential target for treatment of influenza virus and coronavirus infections
Enhanced isolation of SARS-CoV-2 by TMPRSS2-expressing cells
The pivotal role of TMPRSS2 in coronavirus disease 2019 and prostate cancer
Antonarakis ES et al (2020) TMPRSS2 and COVID-19: serendipity or opportunity for intervention?
ACE2 and TMPRSS2 variants and expression as candidates to sex and country differences in COVID-19 severity in Italy
A survey of current trends in computational drug repositioning
ZINC: a free tool to discover chemistry for biology
Drug databases and their contributions to drug repurposing
Drug repositioning: identifying and developing new uses for existing drugs
Exploring polypharmacology in drug discovery and repurposing using the CANDO platform
Computational drug repositioning: from data to therapeutics
Data sharing for novel coronavirus (COVID-19)
Machine learning approach for predicting new uses of existing drugs and evaluation of their reliabilities
Deep learning-based drug screening for COVID-19 and case studies
Deep learning based drug screening for novel coronavirus 2019-nCov
DeepBindPoc: a deep learning method to rank ligand binding pockets using molecular vector representation
Structural basis for the inhibition of SARS-CoV2 main protease by Indian medicinal plant-derived antiviral compounds
Evolution of the novel coronavirus from the ongoing Wuhan outbreak and modeling of its spike protein for risk of human transmission
Predicting protein interresidue contacts using composite likelihood maximization and deep learning
GROMACS 4: algorithms for highly efficient, load-balanced, and scalable molecular simulation
Quantitative structure-activity relationship: promising advances in drug discovery platforms
A machine learning-based prediction platform for P-glycoprotein modulators and its validation by molecular docking
Concepts of artificial intelligence for computer-assisted drug discovery
Structure-based virtual screening for drug discovery: a problem-centric review
Virtual screening strategies in drug discovery: a critical review
Recognizing pitfalls in virtual screening: a critical review
Identification of novel compounds against three targets of SARS CoV-2 coronavirus by combined virtual screening and supervised machine learning
Well-tempered metadynamics: a smoothly converging and tunable free-energy method
Structural modeling and analysis of the SARS-CoV-2 cell entry inhibitor camostat bound to the trypsin-like protease TMPRSS2
I-TASSER: a unified platform for automated protein structure and function prediction
LOMETS: A local meta-threading-server for protein structure prediction
ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins
Structure validation by Cα geometry: ϕ, ψ and Cβ deviation
PROCHECK: a program to check the stereochemical quality of protein structures
Verification of protein structures: patterns of nonbonded atomic interactions
QMEAN: a comprehensive scoring function for model quality assessment
The PDBbind database: methodologies and updates
Mol2vec: unsupervised machine learning approach with chemical intuition
Autodock Vina
DeepBindRG: a deep learning based method for estimating effective protein-ligand affinity
PyMOL, molecular visualization system
ACPYPE - AnteChamber PYthon Parser interfacE
Comparison of simple potential functions for simulating liquid water
LINCS: a linear constraint solver for molecular simulations
Analyzing and biasing simulations with PLUMED
Recent developments and applications of the MMPBSA method
G-mmpbsa - A GROMACS tool for high-throughput MM-PBSA calculations
COFACTOR: an accurate comparative algorithm for structure-based protein function annotation
Selective BCL-2 inhibition by ABT-199 causes on-target cell death in acute myeloid leukemia
Activity of ABT-199 and acquired resistance in follicular lymphoma cells
Carfilzomib: a Promising proteasome inhibitor for the treatment of relapsed and refractory multiple myeloma
Biophysical Principles
Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2): an update
TMPRSS2-inhibitors play a role in cell entry mechanism of COVID-19: an insight into camostat and nafamostat
COVID-19 outbreak: history, mechanism, transmission, structural studies and therapeutics
Evaluation of residue-residue contact prediction methods: from retrospective to prospective
Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations