key: cord-279106-3ffa9djf
authors: Syatila Ab Ghani, Nur; Emrizal, Reeki; Makmur, Haslina; Firdaus-Raih, Mohd
title: Side chain similarity comparisons for integrated drug repositioning and potential toxicity assessments in epidemic response scenarios: the case for COVID-19
date: 2020-10-21
journal: Comput Struct Biotechnol J
DOI: 10.1016/j.csbj.2020.10.013
doc_id: 279106
cord_uid: 3ffa9djf

Structures of protein-drug complexes provide an atomic level profile of drug-target interactions. In this work, the three-dimensional arrangements of amino acid side chains in known drug binding sites (substructures) were used to search for similarly arranged sites in SARS-CoV-2 protein structures in the Protein Data Bank for the potential repositioning of approved compounds. We were able to identify 22 target sites for the repositioning of 16 approved drug compounds as potential therapeutics for COVID-19. Using the same approach, we were also able to investigate the potentially promiscuous binding of the 16 compounds to off-target sites that could be implicated in toxicity and side effects, an aspect that had not been addressed by any previous studies. The investigation of binding properties in disease-related proteins, derived from the comparison of amino acid substructure arrangements, allows for effective mechanism-driven decision making to rank and select only the compounds with the highest potential for success and safety to be prioritized for clinical trials or treatments. The intention of this work is not to explicitly identify candidate compounds but to present how an integrated drug repositioning and potential toxicity pipeline using side chain similarity searching algorithms is of great utility in epidemic scenarios involving novel pathogens.
In the case of the COVID-19 pandemic caused by the SARS-CoV-2 virus, we demonstrate that the pipeline can identify candidate compounds quickly and sustainably in combination with associated risk factors derived from the analysis of potential off-target site binding by the compounds to be repurposed.

Epidemics caused by novel infectious agents result in situations where no known treatment regimens are in practice. Case management would therefore first rely on treating and alleviating the symptoms. The focus of the treatment would then move on to eradication of the infectious agent from the host and more in-depth therapeutic management. Such an epidemic scenario presented itself in the city of Wuhan, Hubei Province, China in late 2019 [1]. The causative pathogen for the observed acute respiratory distress was later identified to be a novel human coronavirus (nCoV19) named Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) [2]. Although many coronaviruses are found in bat reservoirs, it is probable that SARS-CoV-2 also has intermediate hosts such as pangolins and snakes [3].

Three months after it was first reported, the disease, named coronavirus disease 2019 (COVID-19), had progressed into a global pandemic. The fast spread of the disease was, however, paralleled by the speed at which data regarding the disease and its causative agent were generated. In mid-January 2020, the first genome sequence was deposited into GenBank (https://www.ncbi.nlm.nih.gov/genbank/); by mid-July 2020, more than 40,000 complete genomes with high coverage from samples throughout the world had been deposited in the GISAID database (http://www.gisaid.org/; http://epicov.org). While the rate of genome sequencing and data sharing is unprecedented, the rapid availability of structure data has been equally impressive.
In late September 2020, more than 400 structures of SARS-CoV-2 proteins had been deposited in the Protein Data Bank (PDB) [4].

Despite the number of confirmed cases passing 32.9 million with more than 1 million fatalities worldwide in early October 2020, treatment options are still lacking for COVID-19, although several vaccines started their trials in July 2020. This dire but data-rich scenario has led investigators to resort to drug repurposing strategies. Although such efforts to reposition approved drugs to a new target can be explored in a clinical setting, we focus specifically on how computational approaches can feature prominently in the identification of the candidate compounds.

Various approaches have been deployed to explore the repertoire of known and approved compounds for COVID-19. Zhou et al. utilized network-based analyses of drug targets and the virus-host interactions in the human interactome to list 16 potential drugs to prioritize for repurposing [5]. An even larger effort that generated a SARS-CoV-2-human protein-protein interaction map was able to identify 66 druggable human proteins that could be targeted by 69 currently available FDA approved compounds to be used as COVID-19 treatments [6].

Side chain similarity comparisons [7, 8] have been reported to be a potential starting point in drug repurposing efforts [9]. For such an approach, the 3D arrangements of known drug binding sites are collected as a search database to identify similar sites in non-homologous structures, thus implying the capacity to bind similar ligands. Drug-target interaction prediction using structural data has remained a largely unexplored niche [10]. The identification of possible alternative binding sites for an approved drug can also provide insights into its possible off-target effects.
There is a clear urgency to discover and deploy suitable candidates that can be repositioned against targets associated with COVID-19. Nevertheless, it is prudent to steer clear of adverse effects resulting from the poly-pharmacological actions of promiscuous drugs with the ability to bind to other targets [11].

In this work, amino acid side chain similarity searching was utilized to propose alternative target sites in SARS-CoV-2 protein structures for drug repositioning. These searches were based on the premise that if a known drug binding site could be found in a SARS-CoV-2 protein, then that protein could also serve as an alternative target for the same drug. This same principle was then used to identify off-target sites that could present as side effects or result in some form of toxicity. The list of potential drugs derived from the side chain arrangement similarity searches was then used to propose structurally similar compounds that could also target the sites already identified for repositioning. Our approach differs significantly from those reported by Zhou et al. [5] and Gordon et al. [6], and can therefore serve as an additional confirmatory analysis and fill gaps in existing work. The details of these differences will be discussed in a later section.

The downloaded sequences were clustered at a 90% sequence similarity cut-off using the CD-HIT program [12]. Members of individual clusters were sorted according to the X-ray crystallography resolution; the SARS-CoV-2 protein sequence with the highest resolution structure was selected as the cluster's representative. The PDB structures containing representative sequences were compiled together for further similarity searches against the dataset of known drug binding sites derived from protein-drug complexes in the PDB.
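The representative-selection step described above can be sketched as follows. This is a minimal illustration and not the authors' script: it assumes that cluster membership has already been parsed from the clustering output, that "highest resolution" means the smallest resolution value in angstroms, and the `Chain` record, the function name and the PDB-like identifiers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chain:
    pdb_id: str        # hypothetical identifier, not a real PDB entry
    cluster: int       # cluster index assigned by a tool such as CD-HIT
    resolution: float  # X-ray resolution in angstroms (smaller is better)


def pick_representatives(chains):
    """Return {cluster index: Chain}, keeping the best-resolved chain per cluster."""
    best = {}
    for chain in chains:
        current = best.get(chain.cluster)
        # Assumption: a smaller value in angstroms means a higher resolution.
        if current is None or chain.resolution < current.resolution:
            best[chain.cluster] = chain
    return best


# Toy input: two clusters, the first with two candidate structures.
chains = [
    Chain("AAAA", 0, 2.10),
    Chain("BBBB", 0, 1.45),
    Chain("CCCC", 1, 2.70),
]
representatives = pick_representatives(chains)
```

Structures without a reported resolution (for example NMR entries) would need an explicit rule before they can be ranked this way; the sketch does not attempt to handle them.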
For the selection of drug compounds, we further selected drug compounds that are: (i) currently undergoing clinical trials for COVID-19, and (ii) protein was used as the receptor structure for docking. The Python script contains all the necessary commands that will be executed in the UCSF Chimera command line to automatically pre-process structures and perform blind molecular docking. The pre-processing steps for the ligand and receptor structures include the removal of water molecules and ligands, the assignment of partial charges for both standard and non-standard residues, and an additional energy-minimization step. The atomic partial charges for standard residues, including standard amino acids, water and known ligands, were assigned based on the AMBER ff14SB force field (default), while the partial charges for non-standard residues were calculated using the Antechamber module based on the AM1-BCC method. In the case of residues with missing side chains, the amino acid side chains were replaced based on information from a rotamer library. Energy minimization was performed with 100 steps of steepest descent minimization. Molecular docking was carried out using a local installation of Autodock Vina linked for use in UCSF Chimera.

Blind docking was carried out instead of using the binding site as a reference point. The whole protein structure target was therefore exhaustively searched for potential binding poses using the default settings for parameters such as the exhaustiveness value (set to 8) and the maximum number of binding modes (set to 9). The default box size was used to sample the ligand orientation; it automatically covers the entire protein receptor, thus allowing binding poses to be matched not only to known binding sites, but also to other putative sites that have not been reported elsewhere.
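To make the blind docking setup concrete, the sketch below derives a search box that encloses every receptor atom and writes the corresponding Autodock Vina configuration text with the exhaustiveness (8) and number of binding modes (9) quoted above. It is an illustration only: the docking in this work was driven through UCSF Chimera, whose own box calculation is not reproduced here, and the 5 Å padding, the file names and both function names are assumptions.

```python
def blind_docking_box(coords, padding=5.0):
    """Return (center, size) of an axis-aligned box enclosing all atoms.

    coords  : iterable of (x, y, z) receptor atom coordinates in angstroms
    padding : extra margin added on every side, in angstroms (assumed value)
    """
    xs, ys, zs = zip(*coords)
    center = tuple((max(a) + min(a)) / 2.0 for a in (xs, ys, zs))
    size = tuple((max(a) - min(a)) + 2.0 * padding for a in (xs, ys, zs))
    return center, size


def vina_config(receptor, ligand, center, size, exhaustiveness=8, num_modes=9):
    """Build the text of an Autodock Vina configuration file."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    for axis, c, s in zip("xyz", center, size):
        lines.append(f"center_{axis} = {c:.3f}")
        lines.append(f"size_{axis} = {s:.3f}")
    lines.append(f"exhaustiveness = {exhaustiveness}")
    lines.append(f"num_modes = {num_modes}")
    return "\n".join(lines)


# Toy coordinates standing in for a receptor structure.
atoms = [(0.0, 0.0, 0.0), (10.0, 20.0, 30.0)]
center, size = blind_docking_box(atoms)
config_text = vina_config("receptor.pdbqt", "ligand.pdbqt", center, size)
```

A box that spans the whole receptor is what distinguishes blind docking from site-directed docking; for large proteins the default exhaustiveness may sample such a volume only sparsely, which is one reason to cross-check poses against independently predicted sites.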
Upon completion of the docking run using the Python script, UCSF Chimera loads a selection of docking poses for visualization, where the docking poses are ranked according to the docking scores reported in kcal/mol, with more negative values indicating better binding. The sites found from the sub-structural similarity search are also visualized. The UCSF Chimera sessions for individual script runs were saved for further curation and analysis. The sites from the sub-structural similarity search were compared against the sites in pre-computed binding poses from molecular docking; poses with an overlap of at least three matched residues and docking scores more negative than -6.5 kcal/mol were selected for further analyses.

2.3. Searching for potential off-targets from human for selected drugs proposed for COVID-19

Potential off-targets for selected drugs proposed for COVID-19 were identified using three different methods. First, known human proteins bound to the selected drug compounds were obtained from the PDB through the 'Advanced Search' interface in the RCSB using the ligand PDB ID as a query. The list of PDB structures retrieved was filtered to only contain PDB entries with the organism denoted as 'Homo sapiens'. Second, human proteins with similarly arranged sites to drug binding sites for the selected drugs were retrieved from pre-compiled results for sub-structural similarity searches in the Drug ReposER web server. Third, human proteins with more than 30% sequence similarity to individual SARS-CoV-2 protein structures were retrieved from blastp searches against the PDB. The list of proteins retrieved was filtered to only contain proteins with more than 30% sequence identity to the query SARS-CoV-2 protein.

These human structures were then used for molecular docking against the selected compounds.
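The pose selection rule stated for the docking protocol (at least three residues shared with the site found by the sub-structural search, and a docking score more negative than -6.5 kcal/mol) can be expressed as a small filter, which would presumably apply in the same way to the human structures docked here. The sketch below is hypothetical: the pose records, the residue labels and the `select_poses` name are illustrative and are not taken from the authors' scripts.

```python
def select_poses(site_residues, poses, min_overlap=3, score_cutoff=-6.5):
    """Keep poses that overlap the predicted site and score below the cutoff.

    site_residues : residue labels of the site from the similarity search
    poses         : list of dicts with 'score' (kcal/mol) and 'contacts'
                    (residue labels near the docked ligand)
    """
    site = set(site_residues)
    selected = []
    for pose in poses:
        shared = site & set(pose["contacts"])
        # More negative scores indicate better predicted binding.
        if len(shared) >= min_overlap and pose["score"] < score_cutoff:
            selected.append({**pose, "shared": sorted(shared)})
    # Rank the surviving poses from best (most negative) to worst.
    return sorted(selected, key=lambda p: p["score"])


# Made-up example data for illustration only.
site = ["P125", "G130", "I131", "V155", "D157"]
poses = [
    {"id": 1, "score": -9.4, "contacts": ["P125", "G130", "I131", "A50"]},
    {"id": 2, "score": -6.0, "contacts": ["P125", "G130", "I131", "V155"]},
    {"id": 3, "score": -8.1, "contacts": ["P125", "A50", "L60"]},
    {"id": 4, "score": -7.2, "contacts": ["G130", "V155", "D157"]},
]
kept = select_poses(site, poses)
```

In this toy input, pose 2 is rejected on score and pose 3 on overlap, so only poses 1 and 4 survive.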
Molecular docking runs were conducted based on the above-mentioned protocol using Python scripts executed in UCSF Chimera. A compound's involvement in specific biological mechanisms and potential adverse effects upon interaction with the selected compounds were manually assessed and extracted from information queried and similar ligands were structurally aligned in the UCSF Chimera interface [15].

The queried and the similar ligands were individually searched against the Drug ReposER application database to retrieve results for sub-structural similarity searches. Both sets of results were compared, and shared SARS-CoV-2 protein targets from the list of proteins (proteins containing sites similar to binding sites for both queried and similar ligands) were obtained for molecular docking against the corresponding ligand molecules with Autodock Vina using the above-mentioned protocol [15].

3. Results and Discussion

In this study, sub-structural similarity searches and docking analyses were carried out to: (i) identify potential targets and drug binding sites in SARS-CoV-2 proteins; (ii) identify off-targets for proposed drug compounds for COVID-19; (iii) identify other approved drugs with similar structures to the proposed drugs that are potentially useful for COVID-19 treatment. A total of 351 SARS-CoV-2 proteins were obtained from the PDB, which included the following proteins: ADP ribose phosphatase (PDBID: 6w02), spike protein (PDBID: 6vsb), main protease (PDBID: 6lu7), nucleocapsid (PDBID: 6m3m), NSP7-NSP8 complex (PDBID: 6yhu), NSP9 replicase (PDBID: 6w4b), NSP10-NSP16 complex (PDBID: 6w4h), NSP15 (PDBID: 6w01), ORF7a encoded accessory protein (PDBID: 6w37) and RNA-dependent RNA polymerase or NSP12 (PDBID: 6m71).
The substructure similarity searching used in this work utilized the ASSAM computer program, which solves a maximal common subgraph problem to match similar 3D arrangements of amino acids in a dataset of protein structures [7]. The arrangements of amino acids in 3D space are represented as graphs, where the graph nodes are the pseudo-atoms representing side chain groups and the graph edges are distances between the side chain groups. Using this scheme, it is possible to match similar 3D arrangements, such as catalytic sites and ligand binding sites, in non-homologous structures. Drug ReposER is an extended application of the ASSAM program that focuses on sub-structures that constitute the binding sites for approved drug molecules [9].

At the time of writing, approximately a third of the proteins encoded in the SARS-CoV-2 genome have corresponding PDB structures. In anticipation that more structures will be deposited, we have enabled the analysis pipeline to be deployed to process new structures as and when they become available, based on the clustering of protein sequences and comparison to readily available structures. The results from the analyses reported in this work and those that will be carried out by the pipeline for new structures will be made accessible via a dedicated module of the Drug ReposER web application at http://mfrlab.org/drugreposer/covid19/. The list of PDB IDs with pre-computed results from sub-structural similarity searches and the sequence clusters are also available at the same resource.
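The graph representation behind the searches served by this resource (side chain groups as nodes, inter-group distances as edges) can be illustrated with a deliberately simplified matcher. This is not the ASSAM algorithm: ASSAM solves a maximal common subgraph problem, whereas the sketch below brute-forces every type-compatible assignment, uses a single pseudo-atom per residue, and relies on an assumed distance tolerance of 1.0 Å. All names and coordinates are hypothetical.

```python
import math
from itertools import permutations


def pairwise(coords):
    """Full matrix of Euclidean distances between points."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def find_match(query, target, tol=1.0):
    """Return target indices that reproduce the query pattern, or None.

    query, target : lists of (residue_type, (x, y, z)) pseudo-atoms
    tol           : allowed deviation per inter-residue distance, in angstroms
    """
    q_types = [rtype for rtype, _ in query]
    q_dist = pairwise([xyz for _, xyz in query])
    t_dist = pairwise([xyz for _, xyz in target])
    n = len(query)
    for combo in permutations(range(len(target)), n):
        # Node labels (residue types) must agree position by position.
        if any(target[i][0] != qt for i, qt in zip(combo, q_types)):
            continue
        # Every edge (distance) of the query must be reproduced within tol.
        if all(abs(q_dist[a][b] - t_dist[combo[a]][combo[b]]) <= tol
               for a in range(n) for b in range(a + 1, n)):
            return combo
    return None


# A three-residue query and a target containing the same arrangement,
# translated in space and listed in a different order.
query = [("HIS", (0.0, 0.0, 0.0)), ("ASP", (3.0, 0.0, 0.0)), ("SER", (0.0, 4.0, 0.0))]
target = [("GLY", (9.0, 9.0, 9.0)), ("SER", (10.0, 14.0, 10.0)),
          ("HIS", (10.0, 10.0, 10.0)), ("ASP", (13.0, 10.0, 10.0))]
hit = find_match(query, target)
```

Because only internal distances are compared, the match is independent of sequence order and of the overall fold, which is what allows similar sites to be found in non-homologous structures. The brute-force enumeration grows factorially with pattern size and is shown only to make the idea tangible.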
The search for COVID-19 treatments has resulted in the registration of more than 3000 clinical trials in the ClinicalTrials.gov database to explore the repurposing of more than twenty readily available drugs (

Searching for sites in the SARS-CoV-2 protein structures (hit sites) that are geometrically similar to sites for approved drug compounds (query sites) using the Drug ReposER application [9] identified matches that included 22 sites from protein-drug complexes with sequence identities lower than 30% to the corresponding SARS-CoV-2 proteins (Table 1). These results show that the computational approach adopted in this study is able to find similarly arranged sites in unrelated proteins, which could be an advantage when there are limited numbers of homologous structural models to be used for comparison of binding sites. In addition, the selection of matches to proteins with less than 30% sequence similarity could be indicative of functional differences, and thus of potentially distinct pathways to which the bound drugs could be repurposed.

The sites identified in the SARS-CoV-2 proteins were then docked with their corresponding drug compounds derived from the protein-drug complex data. Molecular docking runs resulted in the identification of several poses with docking scores ranging from -6.0 kcal/mol to -17.6 kcal/mol, which are congruent with the results of the Drug ReposER searches (Table 1, Figure 1). Of these 22 potential interactions, six have been reported in other studies [18] [19] [20].

The sub-structural similarity searches carried out revealed that six of the nine analysed SARS-CoV-2 proteins contain multiple potential alternative binding sites for different compounds.
For example, the nucleocapsid protein contains potential sites for losartan, ritonavir, darunavir and aspirin (

The HIV protease inhibitors darunavir (017), ritonavir (RIT) and lopinavir (AB1) inhibit the HIV aspartyl protease and prevent the cleavage of Gag and Pol proteins into their subsequent protein components [22]. The potential antiviral activity of such inhibitors against coronaviruses had been previously studied; nelfinavir, for example, had been reported to inhibit the replication of SARS-CoV and prevent cytopathic effects [23]. Lopinavir and ritonavir had been shown to improve clinical outcomes from SARS-CoV infections and are hypothesized to bind to the 3-chymotrypsin-like protein (3CLpro) or main protease [24]. Our analysis also demonstrated the potential ability of lopinavir (PDBID: 2qhc) to bind to ADP ribose phosphatase (PDBID: 6w02) with a docking score of -17.6 kcal/mol at a position close to the known substrate binding site (Figure 1A).

We found that the NSP10-16 complex (PDBID: 6w75) may potentially bind to ritonavir (RIT) in a manner similar to that observed in the HIV protease (PDBID: 1sh9), while ADP ribose phosphatase (PDBID: 6w02) could potentially bind to lopinavir (PDBID: 2qhc) with a high docking score (-17.6 kcal/mol) (Figure 1A). A potential site for folic acid (FOL) binding that is similar to the arrangement found in dihydrofolate reductase (PDBID: 4i13) was also found at the interaction site between domain III (residues 201-303) of two monomers, where the dimerization that is crucial for protease function takes place (PDBID: 6y2g). We also found that the nucleocapsid might bind to losartan, darunavir and aspirin at the dimerization site between two monomers in a similar manner to the SARS-CoV-2 main protease.
The Drug ReposER searches also identified similarly arranged sites between the indomethacin-bound prostaglandin E synthase 2 (PDBID: 1z9h) and the NSP10 protein (PDBID: 6w4h) (Figure 1B). An arrangement of amino acid residues that make up the indomethacin binding site in cyclooxygenase-2 (COX-2), also known as prostaglandin synthase 2 (PDBID: 4cox), was also found to be similar to residues in the vicinity of docked indomethacin in the NSP7-NSP8 complex (PDBID: 6yhu) (

response. In this context, the binding of indomethacin to these protein structures (NSP7/NSP8 or NSP10) may also prevent potential inflammatory events. The same mechanism could be adopted by other NSAIDs like naproxen (NPS), which might recognize similar sites from COX-2 (PDBID: 3nt1) in the ADP ribose phosphatase (PDBID: 6w6y), as indicated by the sub-structural similarity we have uncovered. These sub-structural similarities to a known indomethacin binding site may explain the mechanism for studies that have reported the ability of NSAIDs to bind to SARS-CoV-2 proteins [29], although the atomic level details of such interactions have not yet been reported.

3.2. ADP ribose phosphatase of NSP3 as potential target in SARS-CoV-2

Our analysis revealed that the ADP ribose phosphatase of NSP3 from SARS-CoV-2 has the largest number of 3D residue arrangements that are similar to the binding sites in known drug targets compared to other SARS-CoV-2 proteins (Table 1, Figure 2). All the identified sites are within the substrate binding sites, with the docking scores for the different poses ranging from -6.7 to -17.0 kcal/mol. In this case, the known ADP ribose phosphatase-APR complex was used as a control to obtain reasonable docking scores that could be considered acceptable based on predicted binding poses between the ADP ribose phosphatase and the substrate, APR.
The molecular docking with energy minimization steps resulted in several binding poses with docking scores ranging from -7.9 to -9.8 kcal/mol, with all sites located within the actual binding site for APR.

The ADP ribose phosphatase of non-structural protein 3 (NSP3) is likely to be targeted by anti-retrovirals and several other drugs more than any other SARS-CoV-2 structure, particularly at the active site of the structure (Figure 2A). This finding is in agreement with recent computational screening for the drug binding ability of SARS-CoV-2 proteins, which highlighted the promiscuity of NSP3 in binding to other molecules at the ADP ribose binding site [21,28]. The de-ADP ribosylation activity of NSP3 suppresses the expression of host innate immunity genes such as interferon and interleukin related genes [30]. Disruption of NSP3 function will allow the host immune system to respond normally to the infection.

Sub-structural similarity searches and molecular docking runs have revealed potential binding sites on the ADP ribosylation site of NSP3 (PDBID: 6w02) for darunavir (017), which originally targeted HIV protease (PDBID: 6dh3), as well as for chloroquine (CLQ), which originally targeted quinone reductase 2 (PDBID: 4fgl) and is indicated for malaria and rheumatoid arthritis (Table 1, Figure 2). Despite the similarity of these sites in terms of their 3D arrangements, their molecular functions are unlikely to be related.

The docking results indicate that HIV protease inhibitors and NSAIDs are among the existing drugs that could potentially be repositioned against ADP ribose phosphatase and several non-structural proteins for treatment of COVID-19.
The similarly arranged residue patterns observed between the binding poses in SARS-CoV-2 proteins from docking simulations and those from available drug-bound protein complexes allow us to infer the similarities of the binding mechanisms shared by these proteins despite the lack of sequence similarities.

3.3. Potential off-targets of approved drugs proposed for COVID-19

The binding of drug compounds to off-target sites in proteins other than their intended targets can lead to unexpected pharmacological outcomes, including the activation or disruption of molecular functions that cause adverse effects or other unexpected conditions [11,31]. However, off-target effects are not necessarily negative, and it is this same concept that is in use to repurpose approved compounds for alternative indications based on the availability of similar binding sites shared among proteins involved in distinct disease pathways [11, 31]. We deployed the same substructure searching methodology to identify off-target sites for the drugs being explored as COVID-19 treatments.

An ASSAM search of the human protein structures in the PDB using the drug binding sites we have identified was used as a means to investigate whether the use of these drugs could alter other pathways. The searches led us to a compilation of potential off-target sites and/or effects for eleven approved compounds (Table 2).

The substructural similarity searches for potential off-target sites in human proteins using the Drug ReposER application were able to identify several proteins that have similar geometry to the binding site of a drug proposed for repositioning against SARS-CoV-2 targets (Section 3.3.2). The same data also allowed us to compile potential repurposing opportunities of these drugs for other indications including COVID-19 (Table 2 and Table 3).
Non-homologous proteins that share similarly arranged sites for a particular drug molecule are more likely to be considered as off-targets because they may have different molecular functions and are involved in distinct pathways that may not be associated with the original target disease. Recent computational studies have proposed several HIV protease inhibitors [18, 20, 21], NSAIDs [29], and losartan [41] as potential therapeutic agents for COVID-19. Although we can confirm the presence of potential binding sites for these drugs on SARS-CoV-2 proteins, we were also able to identify potential off-target sites where these drugs may alternatively bind in the structures of human proteins (Table 3).

(Table 3). Peripheral neuropathy due to the neurotoxicity of HIV protease inhibitors has been reported as a complication resulting from anti-HIV treatment [53]. One potential off-target protein that might cause such symptoms is the HERC2 protein (PDBID: 3kci). Disruption of this protein causes a reduction of E6AP activity that has been implicated in neurodevelopmental disorders such as Angelman syndrome and autism [57].

Losartan targets the angiotensin type II receptor; however, it may also bind to the drug metabolizing cytochrome P450 (PDBID: 5x24) that has a similarly arranged site in ceruloplasmin (PDBID: 1kcw) (Table 3, Figure 3B). Ceruloplasmin has been implicated in Parkinson's disease, where disruption of the oxidative activity of ceruloplasmin causes increased iron levels in the brain that are correlated with Parkinson's [47, 56]. On the other hand, it was also reported that losartan could be useful for Parkinson's, where it might be able to reduce oxidative stress and neurodegeneration [58], thus warranting further investigations regarding the neuroprotective benefits of losartan.

The functional inhibition of certain off-target proteins may provide coincidental antiviral effects (Table 3).
Other than potentially targeting the SARS-CoV-2 proteins, NSAIDs such as naproxen (NPS), indomethacin (IMN) and aspirin (AIN) may also interact with host proteins involved in mounting the defense against viral infections. For example, we found that naproxen might be able to bind polypyrimidine tract-binding protein 1 (PTBP1) (PDBID: 1qm9) based on the similarity of the binding site for naproxen in serum albumin (PDBID: 4po0) (Table 4, Figure 3C).

The PTBP1 protein had been shown to activate the replication of picornaviruses and coronaviruses through binding to its RNA binding domain [49, 59]; thus, binding of naproxen to this site could potentially block viral replication. Other NSAIDs such as indomethacin and aspirin might also induce antiviral properties by binding to myeloperoxidase, which is a part of the host defense system (Table 3). The protein acts as a tissue damage factor that induces secondary bacterial lung infections causing the acute respiratory distress syndrome seen in influenza [51]. Decreased function of myeloperoxidase had been shown to potentially decrease inflammatory damage and lung viral load [51].

SARS-CoV-2 proteins that may share a similar fold to human proteins were also considered as potential off-targets. In this case, the retrieval of a human protein with more than 30% sequence identity by a blastp alignment of a SARS-CoV-2 protein is a possible indication of fold similarity. These protein structures were then analyzed to ascertain whether they contained a similar sub-structure arrangement to the SARS-CoV-2 protein that is being targeted for drug repositioning (Table 4).

Table 4. Human proteins with more than 30% sequence identity to SARS-CoV-2 proteins retrieved by a blastp search of the PDB database.

The SARS-CoV-2 ADP-ribose phosphatase from NSP3 has a similar sequence to human ADP-ribose binding protein and both share a similar mechanism of ADP ribose binding (Figure 4).
The SARS-CoV-2 spike protein was found to have sequence similarities to the IRAP protein, with both being associated with the renin-angiotensin pathway. No human sequences with possible fold similarities were detected for the main protease and non-structural proteins that include NSP7, NSP8 and NSP10, which are conserved in viruses.

The binding of approved drugs or inhibitors to interleukin 17a and mineralocorticoid receptors, which have similar sequences to the nucleocapsid and NSP15 respectively, could prevent inflammation by the immune response [65,66], a known complication of COVID-19 (Table 3). This would mean that such drugs could target both the virus and the host in parallel with potentially therapeutic results.

3.4. Other compounds with potential as COVID-19 therapeutics based on ligand structure similarity

It is known that similar drugs may require a similar binding environment and can have similar inhibitory effects, thus making it possible that a target protein can interact with a set of drug molecules with similar structures [67]. With this premise, the structures of the drugs proposed for repositioning against SARS-CoV-2 targets (proposed drugs) were used as a reference point to find other drug molecules with similar structures (matched drugs). This was carried out using the ligand search interface in the PDB that is based on the comparison of pharmacophores. The search identified 6 matches with similar structures to the input queries: quinacrine, vardenafil, lenalidomide, pomalidomide, amprenavir and methotrexate (Table 5). With the exceptions of methotrexate, which has structural similarities to folic acid (ClinicalTrials.gov IDs: NCT04352465 and NCT04434118), and lenalidomide, which is related to thalidomide (ClinicalTrials.gov ID: NCT04361643), none of these compounds are involved in any known clinical trials for COVID-19 at the time of writing.
Molecular docking targeting the SARS-CoV-2 proteins using both the proposed and matched drugs (shared SARS-CoV-2 protein targets) resulted in several binding poses that are indicative that the matched drugs can potentially bind to SARS-CoV-2 proteins in a similar manner to the proposed drugs (Table 5, Figure 5).

Our analyses found that both darunavir and amprenavir can potentially bind to the same SARS-CoV-2 site (P125, G130, I131, V155, and D157) in NSP3 (Figure 4E, Table 5). Darunavir, when docked on NSP3, has a molecular binding affinity of -9.4 kcal/mol. Amprenavir, when docked at the similar site (Figure 4E), also has a molecular binding affinity of -9.4 kcal/mol (Table 5).

Some structurally similar drug molecules are intended for similar indications. Both darunavir and amprenavir (Figure 4E) are protease inhibitors that have been used for the treatment of HIV. However, amprenavir is useful against infections that exhibit resistance to other protease inhibitors used in HIV treatment [68]. Thus, amprenavir might confer an advantage in a scenario where the protein target from SARS-CoV-2 develops resistance towards darunavir. Both chloroquine and quinacrine (Figure 4A) have been indicated for the treatment of systemic lupus erythematosus as well as other diseases [69].

A study comparing the oculotoxicity of chloroquine and quinacrine in the management of lupus erythematosus found that quinacrine exhibits less oculotoxicity compared to chloroquine if taken at low doses [70]. Thus, quinacrine might be a less toxic alternative compared to chloroquine with regard to any ophthalmologic side effects.

Sildenafil and vardenafil (Figure 4D)

Vardenafil, amprenavir and methotrexate had been reported to potentially bind to SARS-CoV-2 proteins through structural analyses [19, 78].
To our knowledge, the potential use of quinacrine, lenalidomide, and pomalidomide for COVID-19 has not been reported elsewhere in the context of binding ability through structural analyses. Furthermore, the finding that quinacrine is a readily available compound that has yet to be explored or proposed for COVID-19 is novel to this work. Should the current candidate drug molecules proposed for COVID-19 clinical trials fail at any stage of the process, these structurally similar drug molecules can be investigated as potential alternatives. It is not unexpected that these structurally similar compounds could be used in concert as a cocktail for more effective therapy [79].

indicate regions containing residues within 4.0 Å of docked drug molecules. The orange shaded areas indicate regions containing residues that form the binding sites identified by Drug ReposER.

3.5 Distinction from other COVID-19 drug repurposing efforts and future directions

In this work, all drugs that have been proposed for clinical trials were analyzed using the Drug ReposER pipeline to find their potential binding sites in any SARS-CoV-2 protein by virtue of having similar 3D arrangements of amino acid residues to the known target sites. It is not unexpected that our results will overlap or have parallels with the outcomes of other studies that have been recently published or are ongoing. However, the results presented here and in the COVID-19 Drug ReposER resource will also provide the relevant supporting insights regarding why or how a particular drug may be effective while, at the same time, having the added advantage of presenting the potential capacity for off-target interactions that may cause or explain any side effects upon administration.

The human proteins based on the analysis of association networks between human and SARS-CoV-2 proteins [6].
In comparison to our analyses, there are two drugs that overlap with our results: chloroquine (targeting sigma1-receptor:NSP6) and indomethacin (targeting PTGES2:NSP7).

This study was intended to develop a pipeline to identify drug compounds that could be repositioned against SARS-CoV-2 targets using the available structural information in the PDB. The pipeline was also able to identify potential side effects or toxicity associated with those compounds arising from off-target binding. Integrating the data with pharmacophore matching tools allowed the identification of other similarly structured drug compounds that also had the potential to be repositioned against SARS-CoV-2 targets. The information derived from such analyses could be used to prioritize downstream experimental validation and assays. This study does not provide any experimental evidence validating the binding of the proposed repositioned drugs to SARS-CoV-2 proteins. The results of this study should not be regarded as an explicit treatment recommendation or protocol for COVID-19.

A limited set of existing drugs extracted from lists of those currently undergoing or planned for COVID-19 trials was used in this work. The analyses reported only utilized data of compounds that were structurally present as a standalone ligand in the PDB. Both these factors restricted the number of potential candidates that could be proposed for repurposing. Despite these limitations, our analyses yielded 22 target sites for repurposing, of which only 6 had been mentioned in other studies. It is clear that the work reported here could be extended to include all known drug binding sites in the PDB. Although the current targets for repositioning in this study include only SARS-CoV-2 proteins, the pipeline can be integrated with network analysis methods to identify human proteins that could also yield therapeutic effects for COVID-19.
Furthermore, this study can also be extended to include other SARS-CoV-2 structures as and when they become available. Such data will be updated via the specific Drug ReposER resource for COVID-19.

The fastest and safest route to providing drug treatments for COVID-19 would be to reposition approved compounds against targets from this newly described disease. At the time of writing, the search for effective COVID-19 treatments is still ongoing. Despite being subject to the availability of associated protein coordinate structure data in the PDB, the use of amino acid 3D side chain based substructure comparisons has proven to be a feasible means of identifying candidate compounds to be repositioned for COVID-19. Our analyses yielded 22 potential sites in SARS-CoV-2 proteins and 16 drug compounds that could be repurposed for COVID-19. It is clear that the use of structural data from the PDB is able to provide high quality mechanistic level details for strategizing the selection of candidate compounds to be repurposed. The capacity to identify not only new target sites but also potential off-target sites provides a deeper level of context for the decision making process to safely proceed with exploring specific compounds to be repurposed for the new disease.

Acknowledgments

We thank the Malaysia Genome Institute for the use of computational resources.
This

The authors have no affiliation with any organization with a direct or indirect financial interest in the subject matter discussed in the manuscript.

Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China
The species Severe acute respiratory syndrome-related coronavirus: classifying 2019-nCoV and naming it SARS
Recent progress and challenges in drug development against COVID-19 coronavirus (SARS-CoV-2) - an update on the status
Protein Data Bank: The single global archive for 3D macromolecular structure data
Network-based drug repurposing for novel coronavirus 2019-nCoV/SARS-CoV-2
A SARS-CoV-2 protein interaction map reveals targets for drug repurposing
SPRITE and ASSAM: web servers for side chain 3D-motif searching in protein structures
IMAAAGINE: a webserver for searching hypothetical 3D amino acid side chain arrangements in the Protein Data Bank
Drug ReposER: a web server for predicting similar amino acid arrangements to known drug binding interfaces for potential drug repositioning
Drug repositioning or target repositioning: A structural perspective of drug-target-indication relationship for available repurposed drugs
Drug Promiscuity in PDB: Protein Binding Site Similarity Is Key
CD-HIT Suite: a web server for clustering and comparing biological sequences
Firdaus-Raih M. SPRITE and ASSAM: web servers for side chain 3D-motif searching in protein structures
AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading
UCSF Chimera - A Visualization System for Exploratory Research and Analysis
UniProtKB/Swiss-Prot
Trial Reporting in ClinicalTrials.gov - The Final Rule
Structure based drug discovery by virtual screening of 3699 compounds against the crystal structures of six key SARS-CoV-2 proteins 2020
Computational View toward the Inhibition of SARS-CoV-2 Spike Glycoprotein and the 3CL Protease
Potential Drugs Targeting Nsp16 Protein May Corroborates a Promising Approach to Combat SARSCoV-2 Virus 2020
Nature to Nurture-Identifying Phytochemicals from Indian Medicinal Plants as Prophylactic Medicine by Rational Screening to Be Potent Against Multiple Drug Targets of SARS-CoV-2 2020
Therapeutic options for the 2019 novel coronavirus (2019-nCoV)
HIV protease inhibitor nelfinavir inhibits replication of SARS-associated coronavirus
Coronaviruses - drug discovery and therapeutic options
Coronavirus Nsp10, a critical co-factor for activation of multiple replicative enzymes
Structure of the SARS-CoV nsp12 polymerase bound to nsp7 and nsp8 co-factors
Indomethacin and resveratrol as potential treatment adjuncts for SARS-CoV-2/COVID-19
Inhibition of cyclooxygenase 2 blocks human cytomegalovirus replication
Repurposing naproxen as a potential antiviral agent against SARS-CoV-2 2020
The conserved coronavirus macrodomain promotes virulence and suppresses the innate immune response during severe acute respiratory syndrome coronavirus infection
Large-scale detection of drug off-targets: hypotheses for drug repurposing and understanding side-effects
Oxidative stress in malaria parasite-infected erythrocytes: Host-parasite interactions
4-Acyl pyrroles: Mimicking acetylated lysines in histone code reading
An engineered lipocalin that tightly complexes the plant poison colchicine for use as antidote and in bioanalytical applications
PDE5 inhibitors - pharmacology and clinical applications 20 years after sildenafil discovery
Folate and antifolate pharmacology
Inhibitors and Structure-Activity Relationships of Human Cytochrome P450 2C9 and Implications in Drug Development
Structural Basis for Catalytic and Inhibitory Mechanisms of Human Prostaglandin Reductase PTGR2
Mechanisms of peroxisome proliferator activated receptor γ regulation by non-steroidal anti-inflammatory drugs
Inhibition of Viral Macrodomain of COVID-19 and Human TRPM2 by losartan
Tay-Sachs disease mutations in HEXA target the α chain of hexosaminidase A to endoplasmic reticulum-associated degradation
Perspectives on progressive strategies and recent trends in the production of recombinant human factor VIII
Down-regulation of the tumor suppressor gene C-terminal Src kinase: An early event during premalignant colonic epithelial hyperproliferation
Different origin of adipogenic stem cells influences the response to antiretroviral drugs
Deficiency in Serine Protease Inhibitor Neuroserpin Exacerbates Ischemic Brain Injury by Increased
Ceruloplasmin and iron in Alzheimer's disease and Parkinson's disease: A synopsis of recent studies
Selenocysteine Lyase-Mediated Selenium Recycling Pathway Leads to Metabolic Syndrome in Mice
Activation of picornaviral IRESs by PTB shows differential dependence on each PTB RNA-binding domain
Contribution of neutrophil-derived myeloperoxidase in the early phase of fulminant acute respiratory distress syndrome induced by influenza virus infection
Influenza virus infection causes neutrophil dysfunction through reduced G-CSF production and an increased risk of secondary bacteria infection in the lung
Perturbation of intracellular cholesterol and fatty acid homeostasis during flavivirus infections
Neurologic complications of HIV-1 infection and its treatment in the era of antiretroviral therapy
A comparison of neurodegeneration linked with neuroinflammation in different brain areas of rats after intracerebroventricular colchicine injection
Risk of stroke associated with nonsteroidal anti-inflammatory drugs. Vasc Health
A case of Parkinsonism worsened by losartan: A probable new adverse effect
Proteomic investigations of human HERC2 mutants: Insights into the pathobiology of a neurodevelopmental disorder
Importance of the Brain Angiotensin System in Parkinson's Disease
Binding Protein Affects Coronavirus RNA Accumulation Levels and Relocalizes Viral RNAs to Novel Cytoplasmic Domains Different from Replication-Transcription Sites
PARP9 and PARP14 cross-regulate macrophage activation via STAT1 ADP-ribosylation
AT4) Receptor Is the Enzyme Insulin-regulated Aminopeptidase
EphrinB3 are critical for its use as an alternative receptor for Nipah virus
Altered inhibitory synapses in de novo GABRA5 and GABRA1 mutations associated with early onset epileptic encephalopathies
Brain Penetrant p38αMAPK Inhibitor Candidate for Neurologic and Neuropsychiatric Disorders That Attenuates Neuroinflammation and Cognitive Dysfunction
The Role of the Mineralocorticoid Receptor in Inflammation: Focus on Kidney and Vasculature
Interleukin-17A (IL-17A), a key molecule of innate and adaptive immunity, and its potential involvement in COVID-19-related thrombotic and vascular mechanisms
Advances in the development of shape similarity methods and their application in drug discovery
Pharmacology and clinical experience with amprenavir
Current and Future Use of Chloroquine and Hydroxychloroquine in Infectious, Immune, Neoplastic, and Neurological Diseases: A Mini-Review
Antimalarial Therapy for Lupus Erythematosus: An Apparent Advantage of Quinacrine
Vardenafil for treatment of men with erectile dysfunction: efficacy and safety in a randomized, double-blind, placebo-controlled trial
Comparison of clinical trials with sildenafil, vardenafil and tadalafil in erectile dysfunction
Pomalidomide and its clinical potential for relapsed or refractory multiple myeloma: An update for the hematologist
Lenalidomide and thalidomide: Mechanisms of action - Similarities and differences
Molecular basis for pharmacokinetics and pharmacodynamics of methotrexate in rheumatoid arthritis therapy
New Approaches to Cancer Chemotherapy with Methotrexate
Adverse drug reaction to methotrexate: pharmacogenetic origin
Potential covalent drugs targeting the main protease of the SARS-CoV-2 coronavirus
Nanoparticles for combination drug therapy