key: cord-0879897-ctzgblrq
authors: Zhou, Yadi; Wang, Fei; Tang, Jian; Nussinov, Ruth; Cheng, Feixiong
title: Artificial intelligence in COVID-19 drug repurposing
date: 2020-09-18
journal: Lancet Digit Health
DOI: 10.1016/s2589-7500(20)30192-8
sha: d1a25588ed177489da5570d49489800b6714ae6c
doc_id: 879897
cord_uid: ctzgblrq
Drug repurposing or repositioning is a technique whereby existing drugs are used to treat emerging and challenging diseases, including COVID-19. Drug repurposing has become a promising approach because of the opportunity for reduced development timelines and overall costs. In the big data era, artificial intelligence (AI) and network medicine offer cutting-edge application of information science to defining disease, medicine, therapeutics, and identifying targets with the least error. In this Review, we introduce guidelines on how to use AI for accelerating drug repurposing or repositioning, for which AI approaches are not just formidable but are also necessary. We discuss how to use AI models in precision medicine, and as an example, how AI models can accelerate COVID-19 drug repurposing. Rapidly developing, powerful, and innovative AI and network medicine technologies can expedite therapeutic development. This Review provides a strong rationale for using AI-based assistive tools for drug repurposing medications for human disease, including during the COVID-19 pandemic.
The artificial intelligence (AI) pioneers of the 1950s foresaw building machines that could sense, reason, and think like people, a proof-of-concept known as general AI. 1 The rapid growth in computing power and memory storage, an unprecedented wealth of data, and the development of advanced algorithms have led to substantial breakthroughs in AI. AI applications cover diverse fields, such as computer vision, voice recognition, natural language understanding, and digital pathology data analysis.
Similarly, AI has been revolutionising drug discovery by extracting hidden patterns and evidence from biomedical data. Pharmaceutical companies and start-ups have used AI for drug discovery and development. 2 For example, IBM's Watson Health platform searches for drugs from vast amounts of textual data, including laboratory data, clinical reports, and scientific publications. 3 In this Review, we focus on AI technologies for a specific domain in drug discovery, that of drug repurposing, which offers rapid and cost-effective solutions for therapeutic development. These merits are especially clear in the COVID-19 global pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), where de-novo drug discovery is almost infeasible (figure 1). Thus, the pandemic is a good opportunity for introducing advanced AI algorithms combined with network medicine for drug repurposing. One study estimated that pharmaceutical companies spent US$2·6 billion in 2015, up from $802 million in 2003, for the development of a new chemical entity approved by the US Food and Drug Administration (FDA). 4 The increasing cost of drug development is due to the large volume of compounds to be tested in preclinical stages and the high proportion of randomised controlled trials (RCTs) that find no clinical benefit or reveal toxicity issues. Given the high attrition rates, substantial costs, and low pace of de-novo drug discovery, exploiting known drugs can help improve their efficacy while minimising side-effects in clinical trials. As Nobel Prize-winning pharmacologist Sir James Black said, "The most fruitful basis for the discovery of a new drug is to start with an old drug". 5 Drug repurposing, also termed drug repositioning, reprofiling or re-tasking, is a strategy for identifying new indications for approved or investigational (including clinically failed) drugs, beyond those for which they were originally approved or intended (panel).
Because the safety of these drugs has already been tested in clinical trials for other applications, repurposing known drugs can bring medications to patients much faster and with less cost than that of developing new drugs. For decades, academic institutions and science funders have championed the idea that screening libraries of existing drugs with various tests could uncover new applications, and have made observations that have led to medicines designed for one disease finding uses in another. Well-known examples include sildenafil citrate for erectile dysfunction, 6 thalidomide for multiple myeloma, 7 and remdesivir for treatment of COVID-19. 8 Indeed, the increasing number of repositioned medications led to the idea that a systematic (hypothesis-free) screen of all known drugs might uncover additional compatible targets. The strategy of drug repurposing is a powerful solution for emerging diseases, 9 such as COVID-19. Yet, without foreknowledge of the complete drug-target network, development of promising and affordable approaches for effective treatment of complex diseases is challenging. 10 Because drug targets do not operate in isolation from the complex system of proteins that comprise the molecular machinery of the cells with which they associate, each drug-target interaction (panel) should be examined in an integrative context (figure 2). 11 Therapeutic interventions need to consider the perturbation of disease system properties (termed network medicine [panel]), and have little to do, functionally speaking, with genetic and genomic events alone. 12 Observations and advances in network medicine further indicate that perturbations of cellular systems and the human interactome (panel) underlie the disease, which is the essence of drug discovery and development.
12 Knowledge of the interplay between drug targets and human diseases can provide clues for possible drug repurposing because drugs that target one disease might target another through a shared functional protein-protein interaction network. 11 For example, SARS-CoV-2 requires host cellular factors (such as angiotensin I converting enzyme 2 [ACE2], transmembrane serine protease 2 [TMPRSS2], and furin; figure 1) for successful replication during infection. 13, 14 Systematic targeting of the viral protein and host protein interactions (the SARS-CoV-2 interactome) offers a novel strategy for effective drug repurposing for COVID-19. A SARS-CoV-2 virus-host interactome that contains 332 high-confidence protein-protein interactions between 26 viral proteins and human proteins was constructed using affinity purification mass spectrometry. 15 In total, 69 drug candidates that can target the host proteins in the SARS-CoV-2-host interactome were prioritised. 15 Experimental assays validated the antiviral activities of two sets of agents: mRNA translation inhibitors (eg, zotatifin) and sigma-1 and sigma-2 receptor regulators (eg, haloperidol). In another study, 16 a network-based methodology that quantifies the interplay between the virus-host interactome and drug targets in the human interactome network suggested 16 repurposed drug candidates for potential treatment of COVID-19. 16 This finding calls for a detailed approach, including AI and network medicine, and raises the question of not only which protocols to consider, but also which factors to scrutinise, and broadly, how to integrate the disciplines (figure 2). Deep learning is a subfield of machine learning that refers to the paradigm of exploring the data with layers of linear and non-linear transformations organised in a hierarchical way.
17 The most widely used deep learning model is artificial neural networks, wherein the basic building block is an artificial neuron that non-linearly transforms the weighted sum of input feature variables. A fully connected feedforward neural network (FNN) is an architecture in which the artificial neurons are connected layer-by-layer from input features to output targets. A weight is associated with each connection and is optimised by minimising the prediction loss of the output targets through backpropagation on training samples. 18 FNNs are typically used for data samples represented as vectors. For example, Aliper and colleagues 19 used FNN to classify drugs into pharmaceutical therapeutic classes based on the drugs' transcriptomic profile vectors. Lenselink and colleagues 20 compared the performance of a diverse set of algorithms on the prediction of molecule and target activity with the ChEMBL database. 20 The authors showed that the inclusion of target data can lead to better models. FNN can achieve better performance than that of conventional machine learning methods, such as logistic regression. In the case of images being the input where each pixel is a feature variable, FNNs become infeasible as the number of weights becomes far too large. However, convolutional neural network (CNN; panel) is particularly suitable for image processing. Instead of fully connecting neurons in adjacent layers, CNN uses filters (small matrices of weights) that apply a convolution operation on local patches of the images, which greatly reduces the number of weights. CNN has been used to analyse chemical images to obtain insight into drug therapeutic functions. 21 For example, AtomNet predicts the binding affinity of small molecules to proteins on the basis of the structural information extracted by CNN. 22 Biological sequences are another widely explored type of data for drug repurposing.
However, neither FNN nor CNN appropriately considers the sequential nature of the data. Recurrent neural networks (RNNs) are specifically designed for sequences in which the main building block is a recurrent cell appearing at each timestamp or sequence location that retains past information while learning new information in a sequence. RNN models have been used for generating focused molecule libraries for drug discovery, with the molecules represented as sequences using simplified molecular input line entry system codes. 23 Gao and colleagues 24 developed a hybrid approach of graph neural network and RNN to predict drug-target interactions. Beck and colleagues 25 developed a hybrid CNN and RNN model called Molecule Transformer-Drug Target Interaction to predict whether any commercially available antiviral drugs could work against SARS-CoV-2. The authors computationally identified several known antiviral drugs, such as atazanavir, remdesivir, efavirenz, ritonavir, and dolutegravir, for the potential treatment of SARS-CoV-2 infection. A classic way to repurpose drugs is through network medicine, which includes the construction of medical knowledge graphs containing relationships between different kinds of medical entities (eg, diseases, drugs, and proteins) and predicts new links between existing approved drugs and diseases (eg, . Graph embedding methods, which represent nodes and edges as low-dimensional feature vectors, [27] [28] [29] [30] have been gaining attention for link prediction in graphs. 26 Using the feature vectors of drugs and diseases, we can easily measure their similarities and therefore identify effective drugs for a given disease. One challenge for the graph embedding method is scalability. Real-world (knowledge) graphs are usually large. The number of entities in a medical knowledge graph could be as many as several million.
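The similarity-based ranking described above can be illustrated with a minimal sketch. It assumes that drug and disease embeddings have already been learned by some graph embedding method; the three-dimensional vectors, the drug names, and the choice of cosine similarity are illustrative assumptions only, because published methods usually define their own scoring functions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_drugs(disease_vec, drug_vecs):
    """Return (drug, score) pairs sorted from most to least similar to the disease."""
    scores = {name: cosine(vec, disease_vec) for name, vec in drug_vecs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical embeddings; real ones would come from a trained graph embedding model.
drug_embeddings = {
    "drug_A": [0.9, 0.1, 0.0],
    "drug_B": [0.1, 0.8, 0.3],
    "drug_C": [-0.5, 0.2, 0.7],
}
disease_embedding = [1.0, 0.2, 0.1]

ranking = rank_drugs(disease_embedding, drug_embeddings)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Candidates at the top of such a ranking are hypotheses for experimental or clinical follow-up, not evidence of efficacy.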
Existing machine learning systems such as TensorFlow and PyTorch are mainly designed for data with regular structures but not for large-scale graphs. Therefore, several systems that are specifically designed for learning representations from large-scale graphs have been developed. For example, Zhu and colleagues 31 developed a high-performance system named GraphVite that could be promising for future drug repurposing because the system can efficiently process tens or even hundreds of millions of nodes. Increasing interest exists in developing graph representation learning techniques for drug repurposing. Sosa and colleagues 32 constructed a medical knowledge graph of drugs, diseases, genes, and proteins from the biomedical literature and used graph embedding techniques for predicting the links between drugs and diseases. Gysi and colleagues 33 developed a method that was based on graph neural network and presented a case study on SARS-CoV-2 with 81 potential repurposing candidates identified. BenevolentAI's knowledge graph is a large repository of structured medical information, including numerous connections extracted from the scientific literature by machine learning. 34 BenevolentAI predicted that baricitinib, a drug used to treat rheumatoid arthritis, could be a potential treatment for COVID-19 through the inhibition of AP2-associated protein kinase 1 (encoded by AAK1; figure 1).
Panel: Glossary
Drug-target network: A bipartite graph composed of approved drugs and proteins linked by drug-target binary associations.
Drug repurposing: A strategy for identifying new indications for approved or investigational (including clinically failed) drugs that have not been originally approved or dedicated (also termed drug repositioning, reprofiling or re-tasking).
Disease module: Represents a group of nodes (ie, proteins or genes) whose perturbation can be linked to a particular disease (eg, COVID-19) phenotype.
Systems pharmacology: An inter-discipline that applies systems biology principles and data science techniques in pharmacology.
Human interactome: The set of physical protein-protein interactions (the interactome) in human cells.
Network medicine: A discipline that seeks to redefine disease and therapeutics from an integrated perspective using systems biology and network science methodologies, offering important applications to drug design.
Node: The basic unit of graphs. Usually visualised as circles (or in other shapes), nodes represent basic entities, such as drugs and proteins.
Edge: A basic unit of graphs that connects two nodes. Usually visualised as lines (with arrows if directed), edges represent the relationships (eg, protein interaction) between the nodes.
Network proximity: Measures the distances between two modules, such as drug-target and disease-gene modules. Several proximity measures have been defined, such as shortest, closest, separation, kernel, and centre measures. 11
Artificial intelligence: The study of building machines or programmes that exhibit human intelligence in doing specific or general tasks.
Machine learning: A subset of AI algorithms that can learn from data, therefore removing the need for explicit instructions on how to do certain tasks.
Deep learning: A general term referring to multilayer neural networks.
Convolutional neural network: Neural network architectures specifically designed for analysing image data, which generally include multiple layers of convolutional layers and pooling layers.
Graph representation learning: Specific deep learning techniques that are developed for learning feature representations of graph structure data.
Visible neural network: A new generation of visible approaches that aim to guide the structure of machine learning models with an increasingly extensive knowledge of a biological mechanism.
A team constructed a comprehensive COVID-19 knowledge graph (termed CoV-KGE) that included 15 million edges across 39 types of relationships connecting drugs, diseases, proteins, genes, pathways, and expressions of genes and proteins 35 from a large scientific corpus of 24 million PubMed publications. Using Amazon Web Services' computing resources and graph representation learning techniques, the team identified 41 repurposed drug candidates (including dexamethasone and melatonin) for COVID-19 treatment. To achieve a high prediction performance, the construction of a high-quality medical knowledge graph is essential, which itself is a promising direction for future research. Several examples exist of repurposed drugs being or having been tested in clinical trials for COVID-19, including antiviral drugs and host-targeting therapies (figure 1). A detailed discussion of repurposed drugs for COVID-19 can be found in a review by Sanders and colleagues. 36 Remdesivir, a monophosphate prodrug of an active C-adenosine nucleoside triphosphate analogue, was originally discovered for the potential treatment of Ebola virus disease. 36 Remdesivir has shown promise in the treatment of COVID-19, prompting emergency use clearance from the FDA, although indication is limited to severe disease only. The FDA made this decision on the basis of early research showing that the drug might help speed up recovery for hospitalised patients with COVID-19. Mechanistically, remdesivir was shown to inhibit the viral RNA-dependent RNA polymerase (figure 1). 37 A double-blind, randomised, placebo-controlled trial of intravenous remdesivir in adults hospitalised with COVID-19 showed that remdesivir significantly shortens the median recovery time to 11 days, compared with 15 days in the placebo group. 8 These preliminary findings support the use of remdesivir for patients who are hospitalised with COVID-19 and require supplemental oxygen therapy.
However, another randomised, open-label, phase 3 trial involving hospitalised patients not requiring mechanical ventilation did not show a significant difference between a 5-day course and a 10-day course of remdesivir. 38 Further investigation of the clinical benefits of remdesivir for patients with COVID-19 in different patient subgroups with or without mechanical ventilation is needed to identify the shortest effective duration of therapy. Additionally, whether remdesivir can shorten the recovery course of individuals with early COVID-19 is unknown. A study using machine learning and statistical analysis approaches discovered that mefuparib (CVL218), a poly-ADP-ribose polymerase 1 inhibitor (figure 1), blocked SARS-CoV-2 replication without obvious toxic effects in vitro. 39 The antiviral activity of mefuparib is more potent at viral entry and similar at viral post entry compared with remdesivir, suggesting the drug to be a potential anti-SARS-CoV-2 drug candidate. Toremifene, a first-generation selective estrogen receptor modulator that is non-steroidal, was approved for the treatment of breast cancer in 1997. 40 A network medicine analysis identified toremifene as a top candidate for the treatment of COVID-19. 16 In vitro assays indicated that toremifene blocked various viral infections at micromolar concentrations, including Middle East Respiratory Syndrome coronavirus, 41 severe acute respiratory syndrome coronavirus, 42 and SARS-CoV-2. 43 A further computational biophysics study 44 suggested that toremifene might block interaction between ACE2 and the spike protein of SARS-CoV-2 and might inhibit nonstructural protein 14 of SARS-CoV-2 (figure 1), mechanistically supporting the drug's antiviral activities.
The mean plasma concentration of toremifene during administration of 60 mg per day was 0·88 mg/L (2·17 µM) in post-menopausal patients with breast cancer 40 and the peak plasma concentration (>10 µM) of toremifene (60 mg per day) was approximately three times the half-maximal inhibitory concentration against SARS-CoV-2 (3·58 µM). 43 In summary, toremifene, identified by CoV-KGE 35 and network medicine 16 approaches, offers a potential drug candidate to be tested in COVID-19 clinical trials. SARS-CoV-2 causes up-regulation of systemic inflammation, 45 in some cases culminating in a cytokine storm, underscoring the high potential for treatment success by using a drug targeting inflammation and immune response (including baricitinib, dexamethasone, and melatonin; figure 1). By combining findings based on network medicine and large-scale patient data analysis from the COVID-19 patient registry at the Cleveland Clinic, Cleveland, OH, USA, researchers found that melatonin intake was associated with a 50-60% reduced likelihood of a positive laboratory test result for SARS-CoV-2. 46 Dexamethasone is a glucocorticoid receptor (figure 1) agonist approved by the US FDA for a variety of inflammatory and autoimmune conditions. 47 Dexamethasone was identified as a top repurposed drug candidate by CoV-KGE. 35 The randomised trial 48 [0·72-0·94]). However, dexamethasone did not reduce mortality in COVID-19 patients not receiving respiratory support. Altogether, these data suggest that targeting excessive host inflammation by immune modulators or anti-inflammatory drugs offers a therapeutic strategy for severe COVID-19, warranting testing in large-scale RCTs. Monotherapies, including remdesivir 8,38 and hydroxychloroquine, 49 have shown little or no clinical benefit for patients with COVID-19.
Because the immune system plays key roles in the worsening health and death of patients with COVID-19, 50 combining inflammatory or immune modulators (ie, boosting host immunity) with antiviral drugs might offer an effective treatment for patients with COVID-19. Drug combinations, offering increased therapeutic efficacy and reduced toxicity, play an important role in treating infectious diseases, including COVID-19 (eg, remdesivir plus baricitinib [NCT04401579]). However, our ability to identify and validate effective combinations is limited by a huge increase in the number of possible drug pairs. Using a network-based methodology, 51 scientists identified three potential drug combinations for COVID-19, 16 including sirolimus plus dactinomycin, mercaptopurine plus melatonin, and toremifene plus emodin. These combinations are based on theoretical analysis using the interactome and have not been tested in preclinical or clinical studies. The same team further observed that combining melatonin and toremifene showed potential for use in the treatment of COVID-19. 52 The selective estrogen modulation and melatonin in early COVID-19 (SENTINEL; NCT04531748) trial is being done at the Cleveland Clinic to test the clinical efficacy of combining melatonin and toremifene therapy in patients with early COVID-19. 52 Using BenevolentAI's knowledge graph, 34 baricitinib was identified as a potential treatment for COVID-19. At least two phase 2 randomised, double-blind trials of baricitinib alone or as part of combination therapy with antiviral drugs (eg, remdesivir) are underway for patients with moderate and severe COVID-19 (NCT04373044 and NCT04401579). Another important aspect of using AI for drug repurposing is the use of real-world data, such as electronic health records, in searching for effective repurposed drug candidates.
Electronic health records are patient clinical data that are routinely collected, such as demographics, diagnoses, medications, procedures, and laboratory test results, stored in digital form, which can be exchanged and accessed securely. 53 Extensive discussions have taken place on leveraging real-world data for drug discovery and development. 54 On the one hand, patients in real-world data are more representative of patients who will receive the prescription when the drug is on the market than patients in RCTs are, who are enrolled on the basis of strict inclusion and exclusion criteria. On the other hand, typically treatment and control groups are required to precisely estimate the treatment effects. However, for certain scenarios, such as remdesivir trials for COVID-19, only a single treatment group is possible, which makes estimating the treatment effect difficult. 55 In this case, because of the inclusion of many diverse patients, real-world data contain rich information for synthesising a potential control group, which can then be compared with the treatment group in an RCT to help estimate the treatment effects. Despite the promises of real-world data, deriving insights from real-world data that are similar to those from RCTs is challenging because real-world data have higher dimensionality (including confounding factors), a broader population, and usually lower data quality compared with RCT data. The propensity score, which estimates the likelihood of a patient receiving the treatment from a set of potential confounding factors, typically using logistic regression, is the standard technique for patient matching. 56 However, estimating this likelihood is much more complicated in real-world data because of the associated challenges such as high dimensionality, longitudinality, irregularity, and incompleteness.
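As a minimal sketch of the propensity score matching described above, the following example fits a logistic regression by plain gradient descent on two hypothetical confounders and then pairs each treated patient with the nearest unmatched control within a caliper. The toy cohort, the function names, the learning rate, and the caliper of 0·2 are all assumptions made for illustration; a real analysis would use an established statistics package and many more covariates.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, t, lr=0.1, epochs=2000):
    """Fit weights and intercept of a logistic model by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, ti in zip(X, t):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - ti
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def propensity(X, w, b):
    """Estimated probability of receiving treatment for each patient."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) for x in X]

def match_nearest(scores, t, caliper=0.2):
    """Greedy 1:1 matching of treated to control patients without replacement."""
    controls = [j for j, tj in enumerate(t) if tj == 0]
    used, pairs = set(), []
    for i in (k for k, tk in enumerate(t) if tk == 1):
        best = None
        for j in controls:
            dist = abs(scores[i] - scores[j])
            if j not in used and dist <= caliper and (best is None or dist < best[1]):
                best = (j, dist)
        if best is not None:
            used.add(best[0])
            pairs.append((i, best[0]))
    return pairs

# Hypothetical cohort: [scaled age, comorbidity flag]; t = 1 if the patient was treated.
X = [[0.2, 0], [0.3, 0], [0.4, 1], [0.5, 0], [0.6, 1], [0.7, 1], [0.8, 1], [0.9, 0]]
t = [0, 0, 0, 1, 0, 1, 1, 1]

weights, bias = fit_logistic(X, t)
scores = propensity(X, weights, bias)
pairs = match_nearest(scores, t, caliper=0.2)
print(pairs)
```

Outcomes would then be compared within the matched pairs. Treated patients with no control inside the caliper are left unmatched, which trades sample size for better covariate balance.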
In these settings, advanced machine learning models can estimate propensity scores more accurately than traditional logistic regression-based propensity score matching approaches can. 57, 58 Moreover, other types of matching techniques, such as patient similarity analytics, 59 also hold promise in these complicated scenarios. The initiative to build national or international electronic health records repositories for COVID-19 research has been undertaken. One such repository is the international consortium 4CE, which includes the electronic health records of patients from 96 hospitals across five countries. All participants' electronic health records are matched to a common data model with Integrating Biology and the Bedside 60 or Observational Medical Outcomes Partnership (OMOP). 61 A retrospective cohort study of 1438 patients with laboratory-confirmed COVID-19 admitted to hospital in metropolitan New York, USA, revealed that treatment with hydroxychloroquine, azithromycin, or both, compared with neither treatment, was not significantly associated with differences in in-hospital mortality for patients with COVID-19. 49 From a COVID-19 registry of nearly 20 000 patients, including 1600 COVID-19-positive patients, from the Cleveland Clinic Health System electronic health records, and using a user-active comparator design and propensity score adjustment for confounding, melatonin usage was shown to be associated with a reduced likelihood of a positive SARS-CoV-2 test result by RT-PCR assay. 46 Mancia and colleagues 62 showed that angiotensin-converting enzyme inhibitors (ACEIs) or angiotensin II receptor blockers (ARBs) were not associated with the risk of COVID-19. An independent study using New York University Langone Health electronic health records revealed that the use of ACEIs or ARBs was not associated with an increased likelihood of a positive COVID-19 test or with increased COVID-19 severity.
63 RCTs are underway to test the clinical benefits of melatonin in patients with COVID-19 (NCT04409522 and NCT04353128). For decades, translational science has faced the challenge of how to translate research findings into new effective medicines and technologies that rapidly deliver the medicines. This challenge has encouraged basic and translational sciences to work together towards this pivotal aim. Generations of scientists have struggled to make headway in de novo drug discovery. In principle, a strategy involving drug repurposing, in which a drug has already been tested and approved by the US FDA, can overcome the barriers of de novo drug discovery. However, the volume of approved or clinically failed drugs is large, which makes it difficult to select the drug that would be highly effective for the disease in question. Despite the enthusiasm for drug repurposing in treating COVID-19, challenges remain. Cellular or animal assays might not reflect the host environment of the virus infection in humans. Also, repurposed drugs might have been optimised for a particular target, dosing, or tissue in the original indications. Rapid clinical tests of existing antiviral, antimalarial, and immunomodulatory drugs have been done or are underway against COVID-19. Many trials did not optimise the drug's clinical benefits and biological questions because of their expedient design, lack of clinical endpoints, small number of patients enrolled (thus lack of statistical power), and more. 64 For example, hydroxychloroquine shows potential anti-SARS-CoV-2 activities in in vitro assays. 65 However, hydroxychloroquine has shown very little or no efficacy in preclinical 66 and clinical trial studies. 49 A lack of reproducible preclinical animal models and gold-standard clinical outcome measures in COVID-19 trials might also result in some failures to find clinical benefits.
Tools and analyses with greater sensitivity are also required to detect differences between drugs and placebos, especially as more mildly affected patients with COVID-19 are included in trials. The presence of heterogeneous populations with different genetic backgrounds might also affect outcomes of clinical results. Possible factors contributing to these clinical trial findings that should be accounted for in future trials include targeting the wrong pathobiological or pathophysiological mechanisms in COVID-19; using drugs that do not engage with the intended target (including virus proteins and virus-host and protein-protein interactome); intervening at the wrong stage of the disease, including early, mild, moderate, and severe illnesses; lacking translatable pharmacodynamic and pharmacokinetic (eg, poor lung penetration) biomarkers; depending on in vitro antiviral activities and not using appropriate animal models with poor predictive efficacy; not addressing the rapid disease progression of COVID-19 in a short period; and not accurately monitoring the complexity of the clinical and biological characteristics to therapeutic intervention. Although AI-based drug repurposing is in the developmental stage, several examples have shown encouraging results, including baricitinib identified by BenevolentAI, 34 dexamethasone 48 predicted by CoV-KGE, 35 and melatonin from network medicine-based findings (figure 1). 46 The development of effective and robust in vitro and in vivo models can reduce the failure rate of drug repurposing between preclinical studies and clinical trials for COVID-19. 67, 68 Genotype-informed drug repurposing (termed personalised drug repurposing) might further improve the success rate of clinical trials.
69 Given the highly complex and regulated nature of drug development, a long-term vision is needed when developing AI applications in drug repurposing that could increase efficiency and effectiveness in the various processes involved, and reduce the barriers between the numerous research components in the ecosystem to create new therapy options. AI technologies, such as visible neural networks, 70 incorporate the AI model's inner workings into real systems of biomedical sciences (eg, human and animal). For example, visible machine learning approaches might guide model structures of data heterogeneity in the life sciences and translate patient data to successful therapies. 71 Biological systems are complex and hierarchical (figure 2), composed of multiple levels such as sequences, protein complexes, cells, tissues, organs, and organisms. Drug discovery is a complicated process involving multilevel interactions between chemical compounds and biological systems. Therefore, a potential way of building an effective and interpretable model of drug discovery is to enrich the biologically-inspired visible neural network model 72 with drug-related entities such as chemical compounds and diseases. The biomedical knowledge on how different entities interact with each other at different levels can be leveraged to guide the design of the corresponding computing modules. 35 Compared with current deep learning models which try to model the entire system with a complex model at once, this divide-and-conquer scheme models the different components in the complex system and how these components interact with each other in an explicit and transparent way. The model parameters can be optimised in an end-to-end way as in other deep learning models.
For more on 4CE see https://www.covidclinical.net
Data harmonisation refers to the process of standardising and integrating information from disparate sources to form a unified database.
Data harmonisation is a crucial step for guaranteeing that the machine learning-based models that are developed are widely applicable in different scenarios. Establishing a high-quality data model (which is a prerequisite for organising and standardising the data) is the foundation for the harmonisation process. National and international efforts aim to build common data models such as the national patient-centered clinical research network 72 and the observational health data science and informatics programme. 73 Fast healthcare interoperability resources 74 represent another type of standard, which defines how these data should be exchanged. In addition to data harmonisation, model harmonisation, which defines a unified standard for storing the computational models, is also an important aspect to enhance the generalisability and utility of the computational drug repurposing tools. The open neural network exchange (ONNX) is an example of such efforts aiming to build model exchanging standards that are interoperable. ONNX defines implemented models as an extensible acyclic graph model. Each node on the graph is a call to built-in operators with inputs and outputs defined using standard data types. With the enhanced availability of health-related data (especially patient data), concerns have been raised regarding data security and privacy. 75 For example, demographics and DNA sequencing data have an increased risk of making patients identifiable. Efforts should be made to scrutinise each stage in the data life cycle. For example, questions pertaining to what type of data will be collected; whether the data are necessary; who will collect the data; how the data will be used, stored, and transferred; what the rights are of the person whose data are being collected, and others, should be addressed carefully. Additionally, regulations and transparency are crucial for appropriate data collection and use, and so is an increased public awareness.
Towards this goal, federated learning 76 could be a promising direction. It trains algorithms across decentralised edge devices (eg, individual mobile phones) or servers hosting different local samples (eg, data owned by different institutions). Data samples are not shared or centralised and only the trained models are communicated, which might improve the security and privacy of patient data for drug-disease outcome validation in drug repurposing.

Advances in pharmacogenetics and pharmacogenomics indicate that disease treatment would be considerably improved if therapies were guided by an individual's genomic profile. This hypothesis has garnered initial success in some diseases, including cancer. 77 Responsiveness to a drug is influenced by genetic, epigenetic, and environmental factors. SARS-CoV-2 infection has shown large inter-individual variability, ranging from asymptomatic to severe and lethal disease. One possible hypothesis is that human genetics might determine clinical characteristics and drug responses. 69, 78 For example, analysis of approximately 81 000 genomes and exomes from the general population suggested that hydroxychloroquine or chloroquine might only work for TMPRSS2-absent patients who are infected by SARS-CoV-2. 69 An international team showed that hydroxychloroquine has antiviral activity in kidney cells of African green monkeys that lack TMPRSS2 expression (VeroE6) but not in a model of reconstituted human airway epithelium developed from primary nasal or bronchial cells. 67 Additionally, another team showed that chloroquine does not block SARS-CoV-2 infection of the TMPRSS2-positive lung cell line Calu-3. 68 These preliminary findings highlight the importance of pharmacogenomics studies in improving clinical benefits and the success rate of drug repurposing.
A COVID-19 host genetics initiative is underway to generate, share, and analyse data in a search for the genetic determinants of COVID-19 susceptibility, severity, and outcomes, and personalised treatment. AI techniques could therefore leverage massive genetic and genomic data to identify human genetic determinants of SARS-CoV-2 pathogenesis, which presents a unique opportunity for drug repurposing in precision medicine and personalised treatment for individuals with COVID-19 (figure 3).

The future of AI-informed drug repurposing

Selecting a drug from among the many approved ones, while avoiding time-consuming searches, can involve uncertainty. To date, AI's potential ability to identify new candidate therapies that can be made available for clinical trials rapidly and, if approved, merged into health care is unparalleled, making AI a centrepiece of advanced technologies. Because of this, AI is a promising method for accelerating drug repurposing for human diseases, especially emerging diseases such as COVID-19. With the availability of big data, including biological, clinical, and open data (scientific publications and data repositories), novel AI techniques capable of leveraging these large sets of biomedical data are in high demand. Pharmaceutical scientists, computer scientists, statisticians, and physicians are increasingly involved in developing and adopting AI-based technologies for the rapid development of therapeutics. AI approaches, coupled with big data, have the potential to substantially improve the efficiency and effectiveness of drug repurposing and to aid medical decision making on therapeutic benefits, using real-world evidence, for various complex human diseases, such as COVID-19 (figure 1) and Alzheimer's disease. 79 However, challenges remain in developing these AI tools, such as data heterogeneity and low quality, insufficient data sharing by pharmaceutical companies, and the security and interpretability of the models.
We expect future successful AI models for drug repurposing to be accurate in terms of the generated outcomes, integrative of disparate information types and sources, interoperable in diverse deployment settings, interpretable of internal working mechanisms, and robust to noise and adversarial attacks.

1 Computing machinery and intelligence
2 How artificial intelligence is changing drug discovery
3 AI-powered drug discovery captures pharma interest
4 The $2·6 billion pill - methodologic and policy considerations
5 New uses for old drugs
6 Sildenafil: from angina to erectile dysfunction to pulmonary hypertension and beyond
7 Hematology: thalidomide maintenance in multiple myeloma
8 Remdesivir for the treatment of Covid-19 - preliminary report
9 Systems biology-based investigation of cellular antiviral drug targets identified by gene-trap insertional mutagenesis
10 Target identification among known drugs by deep learning from heterogeneous networks
11 Network-based approach to prediction and population-based validation of in silico drug repurposing
12 Putting the patient back together - social medicine, network medicine, and the limits of reductionism
13 A multibasic cleavage site in the spike protein of SARS-CoV-2 is essential for infection of human lung cells
14 SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor
15 A SARS-CoV-2 protein interaction map reveals targets for drug repurposing
16 Network-based drug repurposing for novel coronavirus 2019-nCoV/SARS-CoV-2
17 Deep learning
18 Learning representations by back-propagating errors
19 Deep learning applications for predicting pharmacological properties of drugs and drug repurposing using transcriptomic data
20 Beyond the hype: deep neural networks outperform established methods using a ChEMBL bioactivity benchmark set
21 Learning drug functions from chemical structures with convolutional neural networks and random forests
22 AtomNet: a deep convolutional neural network for bioactivity prediction in structure-based drug discovery
23 Generating focused molecule libraries for drug discovery with recurrent neural networks
24 Interpretable drug target prediction using deep neural representation
25 Predicting commercially available antiviral drugs that may act on the novel coronavirus (SARS-CoV-2) through a drug-target interaction deep learning model
26 LINE: large-scale information network embedding
27 Translating embeddings for modeling multi-relational data
28 Complex embeddings for simple link prediction
29 Convolutional 2D knowledge graph embeddings
30 RotatE: knowledge graph embedding by relational rotation in complex space
31 GraphVite: a high-performance CPU-GPU hybrid system for node embedding
32 A literature-based knowledge graph embedding method for identifying drug repurposing opportunities in rare diseases
33 Network medicine framework for identifying drug repurposing opportunities for COVID-19
34 Baricitinib as potential treatment for 2019-nCoV acute respiratory disease
35 Repurpose open data to discover therapeutics for COVID-19 using deep learning
36 Pharmacologic treatments for coronavirus disease 2019 (COVID-19): a review
37 Structural basis for inhibition of the RNA-dependent RNA polymerase from SARS-CoV-2 by remdesivir
38 Remdesivir for 5 or 10 days in patients with severe Covid-19
39 A data-driven drug repositioning framework discovered a potential therapeutic agent targeting COVID-19
40 A review of its pharmacological properties and clinical efficacy in the management of advanced breast cancer
41 MERS-CoV pathogenesis and antiviral efficacy of licensed drugs in human monocyte-derived antigen-presenting cells
42 Repurposing of clinically developed drugs for treatment of Middle East respiratory syndrome coronavirus infection
43 Identification of antiviral drug candidates against SARS-CoV-2 from FDA-approved drugs
44 Repurposing of FDA-approved toremifene to treat COVID-19 by blocking the spike glycoprotein and NSP14 of SARS-CoV-2
45 Extrapulmonary manifestations of COVID-19
46 A network medicine approach to prediction and patient-based validation of disease manifestations and drug repurposing for COVID-19
47 Corticosteroids: mechanisms of action in health and disease
48 Dexamethasone in hospitalised patients with Covid-19 - preliminary report
49 Association of treatment with hydroxychloroquine or azithromycin with in-hospital mortality in patients with COVID-19 in New York State
50 The trinity of COVID-19: immunity, inflammation and intervention
51 Network-based prediction of drug combinations
52 COVID-19 treatment: combining anti-inflammatory and antiviral therapeutics using a network-based approach
53 Definition, structure, content, use and impacts of electronic health records: a review of the research literature
54 Real-world evidence - what is it and what can it tell us?
55 Randomized clinical trials and COVID-19: managing expectations
56 The central role of the propensity score in observational studies for causal effects
57 Improving propensity score weighting using machine learning
58 Combining machine learning and propensity score weighting to estimate causal effects in multivalued treatments
59 Supervised patient similarity measure of heterogeneous patient records
60 Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2)
61 Advancing the science for active surveillance: rationale and design for the Observational Medical Outcomes Partnership
62 Renin-angiotensin-aldosterone system blockers and the risk of Covid-19
63 Renin-angiotensin-aldosterone system inhibitors and risk of Covid-19
64 Characteristics of registered clinical trials assessing treatments for COVID-19: a cross-sectional analysis
65 Hydroxychloroquine, a less toxic derivative of chloroquine, is effective in inhibiting SARS-CoV-2 infection in vitro
66 Antiviral efficacies of FDA-approved drugs against SARS-CoV-2 infection in ferrets
67 Hydroxychloroquine use against SARS-CoV-2 infection in non-human primates
68 Chloroquine does not inhibit infection of human lung cells with SARS-CoV-2
69 New insights into genetic susceptibility of COVID-19: an ACE2 and TMPRSS2 polymorphism analysis
70 Using deep learning to model the hierarchical structure and function of a cell
71 Visible machine learning for biomedicine
72 Launching PCORnet, a national patient-centered clinical research network
73 Observational health data sciences and informatics (OHDSI): opportunities for observational researchers
74 Healthcare in the age of interoperability: the promise of fast healthcare interoperability resources
75 Ethics and governance for digital disease surveillance
76 Federated learning for healthcare informatics
77 Review: Precision medicine and driver mutations: computational methods, functional assays and conformational principles for interpreting cancer drivers
78 Genomewide association study of severe Covid-19 with respiratory failure
79 Harnessing endophenotypes and network medicine for Alzheimer's drug repurposing

This work was supported by the National Institute of Aging of the US National Institutes of Health (NIH; R01AG066707 and 3R01AG066707-01S1) and the National Heart, Lung, and Blood Institute of the NIH (R00HL138272) to FC. This work was supported by the VeloSano Pilot Program (Cleveland Clinic Taussig Cancer Institute, Cleveland, OH, USA) to FC. This work has been also supported with federal funds from the Frederick National Laboratory for Cancer Research, NIH (HHSN261200800001E) to RN. This research was supported by the Intramural Research Program of the NIH, Frederick National Lab, Center for Cancer Research to RN. This work was supported by the US National Science Foundation (1750326 and 2027970) and US Office of Naval Research (N00014-18-1-2585) to FW. We are thankful for all helpful discussions and critical comments regarding this manuscript from the COVID-19 Research Intervention Advisory Committee members at the Cleveland Clinic.