key: cord-0588469-877l2ha5
authors: Cheng, Tianyue; Fan, Tianchi; Wang, Landi
title: Genetic Constrained Graph Variational Autoencoder for COVID-19 Drug Discovery
date: 2021-04-23
journal: nan
DOI: nan
sha: 056b86c54e76dc45c6a42e149763373bb5c98e23
doc_id: 588469
cord_uid: 877l2ha5

In the past several months, COVID-19 has spread across the globe and caused severe damage to people and society. In this situation, an effective method for generating potential drugs is highly valuable. In this paper, we present a methodology for discovering potential drugs for the treatment of Severe Acute Respiratory Syndrome Coronavirus 2 (commonly known as SARS-CoV-2). We propose a new model, the Genetic Constrained Graph Variational Autoencoder (GCGVAE), to solve this problem. We trained our model on protein-structure data of various viruses, including SARS, HIV, Hep3, and MERS, and used it to generate possible drugs for SARS-CoV-2. Several optimization techniques, including valency masking and a genetic algorithm, are deployed to fine-tune the model. According to our simulations, the generated molecules are highly effective at inhibiting SARS-CoV-2. We quantitatively calculated the scores of our generated molecules and compared them with the scores of existing drugs; the results show that our generated molecules score much better than those existing drugs. Moreover, our model can also be applied to generate effective drugs for other viruses given their protein structure, and could therefore be used to generate drugs for future viruses.

The figure above shows the number of confirmed cases, deaths, and recovered cases, together with their log values, by date (from 1/22/2020). The trend in the figure shows that the epidemic has caused severe damage to society.
In implementing GCGVAE, a batch of active molecules must be selected as the first-generation sample fed into the model, since using inactive molecules can reduce the effectiveness of the model and degrade its performance. To sort out the active molecules from the original NCBI dataset, we first trained an edge-memory neural network (EMNN) [8] to select usable molecules that are relatively active. The EMNN selects molecules based on the PubChem Standard Value [9], a standardized value indicating how active a molecule is, which corresponds to how strongly the compound inhibits the given protein at a given concentration. In addition, calculating the composite score of a molecule (introduced in Section 6, Model Evaluation) is time-consuming. To shorten training and inference time, the model focuses on a specific atom in the molecule (explained in Section 4, GCGVAE Model) instead of on the entire molecule. Moreover, we use the software AutoDock Vina [10] to calculate the binding affinity of a given molecule with the designated protein. AutoDock Vina embeds various optimizers that significantly shorten training time (described in Section 6, Model Evaluation).

We faced various challenges with the optimization model, namely the graph-based genetic algorithm (GB-GA). A chromosome in GB-GA is the encoded graph of a molecule. Since the length of chromosomes is not fixed and can vary greatly, we have to find a way to encode molecules into chromosomes on which genetic operators can be applied. The traditional GA uses samples of fixed length, on which genetic operators are relatively easy to implement. While it is possible to find a theoretical maximum molecule size and encode all molecules with that maximum chromosome length, doing so greatly increases computational complexity. To deal with a generative problem in structural topology, I.Y.
Kim et al. [11] presented a method that starts with short chromosomes and increases chromosome length as the generations progress. However, since our GA is used not to generate new molecules but to diversify the results of our generative model, such a solution cannot be applied directly in this scenario. S. N. Pawar et al. [12] presented another variable-length GA, for network intrusion detection, in which the length of chromosomes is independent of the number of generations. Moreover, the nonlinear structure of molecules (most medicinal molecules have rings, branches, etc.) brings additional challenges to our algorithms. In most linear representations of molecules, the beginning and end of each branch are denoted at their exact locations. In other words, if a branch originates after the third atom, its denotation is placed there, so the denotations shift during crossover and mutation and are easily confused, causing errors. We solved the problem by placing all paired denotations of a molecule at the beginning when encoding it.

A variational autoencoder (VAE) is a common generative model that uses one set of parameters to model the relationship between input and latent variables. It can be viewed as two coupled models that support each other. It was inspired by the Helmholtz machine (Dayan et al., 1995) [13]. The VAE framework provides a computationally efficient way to optimize deep latent-variable models (DLVMs) jointly with a corresponding inference model using stochastic gradient descent (SGD). We optimize the variational parameters $\phi$ such that:

$$q_\phi(z|x) \approx p_\theta(z|x)$$

Here $q_\phi(z|x)$ is a parametric inference model, namely the encoder of the VAE, and $p_\theta(z|x)$ is the probability distribution given by the model. The inference model can be any directed graphical model:

$$q_\phi(z|x) = \prod_j q_\phi(z_j \mid Pa(z_j), x)$$

where $Pa(z_j)$ is the set of parent variables of $z_j$ in the directed graph. The distribution $q_\phi(z|x)$ can be parameterized using a deep neural network.
The decoder learns to generate an output that belongs to the real data distribution, given a latent variable z as input. It maps a sampled z (initially from a normal distribution) into a more complex latent space (the one actually representing our data), and from this complex latent variable it generates a data point that is as close as possible to a real data point from our distribution [14].

A genetic algorithm mainly follows Darwin's principle of natural selection. It generally consists of a fitness calculator, a mutation function, a crossover function, and a selection process; however, our genetic algorithm (see Section 5.2) differs from the traditional genetic model in many ways. The fitness function calculates the scores of the population throughout the genetic algorithm. The selection algorithm filters the population based on rules set beforehand. The mutation algorithm changes parts of some chromosomes to increase result variability. The crossover process randomly picks samples and swaps parts of their chromosomes.

AI-aided drug development mainly refers to generating small-molecule drugs. Small-molecule drug discovery focuses on chemically synthesized small molecules of active substances, which can be made into small-molecule drugs through chemical reactions between different organic and inorganic compounds. One group of AI-based drug development methods focuses on the discovery of new drug-like compounds at the molecular level. In [15], Beck et al. proposed a DL-based drug-target interaction model (MT-DTI) to predict potential drug candidates for COVID-19. The authors collected the amino-acid sequences of 3C-like proteases and related antiviral drugs and drug targets from the NCBI, Drug Target Commons (DTC) [16], and BindingDB [17] databases. They also used a molecular docking tool, AutoDock Vina [18], to predict the binding affinity between 3,410 drugs and SARS-CoV-2 3CLpro. Moskal et al.
used AI to analyze the comparability of COVID-19 medicines and drugs with similar indications in order to screen out second-generation drugs. They used the software Mol2Vec to convert molecular formulas into multidimensional vectors. In other words, each molecular formula is converted into a sentence, with each part of it being a word. Next, they used a VAE to generate molecules that have suppressive effects on the 2019-nCoV virus. Then, they used CNN, LSTM, and MLP models to generate potentially functional molecules, using 4456 drug-like molecules as training data.

A GNN is a neural network inspired by the CNN; it uses complex multidimensional matrices to account for discrete characteristics. The core of a CNN is local connections, shared weights, and multi-layer computation. Traditional CNNs can only deal with one- or two-dimensional problems, usually involving images and text, whereas GNNs are able to break these limits. Because a GNN is a locally connected structure, it can also use shared weights to reduce computational complexity [19], making it an extensively applicable neural network. Multi-layer computation is also the answer to many problems involving hierarchical patterns.

The GNN model learns a state embedding [20] $h_v \in \mathbb{R}^s$, which contains the information of the neighboring nodes. The state embedding $h_v$ is an s-dimensional vector for node v, which is ultimately used to produce the output $o_v$. Let f be a parametric function, shared by all nodes, through which each node updates its state. Let g be the local output function, which describes how the output is produced. Then $h_v$ and $o_v$ satisfy:

$$h_v = f(x_v, x_{co[v]}, h_{ne[v]}, x_{ne[v]})$$
$$o_v = g(h_v, x_v)$$

where $x_v$, $x_{co[v]}$, $h_{ne[v]}$, and $x_{ne[v]}$ are, respectively, the features of v, the features of its edges, the states of its neighboring nodes, and the features of its neighboring nodes. Let H, O, X, and $X_N$ be the vectors obtained by stacking all the states, all the outputs, all the features, and all the node features, respectively. Then:

$$H = F(H, X)$$
$$O = G(H, X_N)$$

Here, F is the global transition function, and G is the global output function.
F and G stack the f and g of every node, respectively. With the target information $t_v$ as supervision, the loss of the GNN is:

$$loss = \sum_{i=1}^{p} (t_i - o_i)$$

Here, p is the number of supervised nodes; the loss is minimized using gradient descent.

Our generative model, GCGVAE, consists mainly of two components: the Constrained Graph Variational Autoencoder (CGVAE) and the Graph-Based Genetic Algorithm (GB-GA). CGVAE is used to extract effective molecules from the ZINC dataset and to generate drug molecules specifically for SARS-CoV-2 by altering the initial molecules in the ZINC dataset. CGVAE generates 1000 samples as the starting population for the GB-GA module, which then performs SARS-CoV-2-specific alterations on these molecule samples.

The whole process uses N vectors as seeds that form a latent "specification" from which our model generates new graphs (N is the upper limit on the number of nodes generated). Then, by employing two decision functions, the focus function and the expand function, our model generates new molecules by adding new edges. The focus function chooses which node to connect, and the expand function chooses the edges that connect the node chosen by the focus function to other nodes. The focus function is implemented as a breadth-first traversal using a deterministic queue, with a random choice of the initial node. The main point of our study is to let the expand function learn, through training, to generate edges that connect to the nodes selected by the focus function. To avoid the excessive resource usage caused by the massive number of graphs the expand function would otherwise learn from [21], our expand function is trained on partial graph structures.

The figure below shows the general structure of our CGVAE model. The encoder consists of a GRU unit that forms the GGNN. The decoder is a normal VAE decoder.
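The interplay of the two decision functions can be illustrated with a minimal sketch. It assumes a hypothetical `expand(v, edges)` callable standing in for the learned expand function (replaced here by a hand-written rule that links each node to its index neighbors); `generate_edges` and `make_path_expand` are illustrative names, not part of the authors' implementation.

```python
import random
from collections import deque


def generate_edges(num_nodes, expand, seed=None):
    """Breadth-first edge generation driven by a focus queue.

    `expand(v, edges)` is a hypothetical stand-in for the learned expand
    function: it returns the next node to connect to focus node `v`, or
    None when no more edges should be added from `v`.
    """
    rng = random.Random(seed)
    edges = set()
    start = rng.randrange(num_nodes)   # random choice of the initial focus node
    queue = deque([start])             # deterministic FIFO queue of focus nodes
    visited = {start}
    while queue:
        v = queue.popleft()            # focus function: next node in BFS order
        while True:
            u = expand(v, edges)       # expand function: pick an edge v <-> u
            if u is None:              # "stop" decision for this focus node
                break
            edges.add(frozenset((v, u)))
            if u not in visited:
                visited.add(u)
                queue.append(u)
    return edges


def make_path_expand(num_nodes):
    """Toy expand rule (an assumption for illustration): link v to v-1 and v+1."""
    def expand(v, edges):
        for u in (v - 1, v + 1):
            if 0 <= u < num_nodes and frozenset((v, u)) not in edges:
                return u
        return None
    return expand


edges = generate_edges(5, make_path_expand(5), seed=0)
print(sorted(tuple(sorted(e)) for e in edges))
```

In the actual model the expand decision would come from a trained network and be filtered by the valency constraints; the queue discipline is the only part this sketch tries to capture.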
We optimize the model using stochastic gradient descent.

Initialize nodes. We first set up a state $h_v$ for every node v in an initial graph that has no connections. Specifically, we associate each node with an initial value $z_v$ drawn from a standard normal distribution; the node type is then obtained from $z_v$ through the output of a learned mapping f, where f is a neural network. The interpretable part of $h_v$ gives us a means to enforce hard constraints. From these variables, we can calculate the global representations $H^{(t)}$ (the distribution and constitution of every node at step t) and $H_{init} = H^{(0)}$ (the same at step 0). We also set up a stop node to terminate the process.

Update nodes. Whenever we obtain a new graph, we discard the previous $h_v$ and calculate a new $h_v$ for every node (by propagating information from neighboring nodes). Here we run a standard gated graph neural network (GGNN) $G_{dec}$ for S steps. $E_l$ is an edge-type-specific neural network [8], and the GRU describes how the GGNN updates node states [22].

Selecting and labeling edges. First we choose a focus node v from the queue. The expand function then chooses an edge from v to u, $v \overset{l}{\leftrightarrow} u$, where l is the label of the edge. For every other, non-focus node u, we build a feature vector

$$\phi^{(t)}_{v,u} = [h_v^{(t)}, h_u^{(t)}, d_{v,u}, H_{init}, H^{(t)}]$$

Here, $d_{v,u}$ is the graph distance between the two nodes, which lets the model obtain information on both nodes and edges. We use this information to generate the distributions over edges.

The model above depends on trained networks (trained with a VAE architecture on the graph dataset D), as well as on a latent space in which the latent parameters of the model reside. Our VAE encoder is a GGNN $G_{enc}$, whose output is parametrised by the mean $\mu_v$ and standard deviation $\sigma_v$ vectors.
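A per-node mean and standard deviation, as produced by the encoder described above, is commonly used in two ways in a VAE: to draw a latent sample with the reparameterization trick, and to compute a closed-form KL penalty against a standard normal prior. The sketch below illustrates both in plain Python; the function names and the nested-list layout (one list of d values per node) are assumptions for illustration, not the authors' code.

```python
import math
import random


def sample_latent(mu, sigma, rng):
    """Reparameterized sample z_v = mu_v + sigma_v * eps, with eps ~ N(0, I)."""
    return [
        [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu_v, sigma_v)]
        for mu_v, sigma_v in zip(mu, sigma)
    ]


def kl_to_standard_normal(mu, sigma):
    """Sum over nodes and dimensions of KL(N(mu, sigma^2) || N(0, 1))."""
    total = 0.0
    for mu_v, sigma_v in zip(mu, sigma):
        for m, s in zip(mu_v, sigma_v):
            total += 0.5 * (s * s + m * m - 1.0 - 2.0 * math.log(s))
    return total


rng = random.Random(0)
mu = [[0.0, 0.5], [1.0, -1.0]]       # two nodes, d = 2 (made-up values)
sigma = [[1.0, 0.8], [0.5, 1.2]]
z = sample_latent(mu, sigma, rng)
print(len(z), len(z[0]), kl_to_standard_normal(mu, sigma))
```

The penalty is zero only when every node's distribution already equals the prior, which is why it acts as a regulariser on the latent space.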
In a latent space with d dimensions, we embed each node of an input graph G into a diagonal normal distribution, from which the latent vector samples $z_v$ are drawn. The KL divergence between the encoder distribution and the standard Gaussian prior is measured by the usual VAE regulariser term:

$$L_{latent} = \sum_{v \in G} KL\left(\mathcal{N}(\mu_v, diag(\sigma_v)^2) \,\|\, \mathcal{N}(0, I)\right)$$

During training, each generation is conditioned on an inactive sample from the encoder distribution. Training of the overall model is supervised by generation traces obtained from the graphs in D.

Node Initialization. Let P denote node permutations, and let $h_v^{(t=0)}$ denote the initial node states. For each node v, a sample of the node specification $z_v$ is drawn, and the label $\tau_v$ is generated using the learned function f. The likelihood of re-generating the labels $\tau_v^*$ observed in the encoded graph is bounded from below: since the node type $\tau_v^*$, and therefore $z_v$, is known, the single contribution from the encoder ordering gives a lower bound, which could be improved further with a set2set model [23].

Edge Selection and Labelling. During training, the sequence of edge additions is supervised using breadth-first traversals of all graphs in D. The marginal is estimated by Monte Carlo, using a small set of sampled traces, because computing the exact log-likelihood is not computationally tractable. Let $G^{(0)}$ denote the initial collection of unconnected nodes. The lower bound on the log probability of a graph G depends on the log probability of all traces. Let v denote the current focus node, and let $\epsilon = v \leftrightarrow u$ be the edge added at step t. Each generation trace $\pi \in \Pi$ can be split into a series of steps of the form $(t, v, \epsilon)$. The first factor in this decomposition is the choice of focus node at step t of trace $\pi$, which is uniform for the first focus node and thereafter follows a breadth-first search.
The set of generation states S, which includes the features of both the current focus node and its neighboring nodes, is taken into account to improve the model further. Let |s| denote the multiplicity of state s in $\Pi$, that is, the number of traces containing graph $G^{(t)}$. We use $E_s$ to denote the edges from focus node v that are present in graph G. Both of the above are uniform, so we can rearrange and sum their values over steps. In the resulting expression, $|s|/|\Pi|$ is the probability of observing state s in a random draw from all states in $\Pi$. The loss function of the model is defined accordingly; here again, a Monte Carlo estimate is taken from a set of enumerated generation traces, which makes the variance of the estimate high.

Local optimization of the generated graphs is needed. This is done by gradient ascent (note: not gradient descent) in the continuous latent space, using a differentiable gated regression model in which $g_1$ and $g_2$ are neural networks and $\sigma$ is the sigmoid function:

$$R(z_v) = \sum_v \sigma(g_1(z_v)) \cdot g_2(z_v)$$

An L2 distance loss is used during training; at test time, an initial latent point $z_v$ can be sampled and moved by gradient ascent towards a locally optimal point $z_v^*$, which is kept within the standard normal. The optimized property Q is ultimately produced. The main training objective is $L = L_{recon.} + \lambda_1 L_{latent} + \lambda_2 L_Q$, which includes the normal VAE objective. Note that the VAE loss of Yeung et al. [24] is not strictly followed.

Our genetic algorithm takes molecules from the CGVAE model as input, applies various genetic operators to them, and produces desirable molecules as output.

Population Initialization. Since molecules closely resemble undirected graphs, we represent each molecule as a graph instead of using merely SMILES strings. Each node of the graph represents an atom of the molecule, and each edge represents a bond between atoms.
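As a concrete illustration of this representation, the sketch below stores a molecule as an undirected graph whose nodes are atoms and whose edges are bonds. The `MolGraph` class and its method names are hypothetical, chosen only for this example; a real pipeline would more likely rely on a cheminformatics toolkit.

```python
from dataclasses import dataclass, field


@dataclass
class MolGraph:
    """Undirected molecular graph: nodes are atoms, edges are bonds."""
    atoms: list = field(default_factory=list)   # element symbol per node
    bonds: dict = field(default_factory=dict)   # frozenset({i, j}) -> bond order

    def add_atom(self, symbol):
        self.atoms.append(symbol)
        return len(self.atoms) - 1              # index of the new node

    def add_bond(self, i, j, order=1):
        if i == j:
            raise ValueError("self-bonds are not allowed")
        self.bonds[frozenset((i, j))] = order   # undirected: {i, j} == {j, i}

    def neighbors(self, i):
        return sorted(j for pair in self.bonds if i in pair for j in pair if j != i)

    def bond_order_sum(self, i):
        return sum(order for pair, order in self.bonds.items() if i in pair)


# Ethanol heavy-atom skeleton (SMILES "CCO"), hydrogens left implicit.
ethanol = MolGraph()
c1, c2, o = (ethanol.add_atom(s) for s in ("C", "C", "O"))
ethanol.add_bond(c1, c2)
ethanol.add_bond(c2, o)
print(ethanol.neighbors(c2), ethanol.bond_order_sum(c2))
```

Keying bonds by an unordered pair makes the graph undirected by construction, and the per-atom bond-order sum is the quantity a valency check would compare against the allowed maximum.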
The graph structure is stored for use in later generations.

Crossover, Mutation, and Reproduction. Our system of crossover, mutation, and reproduction is based on branch switching, i.e., the switching of randomly chosen branches between two given molecules. Our reproduction process is embedded in the crossover function, and mutation is not directly implemented (instead, it is emulated by a small-scale crossover). Because crossing over molecular branches produces nearly unpredictable results, we create a copy of the original population each time we perform a crossover and combine it with the crossed-over population. This way, if our crossover creates undesirable results, successful individuals from the original population can still pass into the next round. Note that this process of crossover and copying doubles the population, removing the need for a separate reproduction algorithm.

For each generation, we perform two rounds of crossover and copying. In the first round, only the uppermost branches of molecules are swapped. Since the uppermost branches have little variability (most often, they are single carbon or oxygen atoms), swapping them creates only minor differences, if any. Therefore, the population created by this round of crossover differs little from the copied one, much like a mutated population in the traditional GA. Thus, the mutation process is emulated by this round of crossover.

In the second round of crossover and copying, more significant changes are applied. Instead of uppermost-branch switching, universal branch switching is performed; that is, the set of switchable branches is extended from only the uppermost branches to all branches.
Note that after this round, the population is four times the size of the original one. Of this population, one quarter is identical to the original; one quarter has minor differences from the original (the quarter crossed over in the first round and copied in the second); and half (the part crossed over in the second round) differs greatly from the original. Thus, relative to the original, the current population is a mixture of unchanged, slightly changed, and significantly changed individuals.

The figure above illustrates the process of mutation and crossover in our GA model. A molecule to be mutated is input into our mutation algorithm. When a crossover/mutation takes place, the two functional groups (circled in the figure) of a pair are swapped, generating new molecules as a result.

We define the final score of a molecule as a composite score of several factors. The composite score is the weighted sum of the binding affinity of the molecule (explained in Section 6, Model Evaluation), a validity test (checking the validity of the molecule), and a size penalty (we penalize larger molecules because they are hard to synthesize and unstable; explained in Section 6). The scoring function returns the composite score, which is used for selecting the population.

Our selection function combines Elite Retention Selection and Simple Random Selection. The individuals with the highest fitness scores (the top 25%) are automatically retained, and another 25% is randomly selected from the remaining population. Elite Retention Selection is used to retain good results that have already been produced, and Simple Random Selection ensures result variability (since changes to molecular structure produce largely unpredictable results, and low-scoring molecules may cross over into high-scoring ones).
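The selection step described above can be sketched as follows. This is a minimal illustration under two assumptions that the text does not settle: higher fitness is better, and "another 25%" means 25% of the whole population size, drawn without replacement from the non-elite remainder. The function name and parameters are hypothetical.

```python
import random


def select(population, fitness, elite_frac=0.25, random_frac=0.25, seed=None):
    """Elite retention plus simple random selection.

    Keeps the top `elite_frac` of individuals by fitness, then adds a random
    draw (without replacement) from the rest, sized as `random_frac` of the
    whole population.
    """
    rng = random.Random(seed)
    ranked = sorted(population, key=fitness, reverse=True)
    n_elite = int(len(ranked) * elite_frac)
    n_random = int(len(ranked) * random_frac)
    elite = ranked[:n_elite]                      # always retained
    rest = ranked[n_elite:]                       # candidates for the random draw
    lucky = rng.sample(rest, min(n_random, len(rest)))
    return elite + lucky


# Toy population: integers whose fitness is their own value.
survivors = select(list(range(8)), fitness=lambda x: x, seed=0)
print(survivors)
```

With a population that has been quadrupled by the two crossover rounds, keeping half of it in this way leaves twice the original size; whether a further cut is applied is not stated in the text.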
Atomic valency indicates the number of bonds an atom can make to form a stable molecule, and the rules that predict it can be used to construct structurally sound molecules. In our model, every node type is given a fixed valency based on current chemical knowledge. For example, Cl (a chlorine atom) has valency 1. The masks M and m are used to ensure that the number of bonds $b_v$ remains at a reasonable level and does not exceed the maximum $b_v^*$ [25]. This ensures that only valid molecules (those parsable by RDKit [25]) are generated. The mask $M^{(t)}_{v \leftrightarrow u}$, which addresses this problem, is defined with an indicator function, and m is defined using l to represent the bond type.

Virtual screening of drugs, which can be done through molecular docking, has become a standard technology in modern drug discovery. Molecular docking is a computational technique that predicts how molecules bind to each other, in this case large protein receptors and small ligands. Molecular docking is based on the "lock and key" principle, which compares large protein molecules to locks and small ligands to keys. Each ligand fits into an appropriate position in the protein molecule according to properties such as conformation and orientation, like a key fitting into a lock. Molecular docking is in essence an optimization problem: it seeks the best fit of a ligand into a receptor. It simulates the binding between receptors and ligands and shows their mode of binding as well as their binding affinity, which can thus be used for virtual screening of drugs.

The specific molecular docking software used in this paper is AutoDock, an open-source program that enables its users to perform ligand-protein docking effectively [18]. AutoDock is widely used in the field of computer science and has been cited more often than other molecular docking software.
We use AutoDock because other scientists in the field are already using it to test the effectiveness of drugs that inhibit SARS-CoV-2, which demonstrates its feasibility for drug discovery. The AutoDock process is as follows [26]. First, a protein molecule is input into the docking software to be scanned. The purpose of scanning is to look for amino acid residues surrounding the active site of the protein (receptor). A box with grid points is then formed around the protein, to be scanned by atom probes; atom probes are the different types of atoms within the structure of the ligands. The software takes the atom probes, scans the grid points within the box, and calculates the energy needed to bind with atoms in the receptor. In the last step, the software performs a conformational search of the ligand within the box and calculates a score for the specific ligand, using a scoring function that depends on conformation, orientation, position, and energy [27, 28]:

$$\Delta G_{bind} = \Delta G_{solvent} + \Delta G_{conf} + \Delta G_{int} + \Delta G_{rot} + \Delta G_{t/r} + \Delta G_{vib}$$

The components of the equation are solvent effects, conformational changes in the protein and ligand, free energy due to protein-ligand interactions, internal rotations, the association energy of ligand and receptor to form a single complex, and free energy due to changes in vibrational modes. A low (negative) energy indicates a stable system and thus a likely binding interaction.

To speed up the evaluation of drug effectiveness, we used AutoDock Vina, AutoDock's successor, which has both advantages and limitations. Compared to AutoDock, AutoDock Vina has an improved local search routine and makes use of multi-CPU setups, which is the main cause of its increased speed. AutoDock Vina also utilizes several optimization methods, including genetic algorithms, simulated annealing, and particle swarm optimization [10, 29, 30, 31, 32].
One major difference between AutoDock and its successor is that the distance between grid points within the box can be set to values smaller than 1 in AutoDock (e.g., 0.37), while in AutoDock Vina it can only be set to 1, which might lead to possible best fits being left out.

Binding Affinity. Binding affinity is a quantitative index used to measure the strength of molecular interactions, especially between a large protein molecule and a small ligand binding partner. The stronger the binding between protein and ligand, the larger the absolute value of the binding affinity, which is measured in kcal/mol. Binding affinity is influenced by non-covalent intermolecular interactions such as hydrogen bonding, hydrophobic and electrostatic interactions, and Van der Waals forces between the molecules. In addition, the binding affinity between a ligand and its target molecule may be affected by the presence of other molecules (which cannot be observed in our setting). Binding affinity is measured through the equilibrium dissociation constant $K_D$, a specific type of equilibrium constant that measures the propensity of a larger object to separate (dissociate) reversibly into smaller components. The larger $K_D$ is, the smaller the absolute value of the binding affinity, and the more easily the ligand separates from the protein, i.e., the less likely it is to bind; the smaller $K_D$ is, the larger the absolute value of the binding affinity, and the harder it is for the ligand to separate from the protein, i.e., the more likely it is to bind. For drug discovery, measuring binding affinity helps to rank-order hits binding to the target and to design drugs specifically for given protein molecules.
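The inverse relation between $K_D$ and the magnitude of the binding affinity follows from the standard thermodynamic identity $\Delta G = RT \ln K_D$ (with $K_D$ expressed relative to a 1 M standard state). The sketch below applies it at room temperature; the function names are illustrative, and docking scores are only estimates of $\Delta G$, so the conversion should be read as an order-of-magnitude guide rather than a measured constant.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_from_kd(kd_molar, temperature_k=298.15):
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    return R_KCAL * temperature_k * math.log(kd_molar)


def kd_from_delta_g(delta_g_kcal, temperature_k=298.15):
    """Dissociation constant (mol/L) implied by a binding free energy in kcal/mol."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


# A smaller K_D gives a more negative (stronger) binding free energy.
for kd in (1e-3, 1e-6, 1e-9):
    print(f"K_D = {kd:.0e} M  ->  dG = {delta_g_from_kd(kd):.2f} kcal/mol")
```

Each factor of ten in $K_D$ shifts $\Delta G$ by roughly 1.4 kcal/mol at this temperature, which is why differences of a few kcal/mol between docking scores can correspond to large differences in predicted binding strength.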
Using AutoDock Vina, we can computationally calculate the binding affinity between the SARS-CoV-2 main protease (6LU7) and the drug ligands, and output a ranking of molecules based on how well they bind to the main protease. However, the measured binding affinity does not convey a specific or absolute value, but rather a comparative rank that only determines whether one molecule binds to a protein better than another. Nevertheless, there is still a way to compare the drug molecules we generated with existing molecules and with molecules generated by others. By retrieving the SMILES of drugs from other sources, we can convert them into PDBQT format, dock them with the main protease of SARS-CoV-2 on our own device, and compare the results with the drugs generated by our model.

SwissADME [33] is a website that enables the computational prediction of ADME properties, physicochemical descriptors, drug-likeness, and other parameters related to drug discovery. It is one of the most well-regarded tools in the field of pharmacokinetics, and the research paper describing it has more than one thousand citations according to Microsoft Academic. In the context of our study, four parameters are of the greatest concern (in order of importance): the Brenk structural alert, Synthetic Accessibility, Bioavailability Score, and the PAINS alert. All other parameters, such as lipophilicity and water solubility, are less relevant in that they are covered by the above four or do not apply well to our problem.

The Brenk structural alert is a list of 105 molecular fragments identified by Brenk et al. [34] as putatively toxic, chemically reactive, metabolically unstable, or otherwise undesirable. This parameter is important because molecules that set off this alert have a higher risk of being toxic or otherwise unsuitable for drug use.
Synthetic accessibility (SA) is the estimated difficulty of synthesizing a molecule under laboratory conditions. The lower the SA score, the easier the molecule is estimated to be to create, and the lower the cost of manufacturing drugs based on it. The Bioavailability Score, or Abbott bioavailability score, predicts the percentage of a drug molecule that might reach the systemic circulation [35]. More specifically, it is the probability that a compound will have above 10% oral bioavailability in the rat. This metric matters because drugs with high bioavailability can achieve greater results with smaller doses, thus influencing manufacturing costs. The PAINS (pan-assay interference compounds) alert, developed by Baell et al. [36], indicates the presence of PAINS functional groups, such as toxoflavin, isothiazolones, hydroxyphenyl hydrazones, and curcumin [37]: parts of molecules that react nonspecifically with numerous targets instead of only the desired one, and frequently yield false positives.

The table above shows the results of the SwissADME check. The analysis below explains each parameter in detail. SwissADME provides valuable parameters that affect the quality of a drug molecule, but these parameters are not the primary focus of our study. While a drug that performs well on these parameters doubtless has its merits, and one that performs poorly has its drawbacks, we nevertheless deem binding affinity with the virus the core factor, representing the upper limit of the drug's effectiveness. For these reasons, we do not directly incorporate ADME variables in our GCGVAE model, but merely add a size penalty to our GA (because large molecules generally have poor ADME performance).
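To make the role of the size penalty concrete, the sketch below shows one way a composite score could combine a docking score, a validity flag, and a penalty on molecule size. The weights, the heavy-atom threshold, and the linear form of the penalty are all assumptions chosen for illustration; the text does not specify them.

```python
def composite_score(binding_affinity, is_valid, num_heavy_atoms,
                    w_affinity=1.0, w_valid=1.0, w_size=0.1, size_threshold=40):
    """Weighted composite fitness (higher is better); all weights are hypothetical.

    binding_affinity: docking score in kcal/mol (more negative = stronger binding)
    is_valid:         result of a validity check on the molecule
    num_heavy_atoms:  size proxy used for the penalty
    """
    affinity_term = -binding_affinity                 # flip sign so stronger binding scores higher
    validity_term = 1.0 if is_valid else 0.0
    size_penalty = max(0, num_heavy_atoms - size_threshold)  # only oversize molecules are penalized
    return w_affinity * affinity_term + w_valid * validity_term - w_size * size_penalty


small = composite_score(-9.0, True, 30)
large = composite_score(-9.0, True, 60)
print(small, large)
```

With these made-up settings, two molecules with the same docking score are separated only by size, so the GA is nudged towards smaller candidates without ADME properties entering the objective directly.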
According to our docking results with 6LU7, the crystal structure of the SARS-CoV-2 main protease in complex with the inhibitor N3 [38], the binding affinity with 6LU7 of Remdesivir, Favipiravir, Arbidol [5, 7], and several other widely tested and produced drugs targeting SARS-CoV-2 is significantly weaker than that of the drugs generated by our model.

Our model for generating drug molecules has lower cost and is more reliable than traditional methods. Because our results can be easily stored and tested on ordinary computers using specific software, the cost of testing whether the drug molecules are effective is comparatively low. In addition, virtual screening is more stable than traditional drug testing, since computational docking of ligands is not subject to the confounding variables that may arise in traditional laboratories, which helps ensure the reliability of our results.

Our model is also not restricted to generating drugs that inhibit SARS-CoV-2. Traditional drug discovery methods usually require a specific and detailed analysis of the structure of the virus, whereas our model is not tied to a specific virus. GCGVAE enables us to generate drugs that inhibit any virus once the structure of its main protease is entered into the model, which gives us high flexibility in generating drugs that target viruses other than SARS-CoV-2 in the future.

Another distinctive characteristic of our model is that it increases the variability of the ordinary CGVAE. Employing CGVAE alone offers little flexibility and novelty in the generated output, which limits the output to a relatively small range; we therefore employ a GA after the CGVAE stage, using an innovative method of crossover and mutation.
This enables our model to generate a more diverse set of outputs than traditional CGVAE models and thus increases the number of drug structures to be tested.

[Table: SMILES strings of generated molecules with their scores; only fragments survived extraction (one value, -11, is legible).]

[1] The COVID-19 epidemic
[2] Estimated effectiveness of symptom and risk screening to prevent the spread of COVID-19
[3] Clinical characteristics of COVID-19 patients with digestive symptoms in Hubei, China: a descriptive, cross-sectional, multicenter study
[4] A survey on applications of artificial intelligence in fighting against COVID-19
[5] Remdesivir and chloroquine effectively inhibit the recently emerged novel coronavirus (2019-nCoV) in vitro
[6] The clinical pharmacology of ribavirin
[7] SARS-CoV-2: recent reports on antiviral therapies based on lopinavir/ritonavir, darunavir/umifenovir, hydroxychloroquine, remdesivir, favipiravir and other drugs for the treatment of the new coronavirus
[8] Neural message passing for quantum chemistry
[9] PubChem chemical structure standardization
[10] AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading
[11] Variable chromosome length genetic algorithm for progressive refinement in topology optimization. Structural and Multidisciplinary Optimization
[12] Genetic algorithm with variable length chromosomes for network intrusion detection
[13] An introduction to variational autoencoders
[14] Auto-encoding variational Bayes
[15] Self-attention based molecule representation for predicting drug-target interaction
[16] Drug Target Commons 2.0: a community platform for systematic analysis of drug-target interaction profiles
[17] BindingDB: a web-accessible database of experimentally determined protein-ligand binding affinities
[18] AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading
[19] Deep learning
[20] Spectral graph theory. Number 92
[21] Learning deep generative models of graphs
[22] Gated graph sequence neural networks
[23] Order matters: Sequence to sequence for sets
[24] Tackling over-pruning in variational autoencoders
[25] RDKit: Open-source cheminformatics
[26] Automated docking of flexible ligands: applications of AutoDock
[27] Performance comparison of generalized Born and Poisson methods in the calculation of electrostatic solvation energies for protein structures
[28] Automated docking with grid-based energy evaluation
[29] Adaptation in natural and artificial systems (John H. Holland)
[30] A new optimizer using particle swarm theory
[31] Particle swarm optimization
[32] Optimization by simulated annealing. Science
[33] Bioavailability prediction of phytochemicals present in Calotropis procera (Aiton) R. Br. by using Swiss-ADME tool
[34] Lessons learnt from assembling screening libraries for drug discovery for neglected diseases
[35] A bioavailability score
[36] New substructure filters for removal of pan assay interference compounds (PAINS) from screening libraries and for their exclusion in bioassays
[37] Chemistry: Chemical con artists foil drug discovery
[38] Structure of Mpro from SARS-CoV-2 and discovery of its inhibitors