key: cord-0997460-e4vsj67u authors: Lipinski, Christopher A. title: Chapter 11 Filtering in Drug Discovery date: 2005-10-05 journal: Annu Rep Comput Chem DOI: 10.1016/s1574-1400(05)01011-x sha: a47f07421f76f51f970845c682d787af7650fd16 doc_id: 997460 cord_uid: e4vsj67u

This chapter discusses the concept of filtering in drug discovery. Multiple filters may be incorporated into a definition of drug-likeness, and this leads to trade-offs among compound properties in compounds intended for screening. The optimization of compound properties may require some type of multi-parameter optimization scheme in library design. Fingerprint algorithms can be used to guide diversity. Filters also need to be employed in the chemistry synthesis planning process so that good quality compounds are made. Differences in property ranges between oral and injectable drugs are summarized in the chapter. Oral drugs are lower in MWT and have fewer H-bond donors, acceptors, and rotatable bonds. A scheme for separating central nervous system (CNS)-active from non-CNS-active drugs in the WDI allowed the discovery of simple parameters relating to passive blood brain barrier (BBB) permeability and the prediction of p-glycoprotein (PGP) affinity. The PGP transporter is a major barrier to the entry of compounds to the CNS. Appropriately determined PGP efflux ratios can be used as a measure of compound affinity to PGP.

ANNUAL REPORTS IN COMPUTATIONAL CHEMISTRY, VOLUME 1. © 2005 Elsevier B.V. ISSN: 1574-1400. DOI 10.1016/S1574-1400(05)01011-X. All rights reserved.

To this author there seems to be much more agreement as to what is drug-like than there is as to what is diverse. Defining what is drug-like and non-drug-like requires some type of reference point. The property distributions of commercially available databases have been examined. The Available Chemicals Directory (ACD) seems to be the most commonly used non-drug-like database. The Pesticide Manual has been used as an alternate standard for non-drug-likeness because it is composed primarily of compounds designed to cause fatality to the primary organism. The Comprehensive Medicinal Chemistry (CMC), Derwent World Drug Index (WDI) and MDL Drug Data Report (MDDR) are among the more commonly used drug-like databases [1, 2]. In addition, the 10,000 or so Phase II compounds are used to define drug-like compounds [3]. A compound's drug-like index has been calculated based upon the knowledge derived from known drugs selected from the CMC database [4]. The property distributions in combinatorial compounds compared to drugs or natural products largely reflect combinatorial chemistry synthesis constraints, such that there are fewer chiral centers and complex ring systems [5]. The distribution of ring systems across multiple databases has been described [6], and a program was written and tested on the MDDR database [7] to identify candidate chemical ring replacements (bioisosteres). From the study of a database of commercially available drugs it is clear that the diversity of molecular framework (ring) shapes is extremely low. The shapes of half of the drugs in the database are described by the 32 most frequently occurring frameworks [8]. The diversity that side chains provide to drug molecules is also quite low, since only 20 side chains account for over 70% of the side chains [9]. Defining drug-like by what exists in databases leads to the criticism that most of chemistry space will be undefined and that discovery opportunities in unexplored chemistry space will be limited. A solution is to populate chemistry space with non-drug-like markers akin to waypoints in a GPS navigation system [10]. Multiple filters (properties) may be incorporated into a definition of drug-likeness, and this leads to trade-offs among compound properties in compounds intended for screening [11].
Optimization of compound properties may require some type of multi-parameter optimization scheme in library design [12]. Fingerprint algorithms can be used to guide diversity [13]. Filters also need to be employed in the chemistry synthesis planning process so that good quality compounds are made [14]. Differences in property ranges between oral and injectable drugs have been summarized [15]. Oral drugs are lower in MWT and have fewer H-bond donors, acceptors and rotatable bonds. Property profiles of oral drugs are independent of the year in which the drug was approved for market and to some degree independent of target. Polar surface area (PSA) in one definition is the solvent-accessible surface covered by oxygen, nitrogen and the hydrogens attached to oxygen and nitrogen. As compounds progress through clinical trials there is a steady change in properties: MWT, Log P and PSA all decline, with a MWT of about 340 found for marketed drugs [16, 17]. The reason for this pattern is unclear, since properties related to oral absorption would be expected to have reached a plateau by Phase II and hence selection pressure for properties related to oral absorption should have disappeared by then [18]. Pulmonary drugs tend to have higher PSA because pulmonary permeability is less sensitive to polar hydrogen-bonding functionality [19]. Anatomically this makes sense, since lungs are a closed compartment and any accumulating fluid and compounds in terminal alveoli must be cleared. Discrimination between antibacterial and nonantibacterial activity has been achieved based on 3D molecular descriptors. The overall classification rate was around 90% on a data set of 661 compounds using 2-3 variables selected from log P, charge-weighted negative surface area, positive surface area of heavy atoms and maximum donor delocalizability. Three-dimensional geometry variations had little impact on the discriminatory performance [20].
Descriptors for drug-likeness are most effective if they have physical meaning, so as to help chemists design in drug-likeness [21]. Drug-likeness in the design of combinatorial libraries [22, 23] involves the use of rule-based filters like the rule of 5 [24], the use of exclusionary filters to remove undesired chemistry functionality [25] and the capture of privileged structure information, e.g., from natural product collections [26] or from retrosynthetic analysis of collections of bioactive molecules [27]. Natural product structural features are particularly well represented in the cancer chemotherapy and infectious disease areas [28]. Exclusionary filters have been described that remove reactive chemical functionality based on the premise that compounds having covalent chemistry possibilities have no place in drug discovery [29]. Filters are also necessary to remove cross reactivity in pooled compounds [30]. Pooling is a procedure in which single well-characterized compounds are deliberately mixed to speed screening. Components of the mixture must contain neither structural features causing assay false positives nor common substructural elements that would confuse the deconvolution of activities of the individual components. The magnitude of the number of poor quality screening compounds is emphasized by the report that only 37% of 1.6 million unique commercially available compounds are drug-like [31]. A very similar result was found in a virtual screen for SARS-CoV protease against commercially available and academic compounds. Of the virtual hits (0.07% of 3.6 million compounds), 47% failed three or more of 13 druggability criteria [32]. The criteria were based on physical, chemical and structural properties.
Providing high-quality chemistry subject matter is now supported under the NIH molecular library small molecule repository initiative, which aims to collect one million drug-like molecules from commercial, industry and government sources [33]. Emphasizing the point that drugs must contain adequate functionality to achieve acceptable receptor interactions, a single filter separates drug-like from non-drug-like compounds based on the observation that non-drugs are often under-functionalized [34]. Privileged structures, e.g., benzodiazepines, are recurring structures active against targets unrelated by target family. They can be viewed as molecular filters selecting for desirable chemistry subject matter. As such they are rich sources for screening libraries and have recently been reviewed [35]. Privileged structure features have been employed in the combinatorial design of GPCR libraries [36], in the combinatorial synthesis of privileged bicyclic structures [37] and in the combinatorial synthesis of cyclic peptides [38]. Homology modeling suggests a parallelism between common privileged GPCR ligand features and complementary deeply buried protein features in class A GPCRs [39]. Grouping by target family is another method that helps focus on particular target-directed privileged structures [40]. The idea is that structurally similar target family members will bind structurally similar small molecule ligands [41]. NMR screening helps identify privileged protein binding elements, albeit of smaller size [42]. Privileged structural elements that are not, strictly speaking, privileged structures can also be identified, such as the hydroxamate moiety found in many metalloprotease inhibitors [43]. Discernment of privileged structures has historically largely been a data mining exercise. However, very similar recurring structural motifs, so-called 'molecular anchors', have been described based on structure-based ligand binding considerations [44, 45].
Rigid small molecule ligands (the molecular anchors) are incapable of hydrophobic collapse, and a single non-collapsible ligand conformer binds at a protein cavity site which is also often incapable of hydrophobic collapse. This concept explains the frequent occurrence of non-collapsible spiro structures in privileged structures/molecular anchors. Chemistry design principles directed to the very difficult goal of small molecule interference with protein-protein interactions via an allosteric interaction have been described [46]. An intriguing aspect is the hypothesis that chemistry emphasis should be placed on compound cores capable of interacting with relatively fixed protein hinge regions rather than on elaboration of lipophilic side chains attached to the core. The thermodynamic penalty attendant to ligand binding to a non-lowest energy protein conformer suggests that screening should allow for slow binding with adequate assay equilibration time. An implication is that for this type of target it is better to make larger numbers of smaller libraries than fewer numbers of large libraries. This trend to smaller libraries is now well documented [47]. Taken to its extreme, this approach moves from the dense chemistry space coverage typical of the traditional combinatorial library (target-oriented synthesis) toward the diversity-oriented synthesis approach to chemistry lead generation, which populates diverse single molecules broadly through chemistry space [48]. This is of course a move toward less efficient, more difficult chemistry. The focus on biological information content richness suggests natural products as combinatorial library starting points [49]. Chemical content richness is found in compounds produced by multicomponent reactions (MCR), which are chemical transformations in which as many as four components form a new compound in a single chemistry reaction step.
The Ugi reaction of a carboxylic acid, amine, aldehyde and isonitrile is a classic example. While in theory offering an efficient approach to the synthesis of diverse compounds, MCR in high-throughput mode currently suffers from significant chemistry limitations [50]. The difference between drug-like and lead-like has been described [51]. Leads are less complex in most parameters than drugs, which is understandable in that medicinal chemistry optimization almost invariably increases MWT and Log P [52]. However, the structural resemblance between a starting lead and a drug is marked [53]. The implication is that a quality lead, as opposed to a flawed lead, is far more likely to lead to a real drug [54]. Lead-like discovery also refers to the screening of small molecule libraries with detection of weak affinities in the high micromolar to millimolar range. The process by itself usually does not lead to an acceptable chemistry starting point. Something else has to be added after the primary screen. Generally, multiple small molecules do not bind to non-adjacent target sites [55], so the screening is that of small MWT singletons. However, binding of two components to the same receptor site is possible, as attested by the discovery of sub-nanomolar ligands in what is termed click chemistry [56]. In this process an acetylene terminus and an azide terminus on two molecules bound independently at the receptor site cyclize to give a single compound in which the two components are linked via a 1,2,3-triazole ring. Filtering in the context of lead-like small molecule screening implies control of the properties of drug starting points that eventually result from this process. A rule of three [57] has been coined for small molecule fragment screening libraries: MWT < 300; Log P < 3; H-bond donors and acceptors < 3 and rotatable bonds < 3. Small fragment screening can be by NMR [58-60], by X-ray [61, 62], or in theory by any method capable of detecting weak interactions.
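As a minimal sketch of how the rule of three cutoffs quoted above could be applied to a fragment library, the following Python snippet checks precomputed properties against each limit. The `Fragment` container, its field names and the example values are hypothetical, and the strict inequalities simply follow the text as written; how the properties are calculated is outside the scope of the sketch.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Precomputed properties of one candidate fragment (hypothetical container)."""
    name: str
    mwt: float            # molecular weight
    log_p: float          # calculated or measured Log P
    h_donors: int         # H-bond donor count
    h_acceptors: int      # H-bond acceptor count
    rotatable_bonds: int  # rotatable bond count


def passes_rule_of_three(frag: Fragment) -> bool:
    """Return True when every property is below its rule of three cutoff [57]."""
    return (
        frag.mwt < 300
        and frag.log_p < 3
        and frag.h_donors < 3
        and frag.h_acceptors < 3
        and frag.rotatable_bonds < 3
    )


# Illustrative values only; not taken from any real compound.
library = [
    Fragment("frag_a", mwt=152.0, log_p=1.1, h_donors=1, h_acceptors=2, rotatable_bonds=1),
    Fragment("frag_b", mwt=342.0, log_p=3.6, h_donors=2, h_acceptors=5, rotatable_bonds=6),
]
kept = [f.name for f in library if passes_rule_of_three(f)]
print(kept)
```

Keeping the cutoffs in a single function makes it easy to swap in a different convention (for example, inclusive limits) if a screening group prefers one.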
The topic of filtering in human therapeutic drug discovery has received frequent reviews [23, 63-65] as well as criticism if fundamental medicinal chemistry principles are neglected [66]. The 'rule of 5' describes four simple parameters associated with improved prospects for oral activity. Poor solubility or poor permeability are more likely if there are > 5 H-bond donors (expressed as sum of OH and NH); > 10 H-bond acceptors (expressed as sum of O + N); MWT > 500 and Log P > 5. There are only four rules. The 5 in rule of 5 arises from the frequent appearance of a 5 in the cutoff parameters. Compound classes such as natural products, infectious disease drugs, etc., where transporter affinity is prevalent, are exceptions [24]. Rotatable bond count is now a widely used filter following the finding that greater than 10 rotatable bonds correlates with decreased rat oral bioavailability [67]. The mechanistic basis for the rotatable bond filter is unclear, since the rotatable bond count does not correlate with in vivo clearance rate in the rat, but the filter is reasonable from an in vitro screening viewpoint since ligand affinity on average decreases 0.5 kcal for each two rotatable bonds [68]. Compounds indexed in medicinal chemistry journals show the recent trend towards poor properties. Over 50% of medicinal chemistry compounds with activities above 1 nM have MW > 425, Log P > 4.25 and Log Sw < −4.75, indicating that these compounds are larger, more hydrophobic and less soluble when compared to time-tested quality leads [52]. The concept of the importance of compound properties (e.g., rule of 5 compliance) beyond potency is widely accepted [69], although there are notable occasional exceptions where an orally bioavailable compound is found that lies well outside the rule of 5 limits [70]. Can the rule of 5 be bypassed by delivering drug by a non-oral route, e.g., pulmonary, intra-nasal or dermal?
The answer depends very much on the dose. If the total dose is 20 mg or less then alternative delivery routes begin to be feasible. However, a limitation is that only about 10% of current clinical candidates have sufficient potency in the 0.1 mg/kg range to result in such a low dose, and finding such very potent compounds seems to be mostly a matter of luck [71]. Beyond chemistry-based features, oral drugs can also be defined by their biological target. It is striking that the 100 best selling (mostly oral) drugs are ligands for proteins encoded by only a very small subset of genes and that a very considerable portion of the targets for orally active drugs may have already been discovered [72]. The term 'druggable genome' has been coined to describe the severe restriction that chemistry considerations related to oral activity superimpose on possible biology target space [73]. A scheme for separating CNS from non-CNS active drugs in the WDI allowed discovery of simple parameters relating to passive blood brain barrier (BBB) permeability and prediction of p-glycoprotein (PGP) affinity [74]. The PGP transporter is a major barrier to the entry of compounds to the CNS [75]. Appropriately determined PGP efflux ratios can be used as a measure of compound affinity to PGP. However, the value of filters based on PGP efflux ratios from the commonly used high-throughput mode Caco-2 colonic cell culture permeability assay has been questioned, as efflux ratios do not correlate with in vivo rat brain penetration [76]. A PSA value of less than 60-70 Å² tends to identify CNS active compounds [77]. A very simple set of two rules predicts CNS activity. The first rule is that if N + O (the number of nitrogen and oxygen atoms) in a molecule is less than or equal to 5, it has a high chance of entering the brain. The second rule predicts that if log P − (N + O) is positive then the compound is CNS active [78].
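The simple CNS rules above lend themselves to a short sketch. The Python function below is an illustration only: the function name, argument names and the choice of 70 Å² as the default PSA cutoff (the upper end of the 60-70 Å² range quoted above) are assumptions, and the inputs are presumed to be already calculated.

```python
from typing import Dict, Optional


def cns_rule_flags(log_p: float, n_plus_o: int,
                   psa: Optional[float] = None,
                   psa_cutoff: float = 70.0) -> Dict[str, bool]:
    """Evaluate the simple CNS heuristics described in the text.

    n_plus_o   -- number of nitrogen plus oxygen atoms in the molecule
    psa        -- polar surface area in square angstroms (optional)
    psa_cutoff -- assumed cutoff; the text quotes a 60-70 range
    """
    flags = {
        # Rule 1 [78]: N + O of five or fewer suggests a high chance of brain entry.
        "n_plus_o_at_most_5": n_plus_o <= 5,
        # Rule 2 [78]: a positive value of log P - (N + O) suggests CNS activity.
        "log_p_minus_n_plus_o_positive": (log_p - n_plus_o) > 0,
    }
    if psa is not None:
        # PSA below the cutoff tends to identify CNS-active compounds [77].
        flags["psa_below_cutoff"] = psa < psa_cutoff
    return flags


# Illustrative inputs only.
print(cns_rule_flags(log_p=3.5, n_plus_o=2, psa=40.0))
print(cns_rule_flags(log_p=1.0, n_plus_o=6, psa=95.0))
```

The flags are reported separately rather than merged into one verdict, since the text presents them as independent heuristics and none of them accounts for transporter effects such as PGP efflux.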
More complex commercially available software programs have been compared as to their ability to predict CNS log BBB ratio [79]. Experimental and theoretical reasons support the belief that surface tension measurements can be predictors for blood brain permeability [80]. Predictors for absorption, distribution, metabolism and excretion (ADME) currently appear most useful in global models. Limitations in local models likely reflect a lack of quality experimental data sets [81], and user dissatisfaction may result from unrealistic expectations given the magnitude of experimental ADME errors [82]. An additional limitation to schemes for separating CNS from non-CNS compounds is the complexity of the BBB. Compounds with affinity to transporters are exceptions to physicochemically based filters like the rule of 5. This is a problem for the CNS since it is estimated that about 15% of all genes selectively expressed at the BBB encode for transporter proteins and that only about 50% of BBB transporters are currently known [83]. PSA in rather simple models is a commonly used parameter to predict intestinal permeability [84]. Its rule-based calculation (TPSA) is very fast and does not require 3D structure [85]. A better prediction of intestinal permeability has been reported when PSA is partitioned into smaller molecularly based components [86]. Using molecular surface properties, compounds selected from the World Health Organization's (WHO) list of essential drugs could be classified with 87% accuracy as to permeability and solubility using a six bin scheme similar to that in the FDA biopharmaceutical classification system [87]. Pharmacokinetic parameters including permeability can also be generated for filtering or ligand affinity prediction through the Volsurf software [88]. An analysis of small drug-like molecules suggests a filter of log D > 0 and < 3 enhances the probability of good permeability [89].
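Several of the simple property filters discussed in this chapter can be combined into one check. The sketch below, offered only as an illustration, counts rule of 5 violations [24] and adds the rotatable bond limit [67] and the log D window [89]. The `CompoundProps` container, the field names and the idea of reporting all three results together are assumptions rather than a published scheme.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CompoundProps:
    """Precomputed properties of one compound (hypothetical container)."""
    mwt: float
    log_p: float
    log_d: float
    h_donors: int         # sum of OH and NH
    h_acceptors: int      # sum of O and N atoms
    rotatable_bonds: int


def rule_of_5_violations(p: CompoundProps) -> List[str]:
    """List which of the four rule of 5 cutoffs are exceeded."""
    checks = [
        (p.h_donors > 5, "H-bond donors > 5"),
        (p.h_acceptors > 10, "H-bond acceptors > 10"),
        (p.mwt > 500, "MWT > 500"),
        (p.log_p > 5, "Log P > 5"),
    ]
    return [label for failed, label in checks if failed]


def oral_filter_report(p: CompoundProps) -> Dict[str, object]:
    """Report the three filters side by side; weighing them is left to the user."""
    return {
        "rule_of_5_violations": rule_of_5_violations(p),
        # More than 10 rotatable bonds correlated with lower rat oral bioavailability.
        "rotatable_bonds_ok": p.rotatable_bonds <= 10,
        # Log D between 0 and 3 was associated with good permeability.
        "log_d_in_window": 0 < p.log_d < 3,
    }


# Illustrative values only; not taken from any real compound.
small = CompoundProps(mwt=350.0, log_p=2.5, log_d=1.5,
                      h_donors=2, h_acceptors=5, rotatable_bonds=4)
large = CompoundProps(mwt=650.0, log_p=6.1, log_d=4.0,
                      h_donors=6, h_acceptors=12, rotatable_bonds=14)
print(oral_filter_report(small))
print(oral_filter_report(large))
```

Such a check would not be expected to hold for transporter substrates, which the text identifies as exceptions to physicochemically based filters.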
A collection of 222 commercially available drugs was used to determine the exclusion criteria that differentiate poorly absorbed drugs from well-absorbed drugs. Similar to the rule of 5, MWT < 500 and log P < 5 were associated with better absorbed compounds. Exceptions to the MWT criteria were compounds with a sugar moiety, high atomic weight and large cyclic structure [90], suggesting the involvement of absorptive biological transporter systems. Based on the intestinal absorption of 158 drug and drug-like compounds in rats, there is a significant relationship between rat intestinal absorption, and by extrapolation human absorption, and drug hydrogen-bond acidity and basicity [91]. Poor aqueous solubility is a widespread problem in combinatorial libraries, as opposed to poor intestinal permeability which is much less of a problem. About one-half of poor solubility is due to large size/lipophilicity. Log P > 5 identifies 75% of these compounds. The other 50% of poor aqueous solubility is due to crystal packing considerations for which there is no computational filter [92]. Melting point is an experimental indicator of crystal packing. Aqueous solubility decreases about 10× for each 100 °C rise in melting point, and so melting point, if available, is a valuable parameter in solubility prediction [93]. Progress toward a computational melting point is suggested by the 63% success in qualitative ranking of compounds into low-medium and high-solubility bins. Descriptors for hydrophilicity, polarity, partial atom charge and molecular rigidity were found to be positively correlated with melting point, whereas non-polar atoms and high flexibility within the molecule were negatively correlated [94]. Volume of distribution (VD) is a key pharmacokinetic parameter. A low VD of less than 1 l/kg identifies drugs residing in the plasma compartment.
A VD greater than 1 l/kg identifies compounds accessing tissue compartments outside the plasma compartment, e.g., many CNS drugs have VD values in the tens or higher. A recently developed computational approach to predict VD for neutral and basic drugs works as well as the in vivo experimental measurement, provided that accurate experimental compound log D and pKa are available. Predictivity is retained if computed log D and pKa are used, but accuracy declines somewhat [95, 96]. Approximately 80% of drugs are oxidized by the cytochrome P450 (CYP) family of enzymes; hence a decision tree for CYP substrate affinity is important. This has been described in that characteristics of CYP substrates, such as lipophilicity, MWT and hydrogen-bonding potential, govern selectivity towards individual CYPs [97]. Compounds with a marked propensity to bind to multiple targets, so-called nuisance compounds, are of little value in drug discovery. Such compounds can be experimentally identified by their binding to fetal calf serum [98]. It has long been known that compounds could be identified as reproducible actives in HTS screens that could not be optimized in chemistry. Such compounds often appear active in multiple screens that have no biological relationship. An analysis of such promiscuous compounds from HTS hits led to the conclusion that colloidal aggregates in the 50-1000 nm size range were responsible. The apparent HTS screen activity was due to a biophysical effect rather than to a normal ligand-receptor affinity, and hence the hits were unoptimizable in chemistry [99]. This promiscuous aggregation effect was found among 8 of 15 kinase inhibitors widely used in biology screening [100], emphasizing the importance of exclusionary filters to prevent wasting of biology research time by testing compounds with flawed properties.
The aggregation phenomenon has been found among known drugs, albeit only when tested at high non-physiological concentrations, and a predictive model was developed [101]. Filtering has also been applied to agrochemicals. Compared to drugs intended for human use, agrochemicals tend to have fewer hydrogen-bond donors [102]. For agrochemical screening, computationally intensive surface area parameters offered no advantage over the rule of 5 [103]. Analogous to drug-likeness, agrochem-likeness for large compound collections has been explored using support vector machines (SVM). In this study SVM performed better than neural networks [104].

[1] Property distribution of drug-related chemical databases
[2] An overview of the diversity represented in commercially-available databases
[3] Drug-like properties and the causes of poor solubility and poor permeability
[4] Drug-like index: a new approach to measure drug-like compounds and their diversity
[5] Property distributions: differences between drugs, natural products, and molecules from combinatorial chemistry
[6] Drug rings database with web interface. A tool for identifying alternative chemical rings in lead discovery programs
[7] The most common chemical replacements in drug-like compounds
[8] The properties of known drugs. 1. Molecular frameworks
[9] Properties of known drugs. 2. Side chains
[10] Pharmacokinetically based mapping device for chemical space navigation
[11] Optimizing the size and configuration of combinatorial libraries
[12] Multiobjective optimization of combinatorial libraries
[13] Methods for compound selection focused on hits and application in drug discovery
[14] Chemical information management in drug discovery: optimizing the computational and combinatorial chemistry interfaces
[15] Characteristic physical properties and structural fragments of marketed oral drugs
[16] A comparison of physiochemical property profiles of development and marketed oral drugs
[17] Examination of the computed molecular properties of compounds selected for clinical development
[18] The rule of five revisited
[19] Pulmonary absorption rate and bioavailability of drugs in vivo in rats: structure-absorption relationships and physicochemical profiling of inhaled drugs
[20] Modeling discrimination between antibacterial and nonantibacterial activity based on 3D molecular descriptors
[21] Design strategies for building drug-like chemical libraries
[22] High-speed chemistry libraries: assessment of drug-likeness
[23] Computational approaches towards the rational design of drug-like compound libraries
[24] Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings
[25] Recognizing molecules with drug-like properties
[26] Protein secondary structure templates derived from bioactive natural products
[27] RECAP - Retrosynthetic combinatorial analysis procedure: a powerful new technique for identifying privileged molecular fragments with useful applications in combinatorial chemistry
[28] Natural products as sources of new drugs over the period 1981-2002
[29] Reactive compounds and in vitro false positives in HTS
[30] Strategic pooling of compounds for high-throughput screening
[31] Drug-like annotation and duplicate analysis of a 23-supplier chemical database totaling 2.7 million compounds
[32] Virtual screening for SARS-CoV protease based on KZ7088 pharmacophore points
[34] Simple selection criteria for drug-like chemical matter
[35] Privileged structures - an update
[36] Privileged structure-based combinatorial libraries targeting G protein-coupled receptors
[37] The combinatorial synthesis of bicyclic privileged structures or privileged substructures
[38] Exploring privileged structures: the combinatorial synthesis of cyclic peptides
[39] Recognition of privileged structures by G-protein coupled receptors
[40] Target information in lead discovery
[41] Similarity metrics for ligands reflecting the similarity of the target proteins
[42] Privileged molecules for protein binding identified from NMR-based screening
[43] Medicinal chemistry of target family-directed masterkeys
[44] Unraveling principles of lead discovery: from unfrustrated energy landscapes to novel molecular anchors
[45] Molecular anchors with large stability gaps ensure linear binding free energy relationships for hydrophobic substituents
[46] Implications of protein flexibility for drug discovery
[47] Small molecule lead generation processes for drug discovery
[48] A planning strategy for diversity-oriented synthesis
[49] From protein domains to drug candidates: natural products as guiding principles in the design and synthesis of compound libraries
[50] Multi-component reactions: emerging chemistry in drug discovery from xylocain to crixivan
[51] Is there a difference between leads and drugs? A historical perspective
[52] Current trends in lead discovery: are we looking for the appropriate properties?
[53] Drugs, leads, and drug-likeness: an analysis of some recently launched drugs
[54] Compound properties and drug quality
[55] The consequences of translational and rotational entropy lost by small molecules on binding to proteins
[56] The growing impact of click chemistry on drug discovery
[57] A 'rule of three' for fragment-based lead discovery?
[58] Design of small molecule libraries for NMR screening and other applications in drug discovery
[59] Applications of SHAPES screening in drug discovery
[60] Strategies for NMR Screening and Library Design
[61] Structure-based screening of low-affinity compounds
[62] Application and limitations of X-ray crystallographic data in structure-based ligand and drug design
[63] Molecular recognition: the fragment approach in lead generation
[64] Filtering databases and chemical libraries
[65] Computational Methods for the Analysis of Molecular Diversity
[66] Opinion: drug research: myths, hype and reality
[67] Molecular properties that influence the oral bioavailability of drug candidates
[68] Functional group contributions to drug-receptor interactions
[69] Virtual screening in lead discovery: a viewpoint
[70] Large dimeric ligands with favorable pharmacokinetic properties and peroxisome proliferator-activated receptor agonist activity in vitro and in vivo
[71] Advancing new drug delivery concepts to gain the lead
[72] Knockouts model the 100 best-selling drugs - will they model the next 100?
[73] Opinion: the druggable genome
[74] Blood-brain barrier permeation models: discriminating between potential CNS and non-CNS drugs including P-glycoprotein substrates
[75] Passive permeability and P-glycoprotein-mediated efflux differentiate central nervous system (CNS) and non-CNS marketed drugs
[76] Caco-2 permeability, P-glycoprotein transport ratios and brain penetration of heterocyclic drugs
[77] Polar molecular surface as a dominating determinant for oral absorption and brain penetration of drugs
[78] Computational approaches to the prediction of the blood-brain distribution
[79] Recent advances in the prediction of blood-brain partitioning from molecular structure
[80] Surface activity profiling of drugs applied to the prediction of blood-brain barrier permeability
[81] In silico ADME prediction: data, models, facts and myths
[82] In silico ADME/Tox: why models fail
[83] Blood-brain barrier genomics and the use of endogenous transporters to cause drug penetration into the brain
[84] Experimental and computational screening models for the prediction of intestinal drug absorption
[85] Fast calculation of molecular polar surface area as a sum of fragment-based contributions and its application to the prediction of drug transport properties
[86] Intestinal Absorption: the Role of Polar Surface Area. Methods and Principles in Medicinal Chemistry
[87] Absorption classification of oral drugs based on molecular surface properties
[88] Surface descriptors for protein-ligand affinity prediction
[89] A structure-permeability study of small drug-like molecules
[90] Molecular and pharmacokinetic properties of 222 commercially available oral drugs in humans
[91] Quantitative relationship between rat intestinal absorption and Abraham descriptors
[92] Integration of physicochemical property considerations into the design of combinatorial libraries
[93] Prediction of drug solubility by the general solubility equation (GSE)
[94] Molecular descriptors influencing melting point and their role in classification of solid drugs
[95] Prediction of human volume of distribution values for neutral and basic drugs. 2. Extended data set and leave-class-out statistics
[96] Prediction of volume of distribution values in humans for neutral and basic drugs using physicochemical measurements and plasma protein binding data
[97] Substrate SARs in human P450s
[98] Affinity-based high-throughput screening of orphan targets: practical solutions for removing promiscuous binders, Abstracts of Papers
[99] A specific mechanism of nonspecific inhibition
[100] Kinase inhibitors: not just for kinases anymore
[101] Identification and prediction of promiscuous aggregating inhibitors among known drugs
[102] Selecting the right compounds for screening: does Lipinski's rule of 5 for pharmaceuticals apply to agrochemicals?
[103] Selecting the right compounds for screening: use of surface-area parameters
[104] Drug discovery using support vector machines. The case studies of drug-likeness, agrochemical-likeness, and enzyme inhibition predictions