key: cord-0962259-bplbgrbo authors: Huryn, Donna M.; Cosford, Nicholas D.P. title: Chapter 26 The Molecular Libraries Screening Center Network (MLSCN): Identifying Chemical Probes of Biological Systems date: 2007-11-07 journal: Annu Rep Med Chem DOI: 10.1016/s0065-7743(07)42026-7 sha: 4bd1459498dfaebcf6baf63e6fb40c532f3b2b1b doc_id: 962259 cord_uid: bplbgrbo

The NIH Molecular Libraries Screening Center Network (MLSCN) is a subset of the Molecular Libraries Initiative (MLI) component of the NIH Roadmap for Medical Research. The ultimate goal of the MLSCN and the MLI is to expand the availability, flexibility, and use of small-molecule chemical probes for basic research. A number of aspects of the MLSCN distinguish this initiative from other academic screening centers. First, all researchers have access to the screening centers through the NIH X01 and R03 funding mechanisms. Second, because of the diverse sources of assays and the wide expertise available within the MLSCN, specific biological systems investigated and screened will include: (1) “high risk” targets, that is, proteins or biological systems whose function is unknown; (2) targets implicated in orphan diseases or diseases not typically addressed by the private sector; (3) novel or uncommon assay systems; and (4) “non-druggable” targets, such as inhibitors of aggregation and protein–protein interactions. Third, the small molecule screening library contains structures not typically found in commercial collections or those housed in pharmaceutical companies. Fourth, as the goal of the MLSCN is to develop selective chemical probes and small molecule tools that will interrogate novel biochemical pathways, the criteria for an acceptable class of molecules are broader for the MLSCN than for those involved in drug discovery and development.
Fifth, integral medicinal chemistry within each MLSCN Center allows the network to produce chemical probes with particular properties, rather than simply identifying apparent activities from the screening collection.

The NIH Molecular Libraries Screening Center Network (MLSCN) is a subset of the Molecular Libraries Initiative (MLI) component of the NIH Roadmap for Medical Research [1, 2]. It consists of a consortium of 10 centers, each having expertise in assay development, high-throughput screening (HTS), chemistry and informatics. Using a centralized screening library of approximately 100,000 small molecules and assays from the research community, scientists at each center optimize assays, carry out high-throughput screens, and deposit the results into PubChem (vide infra). Based on the data from those assays, chemists at each center further optimize the initial hits in order to develop unique, small molecule probes of biological systems [3]. Informaticists contribute to data handling and analysis throughout this process. The chemical probes developed are available to researchers (both public and private sectors) via data deposition into PubChem and, in the future, via access to samples and to synthetic protocols for preparing the probes. These efforts support the ultimate goal of the MLSCN and the MLI, which is to "expand the availability, flexibility, and use of small-molecule chemical probes for basic research" [1].

A number of aspects of the MLSCN distinguish this initiative from other academic screening centers [4, 5], as well as from screening and lead optimization activities being undertaken at pharmaceutical and biotechnology companies. First, all researchers (public and private sector) have access to the screening centers through the NIH X01 and R03 funding mechanisms [6].
Second, due to the diverse sources of assays and the wide expertise available within the MLSCN, specific biological systems investigated and screened will include: (a) "high risk" targets, that is, proteins or biological systems whose function is unknown; (b) targets implicated in orphan diseases or diseases not typically addressed by the private sector; (c) novel or uncommon assay systems (e.g., zebrafish, high content screening); and (d) "non-druggable" targets, such as inhibitors of aggregation and protein-protein interactions. Third, the small molecule screening library contains structures not typically found in commercial collections or those housed in pharmaceutical companies. Sources of these unique structures include natural products and novel compound libraries prepared by academic investigators through the Pilot Scale Libraries (PSL) granting mechanisms [7], compounds generated by the Centers for Chemical Methodologies and Library Development (CMLD) (vide infra) [8], and those obtained through solicitation by the NIH [9]. Fourth, as the goal of the MLSCN is to develop selective chemical probes and small molecule tools that will interrogate novel biochemical pathways, the criteria for an acceptable class of molecules are broader for the MLSCN than for those involved in drug discovery and development. Therefore, chemical probes are not subject to the same constraints on physical properties, functional groups or metabolic profiles that are common in the pharmaceutical industry and necessary for successful clinical candidates. An example of a compound that would fit the definition of a valuable chemical probe, but which would not adhere to the commonly prescribed criteria of "drug-like", is a staurosporine-derived ruthenium complex, shown below, which is a selective, sub-nanomolar inhibitor of the kinase Pim1 [10].
Fifth, and particularly important for this review, integral medicinal chemistry within each MLSCN Center allows the network to produce chemical probes with particular properties, rather than simply identifying apparent activities from the screening collection. Finally, unlike all other screening efforts in both industry and academia, all data are available in PubChem with no delay in publication. In addition to HTS protocols and primary screening results, secondary assay data, "profiling data" (e.g., aggregation evaluation, cytochrome P450 inhibition, spectroscopic profiling, solubility measurements), follow-up compound libraries and their associated biological data, and synthetic protocols are accessible.

The NIH Chemical Genomics Center (NCGC) [11] is an ultrahigh-throughput screening (uHTS) and chemistry center that applies the tools of small molecule screening and discovery to develop chemical probes for the study of protein and cell functions. Using a process called quantitative high-throughput screening (qHTS), chemical libraries are screened at multiple concentrations (typically seven) to generate a concentration-response curve for each compound that covers a range of five orders of magnitude (typically 1 nM to 100 µM). qHTS comprehensively and efficiently characterizes biological activities of large chemical libraries to yield high-quality datasets for chemical probe development and compound profiling [12]. This process has been applied successfully to both cell-free and cell-based assays. The Kalypsys robotic system uses multimodal detectors, the ViewLux and EnVision systems, and a plate-based laser cytometry system (Acumen Explorer) for high-capacity screening (100,000+ wells/day) in the reagent-sparing 1536-well plate format.
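The qHTS titration described above can be illustrated with a short sketch. It assumes the seven concentrations are evenly spaced on a logarithmic scale between 1 nM and 100 µM (the endpoints consistent with the stated five orders of magnitude) and models the response with a generic four-parameter Hill equation. The function names, the even spacing, and the example AC50 of 30 nM are illustrative assumptions, not the NCGC protocol.

```python
import math


def log_spaced_series(c_min, c_max, n_points):
    """Return n_points concentrations evenly spaced on a log10 scale.

    Assumption: even log spacing; a real titration may use a fixed
    dilution factor instead.
    """
    if n_points < 2:
        raise ValueError("need at least two concentrations")
    lo, hi = math.log10(c_min), math.log10(c_max)
    step = (hi - lo) / (n_points - 1)
    return [10 ** (lo + i * step) for i in range(n_points)]


def hill_response(conc, ac50, hill_slope=1.0, bottom=0.0, top=100.0):
    """Four-parameter Hill model: percent response at a given concentration."""
    return bottom + (top - bottom) / (1.0 + (ac50 / conc) ** hill_slope)


# Seven-point series in molar units: 1 nM to 100 uM.
concs = log_spaced_series(1e-9, 1e-4, 7)

# Span of the series in orders of magnitude.
orders = math.log10(concs[-1] / concs[0])

# Simulated concentration-response curve for a hypothetical compound
# with AC50 = 30 nM (illustrative value only).
responses = [hill_response(c, ac50=3e-8) for c in concs]

for c, r in zip(concs, responses):
    print(f"{c:.3e} M -> {r:5.1f} % response")
```

Because every compound is measured across the whole series, an AC50 can in principle be estimated for each compound directly from the primary screen rather than from a separate follow-up titration.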
Assay detection capabilities include absorbance, luminescence, fluorescence resonance energy transfer (FRET), time-resolved FRET (TR-FRET), fluorescence polarization (FP), fluorescence intensity (FI), and AlphaScreen™, as well as cell-based imaging assays that employ fluorescent proteins such as GFP (green fluorescent protein). The NCGC also develops new paradigms for screening, informatics, and chemical probe development that extend the application of small molecule technology to new areas of the genome. Collaborations with academic investigators worldwide, as well as with pharmaceutical and biotechnology companies, produce public domain data, thereby allowing sharing of best practices to enable both chemical genomics and downstream drug development. The NCGC accepts chemical libraries for screening from academic and industrial investigators, and produces its own focused libraries for specific projects that are added to the screening collection. With its miniaturized qHTS process, as little as 0.1-0.5 mg of compound will support several years of screening against hundreds of diverse biological assays. The NCGC is part of the intramural NIH program of the National Human Genome Research Institute.

The Penn Center for Molecular Discovery (PCMD) [13] approaches the high volume screening challenge with unique capabilities. A key technology for the Center is the ability to print thousands of molecules on a glass surface the size of a business card, and then rapidly test these molecules against proteases and other enzymes purified from human or animal cells, bacteria, parasites, insects, or viruses [14]. Scientists at the Penn Center are also able to test compounds in thousands of miniature wells, each containing a millimeter-sized Danio rerio (zebrafish), an unlikely organism that has proven its worth in studies of heart and nerve function, as well as in cancer biology, because the transparent fish is easily imaged.
The PCMD, based in Philadelphia, is surrounded by local HTS industrial labs including those of Merck, GSK, Wyeth, and Johnson & Johnson. These industrial connections help transfer HTS skills into the university environment of this center.

The Emory Chemical Biology Center in the MLSCN has the capability to adapt and optimize all target-based and phenotypic assays selected by the MLSCN, but has identified protein-protein interactions for small molecule probe discovery as the Center's theme. With two general screening platforms established, the Center is able to perform both HTS and high content screening (HCS) using a variety of in vitro biochemical assays, cell reporter assays, and cell phenotype-based assays. In particular, this center is experienced in assays for monitoring protein-protein interactions and enzyme activities with fluorescence-based assays, including FI, FP, and FRET. Examples of assays that are within this center's capacity include protein-protein interactions (FI, FP, FRET, AlphaScreen™), enzyme assays (FI, FP, and other coupled assays), receptor-ligand interaction assays (FI, FRET, Ca2+ imaging), reporter assays (luciferase, GFP, etc.), viability assays and protein translocation assays (e.g., receptor internalization and membrane and nuclear localization).

In common with the other MLSCN centers, the overall goal of the Pittsburgh Molecular Libraries Screening Center (PMLSC) [15] is to provide the scientific community access to a facility that is designed to optimize, validate, and implement assays for HTS based on optical-based detection methods to identify chemical probes, and to deposit these data into the PubChem database. The PMLSC has been designed for maximum flexibility with regard to target classes and assay formats. The center develops and implements cell-based, biochemical and model organism [Danio rerio (zebrafish) and Drosophila melanogaster (fly)] assays, preferentially in the 384-well plate format.
Existing optical-based detection capabilities include automated HCS imaging platforms, absorbance, FI, FP, time-resolved fluorescence (TRF), FRET, and luminescence. The PMLSC also examines the structure-activity relationships of active small molecules and synthesizes probe molecules that demonstrate significant potency and target selectivity.

The Southern Research Molecular Libraries Screening Center (SRMLSC) is based at Southern Research Institute (SRI), where more than 20 anti-cancer agents have been discovered and entered into clinical trials, six of which received FDA approval and proceeded to market. The SRMLSC brings extensive drug discovery and development expertise to the network, especially in the areas of cancer, neurological diseases/CNS disorders, and infectious disease (HIV, hepatitis, TB, and emerging pathogens including influenza, H5N1 avian flu, West Nile virus, and SARS coronavirus). The screening center has broad capabilities to implement any cellular, molecular, or target-based assay, including those that require BSL-3 containment, and optimizes or miniaturizes them as necessary. The HTS facility is equipped with state-of-the-art instrumentation to screen in up to 1536-well plate format, including two ORCA robotic rails, multiple plate readers, two Biomek FX liquid handlers, a BioRaptr, and an Echo 550 for nanoliter volume dispensing. In addition, the Center uses a high-speed automated Evotec Opera confocal microscope for high-throughput imaging assays. Data analysis is performed by scientists with expertise in molecular modeling, predictive algorithms, and QSAR analysis, using a robust assortment of chemoinformatics software packages.

The San Diego Center for Chemical Genomics (SDCCG) [16], located in the biotech-rich heart of La Jolla, California, has broad expertise in biochemistry and the ability to run almost any assay type.
Specific biological themes include targets involved in regulating cell death, using a variety of biochemical and cell-based assays, with particular emphasis on kinases, phosphatases and proteases. Another area of expertise is in phenotypic assays for stem cell differentiation, using fluorescent reporters. Two main technological themes are incorporated into the SDCCG. First, the Center has special expertise in high-throughput microscopy as a tool for performing high-content, cell-based screens where cellular phenotypes drive compound selection in an unbiased manner. Second, the Center is unique across the network in having the capability to perform NMR-based small-molecule screening and optimization. NMR-based methods are exceptionally valuable when investigating molecular targets that are not easily tractable by other methods, such as protein-protein interactions and protein targets that cannot be formatted for the classical HTS environment.

The Scripps Research Institute Molecular Screening Center [17] spans the Scripps campuses in La Jolla, California, and West Palm Beach, Florida. Scripps has brought together an integrated combination of infrastructure, people and technologies that can support the identification of proof-of-concept small molecules in the academic setting. These small molecules comprise chemical probes of adequate potency, selectivity, physical properties and stability to show robust activities in cell-based assays and in vivo, allowing pre-competitive advancement to fields of breaking biology. The center is equipped with a fully automated Kalypsys screening system, with plate hotels and incubators, 200 nL to 20 µL volume dispensing, 1536-well aspiration, and 1536 pintool heads for compound delivery for uHTS screening of larger compound decks. Plate readers and detectors include the ViewLux CCD-based plate reader and the EnVision multimode detector equipped with AlphaScreen™. Assay formats include TRF, FP, FI, FRET, luminescence, and absorbance.
This center is vertically integrated with enterprise-scale data management and chemoinformatics, high throughput LC/MS for compound quality assurance and rodent pharmacokinetics, and facilities for downstream synthetic follow-up by modular, library or linear chemical approaches. The goal of the Scripps Center is the rapid, collaborative publication of interesting compounds that advance the understanding of biological problems, or illuminate new nodal control points in physiology by short-term chemical perturbation.

The strength and experience of the MLSCN Center at Columbia University are in cell biology, high content/high-resolution automated cellular imaging and image analysis, and phenotypic assay design and implementation. The main imaging platform of this center is the INCell Analyzer 3000 (GE Healthcare), a state-of-the-art high throughput cell imaging system. The INCell instrument uses three laser lines for excitation: a krypton laser (647 nm) and an argon laser (364 and 488 nm). Three fluorescence channels can be recorded by three independent high-speed 12-bit CCD cameras, and emitted light in the wavelength range from 420 to 720 nm can be captured. Connected to a Kendro Plate Hotel and a Mitsubishi robotic arm, the system can image and analyze 222 plates (96/384 well) without supervision. Depending on the specific assay, up to 50,000 wells can be processed per day. A whole array of different image analysis modules is available, and analysis is performed at high speed on the fly. This imaging system enables the center to screen and analyze a very broad variety of assays monitoring a wide spectrum of biological processes.

The New Mexico MLSC has developed innovative flow cytometry tools for discovery research that enable homogeneous analysis of ligand binding and protein-protein interactions, high throughput sample handling, high content analysis, and real-time measurements of cell response.
Using their novel HyperCyt® screening technology, virtually any molecular assembly or cell response can be displayed in an HTS format compatible with flow cytometry, and assessing both cellular and molecular activities of small molecules is possible. Moreover, by creating a suspension array of particles, assays and responses can be highly multiplexed or performed on complex cell populations without loss of throughput. It is likely that no single competing technology offers the versatility of flow cytometry for MLI screening or has the potential of being available to such a large number of laboratories that house flow cytometers (20,000 world-wide). The Center brings together expertise that spans biomedical, biophysical, chemical, computational, instrumentation, and engineering disciplines, and is particularly interested in enhancing the overall discovery process through the integration of physical screening and computational tools that include virtual screening, chemoinformatics, and data mining.

The goal of the Vanderbilt Screening Center for G-protein Coupled Receptors (GPCRs), Ion Channels, and Transporters [19] is to enable investigators to discover and develop a new generation of small molecule probes to promote our understanding of physiological and disease processes, with a particular emphasis on the structure and function of GPCRs, ion channels, and transporters. Measurements for biochemical, cellular and cell-free assays are made using a wide variety of commercially available and novel technologies. The suite of detection modalities includes two Hamamatsu FDSS kinetic imaging plate readers. These instruments are capable of collecting data from all wells of 96- or 384-well plates simultaneously, and during integrated reagent addition, at up to 10 frames per second over wavelengths from UV to far red with dual excitation, emission, and fluorescence polarization modes.
The FDSS also supports ultra low-light detection for aequorin and other kinetic/flash luminescence formats. Additionally, the Vanderbilt Center supports high-content screening through the use of the BlueShift Isocyte, a laser scanning fluorimeter that generates two-dimensional anisotropy data. This combination of capabilities paired with robust automation provides tremendous flexibility for measuring the action of a test compound on a wide range of targets. The investment in infrastructure, the combination of basic and industrial research expertise, the dedication to translational and chemical biology, and the establishment and maintenance of a highly collaborative environment make the Vanderbilt Center well suited to support the MLSCN.

PubChem is a comprehensive, publicly accessible database developed by the National Center for Biotechnology Information at the National Library of Medicine that contains information on the biological activities of small molecules [20]. As of March 2007, PubChem contained more than 15 million records, 10 million unique structures, and data from over 400 assays. The database is linked to other Entrez databases such as PubMed and PubMed Central [21]. All data (e.g., assay results, secondary assays, structures of compounds synthesized) generated within the MLSCN are deposited into PubChem. Access to this range of data on a large library of diverse compounds has enormous potential for use by the informatics community for the development of computational models, pharmacophore models, and other algorithms to predict biological activities and properties of small molecules.

Two distinct aspects of the MLSCN require participation by, and input from, chemists: first, synthetic chemistry is a source of compounds within the screening library, and therefore of the assay hits; and second, expertise in synthetic and medicinal chemistry is required to optimize the hits into usable probes of biological systems.
Each screening center has medicinal and synthetic chemistry expertise in order to optimize hits identified from HTS campaigns and develop them into chemical probes. Specific capabilities vary; however, typical strategies employed include parallel synthesis, computational and informatics analysis, and analytical capabilities such as LC/MS techniques. The structures of novel compounds that are prepared, their synthetic protocols, analytical data and biological data are all available, and samples of final probes developed are deposited into the MLSMR. A Working Group composed of chemists from each center meets regularly to share information and best practices, and to ensure optimal use of resources. Another key component of the Molecular Libraries Initiative is the development of novel technologies for generating chemical diversity, the application of those

The Pittsburgh Molecular Libraries Screening Center reported the identification of SID 3717140 as an inhibitor of Mitogen-activated Protein Kinase Phosphatase 1 (MKP-1) [24]. This initial hit exhibited only modest potency (IC50 = 19.2 µM); however, it appeared to display some selectivity against other phosphatases, and excellent selectivity against the 52 other targets tested [25]. A small library of compounds was designed and prepared in an effort to identify compounds with improved potency. Towards that end, several new uracil-based compounds, such as SID 14715524, exhibited improvements in potency. MKP-1 is a dual-specificity phosphatase involved in a number of processes related to cell proliferation. The availability of potent, selective and cell permeable probes would help enable a thorough understanding of the role this enzyme plays in cell cycle, signal transduction, oncogenesis, and apoptosis. While small molecule inhibitors of MKP-1 have been previously reported [26], they have been hampered by low in vitro potency, lack of cellular activity, and poor selectivity.
As such, these uracil quinolines from the PMLSC represent a novel structural class which, based on their promising physicochemical properties, may provide an improvement over those inhibitors previously reported.

Bcl-2 family proteins play a crucial role in tissue homeostasis and apoptosis (programmed cell death). The BH3-interacting domain death agonist (BID) is a proapoptotic member of the Bcl-2 family, promoting cell death when activated by caspase-8, which cleaves BID to its truncated active form, tBID. NMR-based screening of a library composed of 300 fragments followed by SAR optimization by interligand NOE led to the identification of two chemical fragments that bind on the surface of BID. Covalent linkage of the two fragments provided high-affinity bidentate derivatives such as BI-11A7 [27, 28]. In vitro and cellular assays showed that these compounds prevent tBID translocation to the mitochondrial membrane and the subsequent release of proapoptotic stimuli, and inhibit neuronal apoptosis in the low micromolar range. These compounds may lead to therapeutic agents with the potential to treat disorders associated with BID activation including neurodegenerative diseases, cerebral ischemia, and brain trauma.

Screening of over 66,000 compounds from the MLSMR by scientists at the PCMD for inhibitors of Cathepsin B resulted in the identification and characterization of an alternate substrate, SID 16952359 [29]. This study also describes issues relating to the nucleophilicity of dithiothreitol (DTT) and cysteine, reductants frequently used in HTS protocols, and the potential for reactivity with electrophilic sites of probe molecules.

The development of quantitative high-throughput screening (qHTS) paradigms that provide concentration-response curves for large chemical libraries in a single experiment is a major focus of the NIH Chemical Genomics Center [12].
This strategy was applied to a screen for inhibitors of the enzyme pyruvate kinase, and allowed SAR development directly from primary screening data, as well as rapid analysis and triage of active clusters. From this analysis, a class of cyanooxazole inhibitors, exemplified by SID 862236, was identified that exhibited activity in the nanomolar range (AC50 = 30 nM). Importantly, structurally related analogs exhibited a range of potency from the nanomolar to inactive ranges. Activators were also identified in the same HTS experiment: SID 3712493 activated pyruvate kinase at an AC50 concentration of 600 nM [30].

Gaucher's disease is an inherited disorder characterized by deficiencies of glucocerebrosidase activity. An assay to identify small molecule inhibitors of glucocerebrosidase was developed [31], and three probes were identified. NCGC00092410 was identified by testing a series of purchased analogs of an initial hit. SID 4264637 and SID 847960 were members of the initial screening library [32]. Several of these probes have been shown to restore glucocerebrosidase activity in cultured cells from Gaucher patients, a result consistent with the correction of trafficking of misfolded glucocerebrosidase.

3.6 S1P1 antagonist probe (Scripps)

Sphingosine 1-phosphate (S1P) regulates vascular barrier and lymphoid development, as well as lymphocyte egress from lymphoid organs, by activating high-affinity S1P1 receptors. Based on phosphate esters (such as the structure below where X = O), the reversible S1P1 antagonist (X = CH2) was designed to provide a non-reactive chemical probe with in vivo activity [33]. This compound was used to gain mechanistic insights into S1P systems organization not accessible through genetic manipulations and to investigate their potential for therapeutic modulation.
Vascular (but not airway) administration of the preferred R enantiomer of this compound induced the loss of capillary integrity in mouse skin and lung, but did not affect the number of constitutive blood lymphocytes. Instead, alteration of lymphocyte trafficking and phenotype required supraphysiological elevation of S1P1 tone and was reversed by the antagonist. In vivo two-photon imaging of lymph nodes confirmed requirements for obligate agonism, and the data were consistent with the presence of a stromal barrier mechanism for gating lymphocyte egress. Chemical modulation revealed differences in S1P-S1P1 'set points' among tissues and highlighted both mechanistic advantages (lymphocyte sequestration) and risks (pulmonary edema) of therapeutic intervention.

Researchers at SRMLSC recently developed an HTS assay that allowed the identification of potential inhibitors of the severe acute respiratory syndrome coronavirus (SARS CoV) from large compound libraries [34]. The luminescence-based assay, which measured the inhibition of SARS CoV-induced cytopathic effects (CPE) in Vero E6 cells, was validated with two different diversity sets of compounds against the SARS CoV. The hit rate for both libraries was approximately 0.01%. The validated HTS assay was then employed to screen a 100,000-compound library against SARS CoV. The hit rate for the library in a single-dose format was determined to be approximately 0.8%. Screening of the three libraries resulted in the identification of several novel compounds that effectively inhibited the CPE of SARS CoV in vitro. Three hit compounds, shown below, were identified as promising lead candidates for further evaluation.

The team at SRMLSC also recently developed a screen for pantothenate synthetase (PS). PS (EC 6.3.2.1) is encoded by the panC gene and catalyzes the essential adenosine triphosphate (ATP)-dependent condensation of D-pantoate and β-alanine to form pantothenate in bacteria, yeast, and plants.
Pantothenate is a key precursor for the biosynthesis of coenzyme A (CoA) and acyl carrier protein (ACP). Because the enzyme is absent in mammals, and both CoA and ACP are essential cofactors for bacterial growth, PS is an attractive chemotherapeutic target. An automated high-throughput screen was developed to identify drugs that inhibit Mycobacterium tuberculosis PS. The activity of PS was measured spectrophotometrically through an enzymatic cascade involving myokinase, pyruvate kinase, and lactate dehydrogenase. The rate of ATP utilization by PS was quantitated by the reduction of absorbance due to the oxidation of NADH to NAD+ by lactate dehydrogenase, which allowed for an internal control to detect interference from compounds that absorb at 340 nm. This coupled enzymatic reaction was used to screen 4080 compounds in a 96-well format. This led to the discovery of a novel inhibitor of PS that exhibits potential as an antimicrobial agent [35].

3.9 GPCR30 antagonist (New Mexico MLSC)

Researchers at the New Mexico MLSC used a combination of virtual and biomolecular screening to discover a selective agonist of GPR30. Estrogen is a hormone critical in the development, normal physiology and pathophysiology of numerous human tissues. The effects of estrogen have traditionally been solely ascribed to estrogen receptor α (ERα) and more recently ERβ, members of the soluble nuclear ligand-activated family of transcription factors. However, it was recently shown that GPR30 binds estrogen with high affinity and resides in the endoplasmic reticulum, where it activates multiple intracellular signaling pathways. To differentiate between the functions of ERα, ERβ and GPR30, the New Mexico MLSC team used a combination of virtual and biomolecular screening to isolate compounds that selectively bind to GPR30.
Further studies led to the identification of the first GPR30-specific agonist, G-1 (shown below) capable of activating GPR30 in a complex environment of classical and new estrogen receptors [36] . The completion of the human genome project, a coordinated effort between government, academia, and industry, has prompted a vast expanse of medical research focused on the understanding of the fundamental causes of human disease. The Molecular Libraries and Imaging initiative is a natural extension of that groundbreaking effort. By providing access to assays, HTS capabilities, small molecule libraries and chemical optimization expertise, novel chemical probes are being developed that will allow the study of gene function, biochemical pathways, and cellular biology. It is also possible that through this initiative, starting points for new drugs, particularly of rare diseases, will be identified. The data deposited into PubChem via this effort will also serve as an unprecedented source of information for scientists in all biomedical disciplines. The availability of large datasets of biological activity on a common set of compounds should serve to stimulate advances in computational and predictive models of biological activities and chemical properties. This dataset should also expedite biomedical research through its use to evaluate selectivity, toxicity, and off-target activities of compounds. The value and success of this initiative may not be obvious or measurable for a number of years, but the adoption of some of the technologies and approaches (e.g., HTS, hit optimization) typically available only inside the pharmaceutical industry should provide training, opportunities and inspiration to the wider scientific community.