id: cord-127741-h23w89h2
author: Babuji, Yadu
title: Targeting SARS-CoV-2 with AI- and HPC-enabled Lead Generation: A First Data Release
date: 2020-05-28
pages:
extension: .txt
mime: text/plain
words: 2439
sentences: 149
flesch: 49
summary: In this first data release, we make available 23 datasets collected from community sources representing over 4.2 B molecules enriched with pre-computed: 1) molecular fingerprints to aid similarity searches, 2) 2D images of molecules to enable exploration and application of image-based deep learning methods, and 3) 2D and 3D molecular descriptors to speed development of machine learning models. For example, these data now include the 2D and 3D molecular descriptors, computed molecular fingerprints, 2D images representing the molecule, and canonical simplified molecular-input line-entry system (SMILES) [6] structural representations to speed development of machine learning models. We expect forthcoming data releases to extend to molecular conformers; incorporate the results of natural language processing extractions of drugs from COVID-related literature; provide the results of molecular docking simulations against SARS-CoV-2 viral and host proteins; and include the trained machine learning models that the team is building to identify top candidates for running various, more expensive calculations.
cache: ./cache/cord-127741-h23w89h2.txt
txt: ./txt/cord-127741-h23w89h2.txt