key: cord-1032266-mjgef3i6 authors: Liu, Bing; Zhou, Jiaju title: SARS‐CoV protease inhibitors design using virtual screening method from natural products libraries date: 2005-02-03 journal: J Comput Chem DOI: 10.1002/jcc.20186 sha: 16058badf249485ec705cf285aa7f3ea4606d7b2 doc_id: 1032266 cord_uid: mjgef3i6

Two natural products databases, the marine natural products database (MNPD) and the traditional Chinese medicines database (TCMD), were screened virtually to find novel structures of potent SARS‐CoV protease inhibitors. Before screening, the databases were filtered by Lipinski's rule of five (ROF) and Xu's extension rules. The results were analyzed by statistical methods to eliminate the bias of target‐based database screening toward higher molecular weight compounds and thereby enhance the hit rate. Eighteen lead compounds were recommended by the screening procedure. They should be useful to experimental scientists in prioritizing drug candidates and studying the interaction mechanism. The binding mechanism between the best-scoring compound and the SARS protein was also analyzed. © 2005 Wiley Periodicals, Inc. J Comput Chem 26: 484–490, 2005

Severe acute respiratory syndrome (SARS) is a serious epidemic disease that spread to many countries between March and May 2003. In that period, 5327 persons were infected in China, of whom 349 (6.6%) died (http://www.china.com.cn/chinese/zhuanti/feiyan/318261.htm). The main symptoms of SARS are high fever, chills, cough, and dyspnea. Confronted with this new human coronavirus, many scientists devoted themselves to related research. Marra 1 and his coworkers reported the genome sequence of the SARS-associated coronavirus, which gave investigators a foundation to build on. Based on Marra's work, Anand et al. 2 built a structure of the main proteinase using homology modeling.
Jenwitheesuk 3 indicated that existing HIV-1 protease inhibitors have high binding affinity for the SARS coronavirus (SARS-CoV) proteinase, a finding that may help scientists design SARS inhibitors. De Groot 4 also suggested that there are some similarities between SARS-CoV and HIV. Xiong et al. found 73 available protease inhibitors in the MDDR database (MDL Drug Data Report, http://www.mdl.com/) by virtual screening. 5 Lee et al. 6 identified four potent compounds among 16 antiviral drugs in the NCI database [National Cancer Institute (NCI) Database (http://cactus.nci.nih.gov/ncidb2/)]. In a recent report, Sirois et al. screened 3.6 million compounds virtually using the MOE software package. 7 Wu et al. 8 used a cell-based assay and found 15 compounds with potent anti-SARS-CoV activity, including two existing drugs.

Correspondence to: J. Zhou; e-mail: jjzhou@home.ipe.ac.cn

At present, there are still no effective SARS-CoV protease inhibitors on the market, and the ligand databases used for virtual screening are usually Western medicine databases such as MDDR, NCI, and ACD. Traditional Chinese medicine (TCM) has played an important role in China for thousands of years, and it is now a valuable complement to Western medicine. During the SARS-spreading period, people in China used TCM to prevent the disease, with positive results. As compounds are isolated from natural sources on an ever larger scale, novel structures are continually being found. Increasing attention is being paid to finding new effective drugs from natural sources, especially medicinal plants and marine organisms (halobios). Unlike land-dwelling organisms, marine organisms have different metabolic routes owing to their unique habitat, which is characterized by high salinity, very little light, and high pressure. These facts result in the remarkable structural diversity of marine natural products.
It is this diversity that makes marine natural products an invaluable treasure. More than 12,000 new compounds have been isolated from marine organisms, with 500 to 800 new compounds added each year. 9,10 A new antineoplastic drug, ET-743, has been synthesized by PharmaMar. 11 The compound was originally isolated from tunicates. It is undergoing phase II clinical trials in both Europe and America, and is expected to come onto the market this year.

In this article, we used two new natural products databases for virtual screening: the traditional Chinese medicines database (TCMD) and the marine natural products database (MNPD). MNPD was constructed by our laboratory. 12 It contains 8078 compounds isolated from halobios, of which 3200 have bioactivity data, some 1200 have CAS Registry Numbers, and about 3700 have physical property data. This database runs on an ISIS/Base (MDL Information Systems, Inc.) platform. TCMD is a commercial database built by our laboratory (http://products.cambridgesoft.com/family.cfm?FID=57). 13-15 It has 9127 entries. A typical entry includes a detailed 3D molecular structure, English names and synonyms, physical properties, natural sources, and reference information. Bioactivity data are available for 3000 of the entries. The database covers 3922 traditional Chinese medicine plant species, with standardized descriptions of TCM effects and indications. TCMD contains many antiviral and anti-HIV compounds. According to the findings of Jenwitheesuk and De Groot, such compounds are helpful in finding SARS-CoV protease inhibitors. 3,4 Honeysuckle, Indigowoad Root, Forsythia, Swordlike Atractylodes, and Licorice are also important active components of the anti-SARS TCM prescription. TCMD also runs on an ISIS/Base platform. All 2D compound structures in MNPD and TCMD were converted to 3D molecule files with the standalone CONCORD 4.0 program 16 on an SGI workstation.
According to Lipinski's Rule of Five (ROF), drug-like compounds should have an appropriate molecular weight (MW), appropriate numbers of H-bond donors and acceptors, and an appropriate LogP value. 17 Molecules with a larger molecular weight and a lower LogP value have difficulty crossing the cell membrane. With a larger LogP value, the drug will be difficult to dissolve in water, which is a necessary condition for drugs to be absorbed by an organism. Xu extended that rule by defining a drug-like cluster center. 18 The criteria we adopted for filtering the two databases (see Table 1) combine the ROF with Xu's rules. Because the SARS-CoV protease has a larger active pocket, we expanded the MW limit to less than 900. Compounds whose parameters all met the drug-like rules were selected and written into a single molecule file. The databases were thereby narrowed to 3861 (MNPD) and 5454 (TCMD) compounds, respectively. LogP was calculated by the XLogP program. 19

Rao 20 and his coworkers reported a 1.90-Å crystal structure of the SARS-CoV protease (PDB entry code: 1UJ1). It is a dimer. The active pocket is located in protomer A, which contains three domains; the substrate-binding site lies in a cleft between domains I and II (residues 8-101 and 102-184, respectively). The active site has a Cys-His catalytic dyad composed of Cys145 and His41. The substrate-binding pocket is formed by the side chains of His163 and Phe140 and the main-chain atoms of Met165, Glu166, and His172.

The Linux version of the Dock4.02 package was used in the first step of the virtual screening procedure, 21,22 and computations were carried out on a cluster of PC servers. Each computational node is an HP/Compaq DL360 industry-standard server with dual Pentium III 1.13-GHz CPUs and a 512-KB L2 cache. Residues within a radius of 13 Å of the sulfur atom of Cys145 were isolated for constructing the docking grids.
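The drug-likeness filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the MW limit of less than 900 comes from the text, whereas the donor, acceptor, and LogP limits are the classic Lipinski values and may differ from those in Table 1. Xu's extension rules are not modeled, and the record layout and compound names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Compound:
    """Precomputed descriptors for one database entry (hypothetical layout)."""
    name: str
    mol_weight: float   # molecular weight (MW)
    h_donors: int       # number of H-bond donors
    h_acceptors: int    # number of H-bond acceptors
    logp: float         # e.g. from an XLogP-style calculation


@dataclass(frozen=True)
class DrugLikeLimits:
    """Filter thresholds; only the MW limit is taken from the text."""
    max_mol_weight: float = 900.0  # exclusive bound: "less than 900"
    max_h_donors: int = 5          # assumed: classic Lipinski limit
    max_h_acceptors: int = 10      # assumed: classic Lipinski limit
    max_logp: float = 5.0          # assumed: classic Lipinski limit


def is_drug_like(c: Compound, limits: DrugLikeLimits = DrugLikeLimits()) -> bool:
    """Return True only if every descriptor satisfies its limit."""
    return (
        c.mol_weight < limits.max_mol_weight
        and c.h_donors <= limits.max_h_donors
        and c.h_acceptors <= limits.max_h_acceptors
        and c.logp <= limits.max_logp
    )


def filter_library(
    compounds: Iterable[Compound], limits: DrugLikeLimits = DrugLikeLimits()
) -> List[Compound]:
    """Keep the compounds that pass all drug-likeness rules."""
    return [c for c in compounds if is_drug_like(c, limits)]


# Made-up entries, for illustration only.
library = [
    Compound("hypothetical_A", 650.0, 4, 9, 3.2),   # passes every rule
    Compound("hypothetical_B", 950.0, 3, 8, 2.0),   # MW too large
    Compound("hypothetical_C", 420.0, 7, 6, 1.5),   # too many H-bond donors
]
kept = filter_library(library)
```

Requiring every rule to pass, rather than tolerating one violation, follows the statement that only compounds with all parameters meeting the rules were retained.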
Energy scoring grids were obtained using an all-atom model and a distance-dependent dielectric function with a 10-Å cutoff. Kollman charges were assigned to the macromolecule and Gasteiger-Hückel charges to the small molecules in SYBYL6.8. 23 An anchor fragment orientation method was used, and 25 conformations were produced per cycle.

The top 200 candidates from each database, as ranked by the Dock procedure, were then studied with the AutoDock3.05 program. The computations were run on an SGI Octane 2 graphics workstation. The grid had a spacing of 0.375 Å and a size of about 60 Å × 49 Å × 63 Å. As in the Dock step, Kollman charges were assigned to the macromolecule and Gasteiger-Hückel charges to the small molecules in SYBYL6.8. The genetic algorithm with local search (GA-LS) method was adopted using the default settings. Compounds with better AutoDock scores and binding conformations will be selected as SARS lead compounds for the next project.

The energy score (ES) is an important criterion for evaluating the binding affinity of a ligand in a given orientation and conformation for the target protein. Owing to shortcomings in the scoring functions of the Dock program, 24-32 the energy score is biased toward the selection of compounds with high molecular weight. 33 Liu has provided a new algorithm to eliminate this bias (unpublished data, Liu, Z. M; Shi, L; Lai, L. H. Considering Molecule Weight in Virtual Docking Screening: Implication for Inhibitors Selection). He found that most molecules with a heavy atom number (HA) between 5 and 15 interact with the target protein in a proper binding mode. Those compounds usually represent the correct interacting mode of the whole data set, whereas the energy scores of molecules with a larger HA are more spread out, partly because such molecules have more ways to escape from the "binding site" or because of the limited capacity of the pocket. The relationship between HA and the average energy score (AE) is described by eqs. (1) (MNPD) and (2) (TCMD).
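As a rough illustration of the size correction discussed above, the following Python sketch fits an exponential decay of the average energy score (AE) against heavy atom number (HA) and then expresses a compound's score relative to the size-expected average. The functional form AE = y0 + a·exp(−HA/t), the function names, the scan over t, and the final subtraction step are all assumptions made for illustration. Liu's algorithm is unpublished, and the fitted coefficients of eqs. (1) and (2) are not reproduced here.

```python
import math
from typing import List, Optional, Sequence, Tuple

Params = Tuple[float, float, float]  # (y0, a, t)


def fit_exp_decay(
    ha: Sequence[float],
    ae: Sequence[float],
    t_grid: Optional[List[float]] = None,
) -> Params:
    """Fit ae ~ y0 + a * exp(-ha / t).

    The decay constant t is scanned over a grid; for each t the model is
    linear in (y0, a), so those two are solved by ordinary least squares.
    The t with the smallest sum of squared errors is returned.
    """
    if t_grid is None:
        t_grid = [0.5 * k for k in range(2, 201)]  # t = 1.0, 1.5, ..., 100.0
    n = len(ha)
    mean_y = sum(ae) / n
    best = None
    for t in t_grid:
        x = [math.exp(-h / t) for h in ha]
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        if sxx == 0.0:
            continue  # all HA identical: slope undefined for this t
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ae))
        a = sxy / sxx
        y0 = mean_y - a * mean_x
        sse = sum((yi - (y0 + a * xi)) ** 2 for xi, yi in zip(x, ae))
        if best is None or sse < best[0]:
            best = (sse, y0, a, t)
    if best is None:
        raise ValueError("need at least two distinct heavy atom numbers")
    return best[1], best[2], best[3]


def expected_score(ha: float, params: Params) -> float:
    """Average energy score predicted by the fit for a given HA."""
    y0, a, t = params
    return y0 + a * math.exp(-ha / t)


def size_corrected_score(energy_score: float, ha: float, params: Params) -> float:
    """Energy score relative to the average for that size (assumed correction).

    Negative values mean the compound scores better than a typical
    compound with the same number of heavy atoms.
    """
    return energy_score - expected_score(ha, params)


# Synthetic averages generated from known parameters, for illustration only.
ha_values = list(range(5, 61))
ae_values = [-40.0 + 30.0 * math.exp(-h / 12.0) for h in ha_values]
params = fit_exp_decay(ha_values, ae_values)
```

Scanning t and solving the linear part in closed form keeps the sketch free of external dependencies; a general nonlinear least-squares routine would serve equally well.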
Eq. (1) is the HA-AE exponential decay fit curve of MNPD, and eq. (2) is that of TCMD. Figures 1 and 2 show the distribution curves of heavy atom numbers and energy scores. After repeated evaluation by AutoDock, 18 compounds with high in silico affinity were selected (7 from MNPD and 11 from TCMD). They will be useful for experimental scientists in prioritizing drug candidates and studying the interaction mechanism. Their structures and drug-like parameters are listed in Table 2.

The coronavirus family has one main protease, called 3CL because of the nature of its catalytic site, which plays a crucial role in the regulation of the virus life cycle. 2 The Cys145 and His41 residues are considered essential for the normal function of the SARS protein. The docking simulation of compound M4367, which was isolated from Pseudomonas sp. or Alteromonas sp. in the sponge Dysidea fragilis (Black Sea), showed that the inhibitor folds into a ring-like structure in the active site, similar to that of Wu's compound 2 (Wu-2). 8 The Ki value of Wu-2 against the SARS-CoV 3CL protease is 0.6 μM. One phenyl group of compound M4367 fits into the pocket defined by Leu27, Thr25, etc. A carbonyl group, in place of the phenyl group of Wu-2, fits into the pocket defined by hydrophobic residues (Met165, Pro168, and Leu167). M4367 interacts directly with Cys145 and His41 through hydrogen bonding and hydrophobic contacts. There are also four other hydrogen bonds, between M4367 and Phe140, Ser144, Cys44, and Thr25. The complex was analyzed with Ligplot 4.22 to identify specific contacts (Fig. 3). 34

The SARS target is so novel that there are still no effective inhibitors on the market. Marra 1 reports that derivatives of AG7088 might be good starting points for the design of anticoronavirus drugs. AG7088 has already been clinically tested for treatment of the common cold.
Its docking complex with the SARS-CoV protease also shows multiple interactions, similar to those of our recommended compounds (Fig. 4).

Eighteen compounds with novel structures and the best binding affinities and conformations were found via virtual screening and statistical methods. The interaction and binding mechanism were elucidated from the structure of the SARS-M4367 complex. The similarity in protein binding mode between our screened compounds and Wu-2 and AG7088, which have been reported as possible SARS inhibitors, suggests that our results have value for experimental scientists in prioritizing drug candidates. The results indicate that high-affinity drugs for the SARS protein may be characterized by direct interaction with the functional residues His41 and Cys145, which play a crucial role in the regulation of the SARS life cycle.

We thank Dr. Zhenming Liu and Prof. Luhua Lai of Beijing University for many useful discussions on the analysis of the results. We also thank Hao He (ChemBay Technology Ltd., China), who helped to convert MNPD and TCMD to 3D molecule files with CONCORD.