key: cord-0804271-s0ps9rd0
authors: Xu, Jun; Huang, Sichao; Luo, Haibin; Li, Guoji; Bao, Jiaolin; Cai, Shaohui; Wang, Yuqiang
title: QSAR Studies on Andrographolide Derivatives as α-Glucosidase Inhibitors
date: 2010-03-02
journal: Int J Mol Sci
DOI: 10.3390/ijms11030880
sha: 2181d2f31f1f4b8c6203668dee4f6b3f382af799
doc_id: 804271
cord_uid: s0ps9rd0

Andrographolide derivatives have been shown to inhibit α-glucosidase. To investigate the relationship between the activities and structures of andrographolide derivatives, a training set was chosen from 25 andrographolide derivatives by principal component analysis (PCA), and a quantitative structure-activity relationship (QSAR) was established by 2D and 3D QSAR methods. The cross-validation r² (0.731) and standard error (0.225) illustrated that the 2D-QSAR model was able to identify the important molecular fragments, and the cross-validation r² (0.794) and standard error (0.127) demonstrated that the 3D-QSAR model was capable of exploring the spatial distribution of important fragments. The results suggested that the proposed combination of 2D and 3D QSAR models could be useful in predicting the α-glucosidase inhibiting activity of andrographolide derivatives.

Andrographis paniculata is a plant widely used as a traditional Chinese medicine in China, India, and other Asian countries [1, 2]. Extracts and constituents of Andrographis paniculata exhibit broad pharmacological activities, such as anti-bacterial, anti-malarial, anti-inflammatory, anti-tumor, immunoregulatory, and hepatoprotective effects [3-12]. Recently, some andrographolide derivatives were reported to decrease blood glucose levels by inhibiting α-glucosidase [13, 14]. It is well known that α-glucosidase is a key enzyme in the absorption of sugar in the mucous membrane of the small intestine, and its activity is closely related to blood glucose levels.
Studies also indicated that α-glucosidase might be involved in diabetes [15-20]. Accordingly, α-glucosidase is considered an important target for the design of antidiabetic drugs. Recently, efforts have been made to modify and synthesize novel andrographolide derivatives in order to find more potent and safer α-glucosidase inhibitors. Knowledge of the relationships between the structures of andrographolide derivatives and their inhibitory activities on α-glucosidase could greatly facilitate the drug discovery process.

QSAR [21] has been widely used for years to provide quantitative analysis of the structure-activity relationships of compounds. Statistical methods are applied in QSAR modeling to establish correlations between chemical structures and their biological activities. Once validated, the findings can be used to predict the activities of untested compounds. Recently, computer-assisted drug design based on QSAR has been successfully employed to develop new drugs for the treatment of cancer, AIDS, SARS, and other diseases [22-29]. With the availability of large commercial databases and highly efficient programs such as Sybyl, Discovery Studio, and MOE, it is estimated that QSAR modeling could remarkably reduce the cost of drug discovery [30].

In this study, 2D QSAR models were constructed to describe the important fragments in andrographolide derivatives, and 3D QSAR models were established to explore the spatial distribution of important groups. The combination of 2D and 3D QSAR models could better summarize the QSAR of andrographolide derivatives in inhibiting α-glucosidase.

The structures and inhibitory activities (IC50) of 25 andrographolide derivatives (Figure 1) were collected from the literature and served as the database for building the QSAR models [13, 14, 31]. PLogIC50 was used as the dependent variable of the QSAR models.
PCA, HQSAR, CoMFA, and CoMSIA were performed with the Sybyl 7.03 program (Tripos Co., Ltd.).

Principal component analysis (PCA), employed to select the training set, explains the differences among the 25 andrographolide derivatives in terms of the diversity of their structural parameters and displays their distribution on a 2D plot [32]. The most descriptive compounds (MDC) or largest minimum distance (LMD) method was then applied to select the training set according to the distribution of these compounds.

Hologram QSAR (HQSAR) offers the ability to rapidly generate QSAR models of high statistical quality and predictive value using SYBYL line notation (SLN), cyclic redundancy check (CRC), and partial least squares (PLS) [33-35]. The premise of HQSAR is that, since the structure of a molecule is encoded within its 2D fingerprint and that structure is the key determinant of all molecular properties (including biological activity), it should be possible to predict the activity of a molecule from its fingerprint. The training set was used to establish 2D-QSAR models by HQSAR, and the best 2D-QSAR model was selected by the criterion of cross-validated R². The biological activities of the test set were predicted by the best 2D-QSAR model, whose predictability was validated by the correlation coefficient between the predicted and experimental values. The most common structure (MCS) could be calculated by HQSAR. Based on the MCS of the andrographolide derivatives, the contributions of molecular fragments to biological activity were analyzed to describe the QSAR of andrographolide derivatives as α-glucosidase inhibitors.

The 3D-QSAR model applies PLS to explore the relationships between the physicochemical variables and biological activity. Cross-validation is used to estimate the predictability of the QSAR model. In general, a leave-one-out (LOO) cross-validated coefficient q² higher than 0.5 is considered to indicate statistically high predictive ability [36].
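The leave-one-out statistic referred to above is commonly written as q² = 1 − PRESS/SS, where PRESS is the sum of squared errors of the held-out predictions and SS is the sum of squared deviations of the observed activities from their mean. The following is a minimal sketch of that calculation, not the Sybyl implementation: a one-descriptor least-squares line stands in for the PLS model, and the descriptor and activity values are hypothetical rather than data from this study.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit y = intercept + slope * x (a stand-in for PLS)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def loo_q2(xs, ys):
    """Leave-one-out q2 = 1 - PRESS / SS, with SS taken about the mean of ys."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without compound i, then predict compound i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        press += (ys[i] - (intercept + slope * xs[i])) ** 2
    y_bar = mean(ys)
    ss = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - press / ss


# Hypothetical descriptor values and activities (not data from this study).
descriptor = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
activity = [4.1, 4.4, 4.6, 5.0, 5.1, 5.5]
print(f"LOO q2 = {loo_q2(descriptor, activity):.3f}")
```

Because every prediction is made for a compound that was excluded from the fit, q² can be negative when the model predicts worse than simply using the mean activity.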
CoMFA, which is widely utilized in 3D-QSAR research, assumes that if a group of similar compounds are ligands of the same receptor, their bioactivities depend on the differences in the molecular fields surrounding them [37]. CoMFA can display a contour map in a 3D graph, which makes it easier to distinguish differences between compounds with strong and weak activities. CoMSIA is another 3D-QSAR method that adopts a Gaussian function instead of the traditional Coulomb and Lennard-Jones functions used in CoMFA [38]. Therefore, CoMSIA efficiently avoids the shortcomings of CoMFA, in which only the steric and electrostatic fields are used. The leave-one-out (LOO) method is employed to validate the predictability of the models, and the Y-randomization test is used to validate their robustness [39].

In this study, CoMFA and CoMSIA were both utilized to generate 3D-QSAR models, and the more predictive 3D-QSAR models were then selected by comparison. Subsequently, the selected models were further optimized by the Focusing method [40]. This method weights the contributions of individual grid points in CoMFA and CoMSIA to the bioactivities of the compounds, which is expected to selectively enhance or attenuate those contributions and improve the resolution. Moreover, the biological activities of the test set were predicted by the optimized QSAR model. The best QSAR model was determined by comparing the parameters of the models and the correlation between the predicted and experimental values of the test set.

The selection of the training set is one of the most important steps in QSAR modeling, since the establishment and optimization of a QSAR model are based on this training set. The predictability and applicability of a QSAR model also depend on the training set selection [41, 42].
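One simple way to pick a structurally diverse training set from a 2D PCA score plot, in the spirit of the largest minimum distance (LMD) idea mentioned earlier, is a greedy max-min selection. The sketch below is only an illustration under stated assumptions: it is not the MDC or LMD routine in Sybyl, the function name `maxmin_select` is hypothetical, and the score values are invented.

```python
from math import dist


def maxmin_select(points, n_select):
    """Greedy max-min selection.

    Start with the point farthest from the centroid, then repeatedly add the
    point whose distance to its nearest already-selected point is largest.
    Returns the sorted indices of the selected points.
    """
    n = len(points)
    dims = len(points[0])
    centroid = tuple(sum(p[k] for p in points) / n for k in range(dims))
    selected = [max(range(n), key=lambda i: dist(points[i], centroid))]
    while len(selected) < n_select:
        remaining = [i for i in range(n) if i not in selected]
        best = max(
            remaining,
            key=lambda i: min(dist(points[i], points[j]) for j in selected),
        )
        selected.append(best)
    return sorted(selected)


# Hypothetical (PC1, PC2) scores for seven compounds; not data from this study.
scores = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0),
          (5.1, 5.0), (-4.0, 3.0), (2.0, -3.0)]
training = maxmin_select(scores, 4)
test = [i for i in range(len(scores)) if i not in training]
print("training indices:", training)
print("test indices:", test)
```

A selection of this kind spreads the training compounds over the occupied regions of the plot, so that test compounds tend to fall inside the range covered by the training set.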
Usually, the compounds serving as the training set should have three characteristics: (1) maximum structural diversity; (2) maximum activity diversity; (3) similarity of interactions [43]. In addition, both the molecular structures and the biological activities of the test set should fall within the ranges covered by the training set. In this research, PCA was applied to select a training set from among the 25 andrographolide derivatives. PCA is a statistical technique useful for summarizing the information encoded in the structures of compounds. It is also very helpful for understanding the distribution of the compounds. The distribution pattern of the 25 andrographolide derivatives is shown in Figure 2. Population density varied across the plot. Eighteen compounds (1, 3-8, 11, 13, 16-21 and 23-25) were selected as the training set by the MDC method. The rest of them (compounds 2, 9, 10, 14, 15 and 22) were used as the test set, whose biological activities were covered by the training set.

The best cross-validation r² (0.731) and standard error (0.225) illustrated that the 2D-QSAR model could be applied to predict the biological activity of andrographolide derivatives as α-glucosidase inhibitors. The predicted and experimental biological activities of the andrographolide derivatives are shown in Table 1. The correlation coefficient R² and standard error for the training set (0.840, 0.174) and the test set (0.949, 0.104) suggested that the 2D-QSAR model could be used to explain the QSAR of andrographolide derivatives as α-glucosidase inhibitors. The PLS coefficient was the criterion for judging which fragments were key fragments: the larger the PLS coefficient, the more important the fragment was for the biological activity of the andrographolide derivatives.
According to this criterion, C(=C(C)C)C=C or C[1]:C:C:C(:C:C:@1)C=C attached to C3 of andrographolide (Figure 4) and C[1]:N:C:C(:C:C:@1)C(=C)O attached to C17 of andrographolide were suggested as the key fragments.

The 18 compounds were energy-minimized, assigned charges, and aligned (Figure 5). CoMFA and CoMSIA were used to develop a number of QSAR models based on the properties of the compounds belonging to different fields (steric, electrostatic, hydrophobic, H-donor and H-acceptor; Table 2). Since the QSAR model was to be employed to predict the activities of unknown compounds, predictability was the criterion for judging which QSAR model was the best. The predictability of a QSAR model was expressed not only by cross-validation (q²) but also by validation with the test set. The results illustrated that four models (4, 8, 10 and 11) had the top four predictabilities, so the Focusing method was then applied to optimize these models; it further improved predictability for models 4, 10 and 11, but not for model 8. Among the resulting models (8, 13, 15 and 16), model 16 exhibited the best predictability, as indicated by the highest q² value. The predictability of these models (8, 13, 15 and 16) was further evaluated using the test set. Model 16 also provided the best prediction, with a correlation coefficient R² of 0.941 (Table 3). Overall, this model represented the best QSAR model (q² = 0.794, R²_cv = 0.915, SE_cv = 0.127, R²_test = 0.941, SE_test = 0.104). The Y-randomization test gave a low q² (0.199), suggesting that the model was robust and not the result of a chance correlation. Table 4 compares the PLogIC50 values predicted by model 16 for the database with the experimental values.

Model 16 used the steric, hydrophobic and H-acceptor fields together to describe the relationship between the activities and structures of andrographolide derivatives.
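The Y-randomization check reported above can be illustrated with a short sketch: the activities are shuffled among the compounds, the model is rebuilt, and its q² is compared with that of the real model. This is a hedged illustration only; a one-descriptor least-squares line stands in for the PLS-based CoMSIA model, the function names and the number of rounds are assumptions, and the data are hypothetical.

```python
import random
from statistics import mean


def loo_q2(xs, ys):
    """Leave-one-out q2 for a one-descriptor least-squares line (PLS stand-in)."""
    press = 0.0
    for i in range(len(xs)):
        tx, ty = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        mx, my = mean(tx), mean(ty)
        sxy = sum((x - mx) * (y - my) for x, y in zip(tx, ty))
        sxx = sum((x - mx) ** 2 for x in tx)
        prediction = my + (sxy / sxx) * (xs[i] - mx)
        press += (ys[i] - prediction) ** 2
    y_bar = mean(ys)
    return 1.0 - press / sum((y - y_bar) ** 2 for y in ys)


def y_randomization(xs, ys, n_rounds=20, seed=0):
    """Shuffle the activities n_rounds times; return q2 of each scrambled model."""
    rng = random.Random(seed)  # fixed seed so the check is reproducible
    scrambled_q2 = []
    for _ in range(n_rounds):
        shuffled = ys[:]
        rng.shuffle(shuffled)
        scrambled_q2.append(loo_q2(xs, shuffled))
    return scrambled_q2


# Hypothetical data, not taken from this study.
descriptor = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
activity = [4.0, 4.3, 4.5, 4.9, 5.0, 5.4, 5.6, 5.9]
real_q2 = loo_q2(descriptor, activity)
scrambled = y_randomization(descriptor, activity)
print(f"real q2 = {real_q2:.3f}, best scrambled q2 = {max(scrambled):.3f}")
```

If the scrambled models reach q² values close to that of the real model, the original correlation is likely to be fortuitous; a clear gap supports the robustness of the model.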
H-bond acceptor atoms and groups in the region marked by blue lines (Figure 6) were favorable for the activities of the compounds, while atoms and groups in the region marked by yellow lines impaired the activities. Hydrophobic groups were desirable in the region marked by blue lines but not in the region marked by dark lines (Figure 7). In addition, the activities of the andrographolide derivatives were enhanced by the presence of bulky groups in the region marked by purple lines rather than in the region marked by green lines (Figure 8). Compounds whose structures fitted well into the 3D contour maps derived from model 16 usually exhibited potent inhibitory activity (e.g., compounds 20, 21, 22 and 23). In contrast, weak inhibitors such as compounds 3, 4, 13 and 16 did not fit the 3D contour maps well.

Compound 21 (a potent α-glucosidase inhibitor, PLogIC50 = 5.222) was overlaid on the 3D contour maps of model 16 to illustrate the key groups (marked by red dashed lines in Figures 5, 6, and 7) correlating with biological activity. C[1]:N:C:C(:C:C:@1)C(=C)O was a key group in all the 3D contour maps (steric, H-acceptor, hydrophobic), and C[1]:C:C:C(:C:C:@1)C=C was a key group in both the steric and hydrophobic 3D contour maps. Both groups were also identified as key groups in HQSAR. Combining the results of HQSAR and CoMSIA, the two groups were considered to be the key groups associated with biological activity, and this result can also be used to screen for potent α-glucosidase inhibitors in various databases by virtual screening.

In our research, 2D QSAR and 3D QSAR models have been successfully established to quantitatively describe the relationship between the structures and activities of andrographolide derivatives as α-glucosidase inhibitors. The 2D QSAR model was based on the atomic connectivity of the molecules and suggested that there might be three key groups associated with biological activity.
Furthermore, the 3D QSAR model was based on molecular properties belonging to the steric, hydrophobic and H-acceptor fields and indicated that compounds with structures fitting better into the 3D contour maps of model 16 had more potent activities. By combining the 2D and 3D QSAR models, the key fragments and their spatial distribution could be efficiently identified. The convincing predictability of the model was demonstrated not only by internal validation but also by external validation using a test set. Overall, these results suggested that the developed QSAR model could be used to predict the inhibitory activities of unknown andrographolide derivatives on α-glucosidase. Application of this model would greatly facilitate the discovery of better α-glucosidase inhibitors.

[1] Effects of 14-deoxyandrographolide and 14-deoxy-11,12-didehydroandrographolide on nitric oxide production in cultured human endothelial cells
[2] Intraspecific variation in active principle content and isozymes of Andrographis paniculata (kalmegh): A traditional hepatoprotective medicinal herb of India
[3] Glycosidases in cancer and invasion
[4] The alpha-glucosidase I inhibitor castanospermine alters endothelial cell glycosylation, prevents angiogenesis, and inhibits tumor growth
[5] Inhibition of experimental metastasis by castanospermine in mice: Blockage of two distinct stages of tumor colonization by oligosaccharide processing inhibitors
[6] The alpha-glucosidase inhibitor 1-deoxynojirimycin blocks human immunodeficiency virus envelope glycoprotein-mediated membrane fusion at the CXCR4 binding step
[7] The combination of interferon alpha-2b and n-butyl deoxynojirimycin has a greater than additive antiviral effect upon production of infectious bovine viral diarrhea virus (BVDV) in vitro: Implications for hepatitis C virus (HCV) therapy
[8] Alpha-Glucosidase inhibitors. New complex oligosaccharides of microbial origin
[9] Valiolamine, a new alpha-glucosidase inhibiting aminocyclitol produced by Streptomyces hygroscopicus
[10] New potent alpha-glucohydrolase inhibitor MDL 73945 with long duration of action in rats
[11] Effect of two alpha-glucosidase inhibitors, voglibose and acarbose, on postprandial hyperglycemia correlates with subjective abdominal symptoms
[12] Synthesis of alpha-glucosidase I inhibitors showing antiviral (HIV-1) and immunosuppressive activity
[13] Studies on the novel alpha-glucosidase inhibitory activity and structure-activity relationships for andrographolide analogues
[14] Synthesis of andrographolide derivatives: A new family of alpha-glucosidase inhibitors
[15] Chemistry and biochemistry of microbial alpha-glucosidase inhibitors
[16] Effects of graded alpha-glucosidase inhibition on sugar absorption in vivo
[17] Genistein, a soy isoflavone, is a potent alpha-glucosidase inhibitor
[18] A new approach to the treatment of nocturnal hypoglycemia using alpha-glucosidase inhibition
[19] Alpha-glucosidase inhibitors with a 4,5,6,7-tetrachlorophthalimide skeleton pendanted with a cycloalkyl or dicarba-closo-dodecaborane group
[20] Alpha-glucosidase inhibitors: New therapeutic agents for chronic heart failure
[21] Correlation of biological activity of phenoxyacetic acids with Hammett substituent constants and partition coefficients
[22] Rational design of potent sialidase-based inhibitors of influenza virus replication
[23] Bis tertiary amide inhibitors of the HIV-1 protease generated via protein structure-based iterative design
[24] Structure-based inhibitor design by using protein models for the development of antiparasitic agents
[25] Conformation-activity relationship study of 5-HT3 receptor antagonists and a definition of a model for this receptor site
[26] 3-Hydroxy-3-methylglutaryl-coenzyme A reductase: Molecular modeling, three-dimensional structure-activity relationships, inhibitor design
[27] A 3D model of SARS_CoV 3CL proteinase and its inhibitors design by virtual screening
[28] A novel strategy for improving ligand selectivity in receptor-based drug design
[29] Coronavirus main proteinase (3CLpro) Structure: Basis for design of anti-SARS drugs
[30] Current topics in computer-aided drug design
[31] The 3D-QSAR Studies on Andrographolide Derivatives Inhibiting α-Glucosidase
[32] Applied Multivariate Statistical Analysis
[33] SYBYL line notation (SLN): A versatile language for chemical structure representation
[34] Architecture Reference Manual
[35] Prediction of Product Quality from Spectral Data Using the Partial Least-Squares Method
[36] Beware of q2!
[37] Comparative molecular field analysis (CoMFA). 1. Effect of shape on binding of steroids to carrier proteins
[38] Comparative Molecular Similarity Index Analysis (CoMSIA) to study hydrogen-bonding properties and to score combinatorial libraries
[39] Chemometric Methods in Molecular Design
[40] Development of CoMFA, advance CoMFA and CoMSIA models in pyrroloquinazolines as thrombin receptor antagonist
[41] In silico ADME modelling: Prediction models for blood-brain barrier permeation using a systematic variable selection method
[42] In silico ADME modelling: Computational models to predict human serum albumin binding affinity using ant colony systems
[43] In silico ADME modeling 3: Computational models to predict human intestinal absorption using sphere exclusion and kNN QSAR methods

This study was supported in part by grants from the China Natural Science Fund (30772642 and 30973618 to Y. W., and 30572209 and 30973565 to S. C.) and the 211 project of Jinan University.