key: cord-0041989-th8wzqta authors: nan title: Abstracts of QSAR‐related Publications 5/2005 date: 2005-07-15 journal: QSAR Comb Sci DOI: 10.1002/qsar.200590028 sha: 7b6ccc524c1a968333599f9119723446c0e9aba7 doc_id: 41989 cord_uid: th8wzqta

protease papain was obtained from the Brookhaven Protein Data Bank (PDB code 1KHQ); this crystal structure contains a covalently bonded diazomethylketone inhibitor connected to the sulfur atom of residue cysteine 25, which was removed before building the complexes].
Computational methods: Molecular modeling [the protein-ligand complexes used for molecular dynamics (MD) simulations were generated by docking the ligand into the active site with FlexX; Cscore calculations were used for ranking the docked conformers; the active site was defined by a radius of 6.5 Å around the inhibitor of the crystallographic structure; N-benzoylglycine ligands were built with PC Spartan; the coordinates of hydrogens were optimized by molecular mechanics minimization using the Tripos force field in Sybyl v6.91; MD simulations were used to obtain optimal conformations of the complexes; the Amber ff99 and GAFF force fields were used to describe the protein and the ligands, respectively; the atomic charges of the ligands were calculated by single-point calculations at the HF/6-31G* level using Gaussian98 and were fitted using the standard RESP method in Amber v7; the protein was reported to be positively charged and was neutralized in tleap by adding five chloride anions; a TIP3 water cap of 20 Å radius around the Cys 25 residue was added]; Component Analysis, respectively, were performed using the program MVSP v3.12); FlexX (program for automatic protein-ligand docking based on incremental construction without manual intervention, implemented in Sybyl v6.91); MLR (Multivariate Linear Regression analysis was performed using Xlstat pro7.5).
Data calculated: Geometry descriptors [generation of geometrical descriptors for regression analyses (Fig.
1): D1-D3: dihedral (torsion) angles between atoms C1-C3-O1-C7, C3-O1-C7-C8, and O1-C7-C8-N1, respectively; Z = van der Waals distance from O1 to the end of the molecule along the O1-C3 axis]; C2, O1 (charge on atoms C2 and O1); WS_O1 (average water shell around atom O1); WdW_Vol [van der Waals volume (Å³)]; Z (Z distance as depicted in Fig. 1).
Chemical descriptors: π (Hansch-Fujita's substituent constant characterizing hydrophobicity); σ (Hammett's constant characterizing the electron-withdrawing power of the substituent); F, R (Swain-Lupton's electronic parameters, characterizing the field and resonance effects, respectively).
Results: MD simulations and full-structure LocalSCF semi-empirical quantum mechanics calculations of receptor-ligand complexes were carried out to investigate the structure-activity relationship for the papain hydrolysis of a set of 23 N-benzoylglycine esters of type I, using various atomic charge and structural parameters of the complexes as descriptors. Traditional QSAR descriptors were reinterpreted using detailed structural information. Moderately significant linear regression equations were calculated (Eqs. 1, 2), and it was shown that the pattern of charge distribution on the ester group differed depending on whether the charges of the free or the complexed ligands were analyzed. The charges of two significant atoms, namely O1, which is at the reaction center, and C2, which is the closest independent non-substituted atom to the substituent, were used as descriptors together with water-shell and torsional angle values. The effects of complexation on the electronic structure of the ligands were also studied by multivariate analyses (PCA and PCO), and a trend was found for the change of the inter-ligand correlation of atomic charges upon complexation. The correlation among these charges after complex formation was different for molecules substituted at the para position and at the di-meta positions.
It was suggested that the results can help to understand how traditional QSAR descriptors, e.g., F or σ, interact with other electronic effects during complex formation.

Compounds: 10 polychlorinated dibenzo-p-dioxins and dibenzofurans (PCDD/Fs), 24 polychlorinated naphthalenes (PCNs), 4 polycyclic aromatic hydrocarbons (PAHs), 6 chlorobenzenes (CBs), and 13 polybrominated diphenyl ethers (PBDEs).
Data taken from the literature: logK_OA (logarithm of the 1-octanol/air partition coefficient).
Computational methods: MLR (stepwise Multivariate Linear Regression analysis).
Data calculated: ${}^{n}\chi_{m}$ (n-th order, m-th type molecular connectivity indices proposed by Kier and Hall).
Results: Octanol-air partition coefficients of semivolatile organic compounds (PCNs, CBs, PAHs, PCDD/Fs and PBDEs) have been modeled based on molecular connectivity indices. The following statistically significant linear regression equations were calculated using stepwise MLR.
$\lg K_{OA} = 4.066\,{}^{1}\chi_{v} - 2.376\,{}^{2}\chi_{p} + 1.387$ (1)
n = 24, r = 0.991, s = 0.127, F = 598
(2) n = 4, r = 0.999, s = 0.061, F = 827
(3) n = 10, r = 0.990, s = 0.173, F = 394
$\lg K_{OA} = 1.488\,{}^{2}\chi_{p} - 1.217$ (4)
n = 6, r = 0.996, s = 0.082, F = 558
$\lg K_{OA} = 2.25\,{}^{2}\chi_{p} - 0.86\,{}^{5}\chi_{vp} - 4.879$ (5)
n = 13, r = 0.963, s = 0.275, F = 64
Higher accuracy has been obtained in comparison with the models based on theoretical molecular descriptors. The correlation coefficients are greater than 0.99 except that for PBDEs, and the standard deviation is less than 1.83 log units, which is less than the error measured by Harner et al. In addition, the computation of the molecular connectivity indices used in this paper is very simple and the data are easy to acquire. Therefore, this method is a viable alternative for predicting octanol-air partition coefficients from molecular structures. (B. B.)
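As a rough illustration of how a connectivity-index model of this kind is evaluated, the Python sketch below computes simple first-order and second-order path connectivity indices from a hydrogen-suppressed molecular graph and evaluates the linear form of Eq. 1 with its reported coefficients. The function names and the adjacency-list input are assumptions made for this example. The sketch uses plain vertex degrees as delta values; a valence-type index such as the one in Eq. 1 would need valence delta values passed through the `delta` argument.

```python
from itertools import combinations
from math import sqrt


def vertex_degrees(adj):
    """Simple delta values: number of non-hydrogen neighbours of each atom."""
    return {atom: len(neigh) for atom, neigh in adj.items()}


def chi1(adj, delta=None):
    """First-order connectivity index: sum over bonds of (d_i * d_j) ** -0.5."""
    delta = delta or vertex_degrees(adj)
    bonds = {frozenset((u, v)) for u in adj for v in adj[u]}
    return sum(1.0 / sqrt(delta[u] * delta[v]) for u, v in map(tuple, bonds))


def chi2_path(adj, delta=None):
    """Second-order path index: sum over two-bond paths a-v-b."""
    delta = delta or vertex_degrees(adj)
    total = 0.0
    for v, neigh in adj.items():
        for a, b in combinations(sorted(neigh), 2):
            total += 1.0 / sqrt(delta[a] * delta[v] * delta[b])
    return total


def lg_koa_eq1(chi1_v, chi2_p):
    """Evaluate the linear form of Eq. 1 with the coefficients reported above."""
    return 4.066 * chi1_v - 2.376 * chi2_p + 1.387


# Hydrogen-suppressed graph of n-butane (C1-C2-C3-C4), used only as a toy input.
butane = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(round(chi1(butane), 4), round(chi2_path(butane), 4))
```

The toy molecule is not part of the data set in the abstract; it only shows how the graph sums are formed before the regression coefficients are applied.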
Computational methods: Model calculation (EC50 values of the test compounds were estimated using an in-house program system and correlated with anti-HIV activity taken from the literature).
Data calculated: EC50 [effective concentration of the test substance (mM) required for 50% inhibition of HIV RT, estimated using an in-house program system]; W (Wiener index); ${}^{1}\chi$ (first-order molecular connectivity index); ${}^{A}\xi^{c}$ [augmented eccentric connectivity index calculated as ${}^{A}\xi^{c} = \sum_{i=1}^{n} (M_i / E_i)$, where $M_i$ is the product of the degrees of all vertices ($v_j$) adjacent to vertex i, $E_i$ is the eccentricity of vertex i, and n is the number of vertices in graph G; for a molecular graph (G), $v_1, v_2, \ldots, v_n$ are its vertices, and the number of first neighbors of a vertex $v_i$ is the degree of this vertex].
Results: HIV RT inhibitory activity of 61 acylthiocarbamates of types I and II has been modeled using topological descriptors (W, ${}^{1}\chi$, and ${}^{A}\xi^{c}$). The resulting data were analyzed and predictive models were developed after identification of the active ranges. In the next step, biological activity was assigned to each of the compounds using these models, and the assignments were compared with the reported anti-HIV activity. Very high accuracy of prediction, ranging from 95% to 98%, was achieved using the developed topological models (n, r, s, and F as well as the analytical form of the model based upon the augmented eccentric connectivity index are not given). Comparison of the three QSAR models reveals the following data (descriptor, % classification, accuracy of prediction): W, 72.13, 97.72; ${}^{1}\chi$, 55.73, 97.06; ${}^{A}\xi^{c}$, 100, 95.08. Analysis of the structure-activity relationships of the compounds in the active range revealed the following features: activity is shown only by the O-2-(phthalimidoethyl) acylthiocarbamates of type II. All compounds of type I having other groups are inactive. The presence of a halogen or a nitro group at position 4 of the N-aryl group is important for activity.
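The two graph-based descriptors defined above can be sketched in a few lines of Python. This is a minimal illustration, not the in-house program mentioned in the abstract: the function names and the adjacency-list input are assumptions, the graph is assumed to be connected with at least two vertices, and $M_i$ is taken as the product of the degrees of the neighbors of vertex i, which is how this index is usually defined.

```python
from collections import deque
from math import prod


def distances_from(adj, start):
    """Breadth-first search: topological distance from start to every vertex."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def wiener_index(adj):
    """W: sum of topological distances over all unordered vertex pairs."""
    return sum(sum(distances_from(adj, v).values()) for v in adj) // 2


def augmented_eccentric_connectivity(adj):
    """Sum over vertices of M_i / E_i (assumed: M_i = product of neighbour degrees)."""
    degree = {v: len(adj[v]) for v in adj}
    total = 0.0
    for v in adj:
        m_i = prod(degree[w] for w in adj[v])
        e_i = max(distances_from(adj, v).values())  # eccentricity of v
        total += m_i / e_i
    return total


# Hydrogen-suppressed graph of propane, used only as a toy input.
propane = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(wiener_index(propane), augmented_eccentric_connectivity(propane))
```

For real molecules the adjacency list would come from a structure file; the toy three-atom chain only makes the vertex sums easy to check by hand.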
Among the halogens, chlorine is the most suitable. Compounds having a halogen or nitro group at other positions are inactive. Compounds having a methyl or ethyl group or no substitution at this position are active but are relatively less potent. In general, five compounds in the set are more suitable as acyl group (type II), although compounds displaying a six-membered ring, with or without heteroatom, are also active. Amongst five-membered heterocyclic rings, activity is better with the furfuryl group. The developed topological models have useful potential for designing lead structures for the development of further potent anti-HIV agents. (B. B.)

Title: Geometry, topology, and atom-weights assembly descriptors to predicting A1 adenosine receptors agonists.
Authors: González*, M. P.; Terán, C.; Teijeira, M.; Besada, P. Service Unit, Experimental Sugar Cane Station Villa Clara-Cienfuegos Ranchuelo, Villa Clara C. P. 53100, Cuba.
[Topological, Galvez Topological Charges indexes, Randic Molecular Profiles, Geometrical, WHIM, calculated using DRAGON software, and GETAWAY descriptors: CIC1, complementary information content (neighborhood symmetry of 1-order); SP12, shape profile no. 12; SP03, shape profile no. 03; SP04, shape profile no. 04; SP12, shape profile no. 12; FDI, folding degree index; H8v, H autocorrelation of lag 8/weighted by atomic van der Waals volumes; REIG, first eigenvalue of the R matrix; R2u+, R maximal autocorrelation of lag 2/unweighted; R7u+, R maximal autocorrelation of lag 7/unweighted; R5v, R autocorrelation of lag 5/weighted by atomic van der Waals volumes; R1v+, R maximal autocorrelation of lag 1/weighted by atomic van der Waals volumes; T(N..S), sum of topological distances between N..S; piPC09, molecular multiple path count of order 09; MPC09, molecular path count of order 09; VRA1, Randic-type eigenvector-based index from adjacency matrix; MSD, mean square distance index (Balaban); DP01, molecular profile no. 01; SP07, shape profile no. 07; SP13,
shape profile no. 13; JGI5, mean topological charge index of order 5; GGI10, topological charge index of order 10; GGI9, topological charge index of order 9; GGI8, topological charge index of order 8; GGI3, topological charge index of order 3; GGI2, topological charge index of order 2; W3D, 3D-Wiener index; AGDD, average geometric distance degree; DDI, D/D index; ADDD, average distance/distance degree; MAXDP, maximal electrotopological positive variation; Gu, G total symmetry index/unweighted; E2s, second component accessibility directional WHIM index/weighted by atomic electrotopological states; L2s, second component size directional WHIM index/weighted by atomic electrotopological states; G3m, third component symmetry directional WHIM index/weighted by atomic masses; P2m, second component shape directional WHIM index/weighted by atomic masses; E3u, third component accessibility directional WHIM index/unweighted]; s_cv (cross-validated residual standard deviation of the predicted value); q² (cross-validated correlation coefficient).
Results: The GETAWAY descriptors have been used for modeling the quantitative structure-activity relationships of 32 A1 adenosine receptor agonists of type I. A regression model (Eq. 1) has been developed accounting for more than 77% of the variance in the experimental activity (K_i). Five further regression models employing topological, Galvez topological charges indexes, Randic molecular profiles, geometrical, and WHIM descriptors failed to give satisfactory models (R² = 0.70) for this property with the same number of variables in the equation (descriptor type, r, s, F, q², s_cv): topological, 0.
837, 0.439, 9.8, 0.491, 0.573; Galvez topological charges indexes, 0.816, 0.464, 8.3, 0.427, 0.608; Randic molecular profiles, 0.800, 0.482, 7.4, 0.403, 0.620; geometrical, 0.789, 0.493, 6.9, 0.366, 0.640; WHIM, 0.879, 0, 383, 0.481, 7.5, 0.371, 0.637

[generation of electronic parameters such as dipole moment (μ), orbital energies (E_HOMO, E_LUMO), partial atomic charges (e.g., net charge on the epoxide oxygen atom, Q_O) and electrostatic potential minima (V_min) using the semiempirical AM1 method]; logP (logarithm of the partition coefficient in 1-octanol/water, estimated using CLogP software).
Results: Molecular modelling of human microsomal EH by homology with the A. niger EH crystal structure has been carried out to 1.8 Å resolution. In the homology model the active site lies in a well-defined, essentially hydrophobic pocket within the enzyme structure. Two tyrosine residues, which are conserved in all known mammalian EH sequences, form hydrogen bonds with the epoxide oxygen atom of the known EH substrate, styrene oxide. A small hydrophobic cleft is present in the active site region, where the phenyl group of styrene oxide can bind, but this cleft appears to be of limited size, such that bulky side-chains of an incoming epoxide molecule may obstruct binding. The inhibitory activity of epoxide hydrolase substrates results from an association between a relatively low Q_O value for hydrogen bonding to the active site tyrosines and a fairly high lipophilicity in the form of logP, which is related to a combination of the molecular size of the substituent and its overall hydrophobic character. This feature set optimizes affinity for the essentially hydrophobic active site region lined by complementary amino acid residues such as leucine, methionine and tryptophan. Consequently, logP/Q_O provided a relevant descriptor for the explanation of potency differences within the set of epoxides, as shown by Eqs. 1-4.
Biological material: Human cell lines expressing cyclooxygenase 2 (COX-2).
Data taken from the literature: IC50 [concentration of the test substance (mM) required for 50% inhibition of COX-2].
Computational methods: Molecular modeling [structures were pre-optimized using the MM+ molecular mechanics force field implemented in HyperChem; full optimization was performed using the semiempirical PM3 method implemented in MOPAC employing the Polak-Ribiere algorithm]; SVM [Support Vector Machine approach for classification and regression problems based on the "structural risk minimization" principle, which defines the trade-off between the approximation quality of a given dataset and the complexity of the approximating function, as opposed to the empirical risk minimization concept, which concentrates on the approximation quality for the dataset (theory is given); all calculation programs implementing SVM were written as R scripts using Libsvm; the kNN algorithm was also run in R; all scripts were run on a Pentium IV PC with 256 MB RAM]; CODESSA (Comprehensive Descriptors for Structural and Statistical Analysis, a chemical multipurpose QSAR/QSPR statistical analysis and prediction program package for the calculation of constitutional, topological, geometrical, electrostatic, quantum mechanical, and thermodynamic descriptors solely from the structure of compounds and searching for the best multiple linear relationships between the calculated descriptors and experimental property data; CODESSA v2.61 employs the "Heuristic Method" and the "Best Multilinear Regression Method"); kNN [k-Nearest Neighbors, a pattern recognition method in which the distance (usually Euclidean) between the pattern vector of an unknown and each of the pattern vectors of the training set is first computed; the k nearest samples to the unknown are then selected, and the unknown is classified in the group to which the majority of the k (k = 3) samples belong].
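The kNN rule described above (Euclidean distance, k = 3, majority vote) can be written compactly. The following Python sketch is only an illustration of that rule: the abstract's calculations were run in R, and the function name, the two-descriptor vectors and the class labels below are invented for this example.

```python
from collections import Counter
from math import dist


def knn_classify(train_vectors, train_labels, query, k=3):
    """Assign query to the majority class among its k nearest training vectors."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: dist(pair[0], query),  # Euclidean distance
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-descriptor training set; values are invented for illustration.
X = [(0.1, 0.2), (0.2, 0.1), (0.0, 0.3), (1.0, 1.1), (0.9, 1.0), (1.2, 0.8)]
y = ["inactive", "inactive", "inactive", "active", "active", "active"]
print(knn_classify(X, y, (0.95, 0.9)))
```

With two classes and an odd k there are no tied votes; with more classes a tie-breaking rule would have to be chosen explicitly.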
Data calculated: Descriptors [selected CODESSA descriptors: RPCS relative positive charged SA (RPCS); HACA-2/TMSA (HACA2/T); HACA-1 (HACA-1); average structural information content (order 0) (ASIC0); number of benzene rings (NBR); PNSA-1 partial negative surface area (PNSA 1); average bond order of a N atom (ABON)]; rmse [root mean square deviation (mM) between the experimental and predicted IC50 values].
Results: QSAR and classification models have been developed for a novel set of COX-2 selective inhibitors of types I-V employing SVM methodology. Each compound was described by CODESSA descriptors. The heuristic method was used to search the descriptor space and select the descriptors relevant to inhibitory activity. The approach yielded a seven-descriptor model based on SVMs with root mean square errors of 0.107 and 0.136 mM for the training and prediction sets, respectively. The best classification results were derived using SVMs.

Results: A QSAR study of novel antibacterial BSFQs obtained by derivatization of the N4-piperazinyl atom of ciprofloxacin (CIP) has been performed. The behavior of the new BSFQ series was similar to that of the previously studied norfloxacin (NOR) analogs, allowing the QSAR analysis of a complete set of BSFQs. Hansch analysis of the activity data showed a linear correlation of the activity with electronic and steric parameters. Small electron-donor groups increased the in vitro activity against Gram-positive bacteria. Hydrophobic properties played a relatively minor role in modulating MIC values, while lipophilic parameters obtained from HPLC turned out to be more reliable descriptors. In this study the amino and the methylamino derivatives were the most active analogs. Single-parameter regression equations for pMIC were statistically poor. The best multilinear regression equations were the following (Eqs.
1-5):

Descriptors (22 MOE descriptors including hydrophobicity descriptors, refractivity descriptors, atomic partial charge descriptors, and topological descriptors, as well as 65 ISIS keys); E-state index (non-empirical description of an atom in a molecule in which both electronic and topological attributes are unified into a single index for each skeletal atom (or hydride group)); logP (logarithm of the partition coefficient in 1-octanol/water, calculated by the method of Wildman and Crippen as implemented in MOE); RMSEC, RMSECV (root-mean-square error of calibration and root-mean-square error of cross-validation, respectively); RMSE (root-mean-square error between calculated and observed logS_w values).
Results: Linear and nonlinear methods (PLS, CR, and NN) have been employed for modeling the aqueous solubility of 930 diverse organic compounds described by 1D and 2D descriptors from the MOE package in combination with E-state or ISIS keys. The best model was derived using linear PLS analysis employing a combination of 22 MOE descriptors and 65 ISIS keys, showing r² = 0.935 and RMSE = 0.468 (logS_w). The model was validated using a test set of 177 compounds that were not included in the training set, giving r² = 0.911 and RMSE = 0.475 (logS_w). Fig. 1 shows the plot of the predicted versus experimental solubility values for the test set. The descriptors were ranked according to their relevance to the model, with the 22 MOE descriptors at the top of the list. The results produced by the CR model were statistically as good as those of the PLS model, owing to the cross-validation method employed (RMSEC, RMSEP, RMSD). The nonlinear methods did not perform better than the linear ones.
The good statistical quality of the linear PLS and CR models renders them the methods of choice for predictions when it is difficult or impossible to make experimental measurements, e.g., for virtual screening, combinatorial library design, and lead optimization. (B. B.)

Title: A rapid screening tool for estimating the potential of 2-hydroxypropyl-β-cyclodextrin complexation for solubilization purposes.
Authors [molecular descriptors: X1: molecular weight, MW (g/mol); X2: molecular volume, MV (cm³); X3: melting point (MP); X4: molecular refractivity (CMR); X5: logP; X6: topological surface area, TPSA (Å²); X7: total number of hydrogen bonds (H_tot); X8: number of oxygen and nitrogen atoms (n_ON); X9: number of OH and NH groups (n_OHNH); X10: total solubility parameter (δ_tot); X11: partial solubility parameter, dispersion component (δ_d); X12: partial solubility parameter, polar component (δ_p); X13: partial solubility parameter, hydrogen bonding component (δ_h); X14: combined partial solubility parameter (δ_v)].
Results: A rapid screening tool has been developed for estimating the potential of HP-β-CD complexation for solubilization purposes. Quantitative structure-property relationship (QSPR) models were developed for predicting the solubility enhancement of compounds in a 45% (w/v) aqueous solution of HP-β-CD. A set of 25 structurally different drugs with known experimental log(S/S_0) values was used as a training set for building the models. Molecular descriptors, including parameters for size, lipophilicity, cohesive energy density and hydrogen-bonding capacity, were used in the multivariate analysis. Eight relevant variables were identified by using PCA and CA. The first and second components extracted by PCA explained 46% and 20% of the variance in log(S/S_0), respectively. Seven calculated predictors (MW, MV, CMR, ClogP, TPSA, H_tot, and δ_tot) and the experimental one (MP) were selected by PCA and CA.
Two statistically significant four-descriptor models were generated using the MLR and PLS methods.
(2) n = 25, r = 0.873, s not given, F not given, q² = 0.605
External model validation was carried out employing a test set of six compounds that were representative of the training set used. Of the six compounds in the test set, only one, namely Zolpidem, had a residual of 1.09 log units, whereas the remaining five compounds were predicted with a residual of < 1 log unit. Fig. 1 shows the plot of the predicted versus observed log(S/S_0) values, where bars represent the standard error of prediction. These equations can allow formulation scientists to rapidly estimate the potential of HP-β-CD for increasing the solubility of poorly water-soluble drugs at an early stage of drug development. The role of the selected descriptors in the models could be reasonably rationalized. The effect of hydrogen bonding and cohesive forces on the solubility of drug-CD complexed species, however, seems very complicated and difficult to understand. (B. B.)

Title: QSPR treatment of the soil sorption coefficients of organic pollutants.
Authors (maximum π-π bond order); PNSA-1 (partial negative surface area, AM1); q² (cross-validated correlation coefficient).
Results: General and class-specific QSPR models have been developed for soil sorption (logK_OC) of 344 organic pollutants (0 < logK_OC < 4.94) using a large diverse set of theoretical molecular descriptors based only on molecular structure. Two general QSAR models were obtained. The two-parameter Model_1 (logP and h) was derived for a structurally representative set of 68 chemicals, showing R² = 0.76 and s = 0.44. The four-parameter Model_2 (logP, PNSA-1, h, maximum π-π bond order) involved a total of 344 compounds, displaying R² = 0.76 and s = 0.41. Fig.
1 shows the plot of the calculated versus experimental logK_OC values by Model_2 for 344 pollutants, where three compounds denoted by lines (near circles) had the largest residuals. Model_1 was validated using the test set comprising the remaining 276 pollutants (R² = 0.70, s = 0.45). An additional validation of both models was carried out employing an independent set of 48 pollutants. Both Model_1 and Model_2 predicted the logK_OC values at the level of experimental precision. The theoretical molecular descriptors used in the QSPR models yielded further insight into the mechanisms of soil sorption. Analysis of the distribution of the residuals of the logK_OC values calculated by both general models indicated the need and possible advantages of modeling soil sorption for smaller data sets of individual classes of organic pollutants. Accordingly, QSPR models were also developed for the 14 individual chemical classes. The descriptors used in these models were examined and related to the possible interaction mechanisms in soil sorption. The major molecular properties relevant to soil sorption were hydrophobicity, size and shape of the compounds and charge distribution. A larger size and bulkier shape favor nonspecific interactions with the soil constituents and the humic matrix. The charge distribution describes nonspecific polar and specific interactions.
Computational methods: MOLGEN-QSPR (software combining structure generation and the calculation of many molecular descriptors with data processing employing various multivariate and statistical methods); CODESSA (Comprehensive Descriptors for Structural and Statistical Analysis, a chemical multipurpose QSAR/QSPR statistical analysis and prediction program package for the calculation of constitutional, topological, geometrical, electrostatic, quantum mechanical, and thermodynamic descriptors solely from the structure of compounds and searching for the best multiple linear relationships between the calculated descriptors and experimental property data; CODESSA employs the "Heuristic Method" and the "Best Multilinear Regression Method"); MLR (Multivariate Linear Regression analysis); LOO (Leave-One-Out cross-validation).
Data calculated: [all substructures of one to four bonds in the fluoroalkanes considered: C-C, C-F, C-C-C, F-C-F, C-C-F, C-C-C-C, C-C(C)-C, F2-C-C, F-C-C-F, F3-C, C-C-C-C-F, C-CF-C, F3-C, F2-C-F, C-CF-C-C, C-C-C-C-F, C-CF(C)-C, C-C(C)-C-F, F-C-C-C-F, F-C-CF-C, F2-C-C, C-CF2-C]; Descriptors [further descriptors: bip = n(CHF2) + n(CH2F), tbip = bip + n(C-CHF-C); xsF = N_F - n(CH2F) - N_H + n(CHF2), unless this number is negative, in which case xsF is set to 0; n(CF3)², n(CH3)², n(CF3)·n(CH3); (rel(N_F))² = (N_F/number of all atoms)², (F_rate)² = (N_F/(2N_C + 2))²; ²TC^V and ³TM1 = Bonchev overall indices; HCCF_man = number of HCCF fragments as counted by Wolf; 3P = number of paths of length 2 in the H-suppressed molecular graph]; Xu_m (modified Xu index); S_i (E-state index for atom i, a structure representation based on atom-level topological and electronic information, e.g., S(ssssC), sum of E-state indices of carbon atoms bearing no hydrogen); T_i (zero-order valence-type molecular connectivity index, ${}^{0}\chi^{v}$; fourth-order path molecular connectivity index, ${}^{4}\chi_{P}$);
r²_cv (cross-validated correlation coefficient); s_cv (cross-validated residual standard deviation of the predicted value).
Results: MOLGEN-QSPR has been used for QSPR modeling of the boiling points of lower fluoroalkanes. The derived models are based exclusively on simple descriptors derived from molecular structure. The following multilinear MOLGEN-QSPR models were calculated for the set of fluoroalkanes (Eqs. 1-3).
$BP = 83.2226\,Xu_{m} - 25.1841\,{}^{0}\chi^{v} - 12.2045\,{}^{4}\chi_{P} - 5.21304\,S(\mathrm{ssssC}) - 34.247\,n(\mathrm{F{-}C{-}F}) - 9.2969\,n(\mathrm{F{-}C{-}C{-}F}) - 41.3515$
The models described the BPs of nearly twice as many fluoroalkanes to a higher precision than did any previous attempt with the same statistical method and the same number of descriptors. The predictive capability of the models was validated by LOO cross-validation. (B. B.)

Title: Theoretical analysis of the retention behavior of alcohols in gas chromatography.
Authors
Results: A quantitative structure-retention relationship (QSRR) analysis of the gas chromatographic retention times of alcohols has been performed. The QSPR model has been developed using MLR and a set of molecular descriptors of the alcohols calculated using AM1 and ab initio methods. Statistically highly significant linear regression equations were obtained for the retention times.

(Partial Least Squares projections to latent structures analysis performed using SIMCA-P v9.0); LOO (Leave-One-Out cross-validation).
Data calculated: Descriptors (a total of 162 descriptors were calculated using the AM1 method and 121 descriptors using the B3LYP/6-31G** method: 4 descriptors based on molecular orbital energies, 31 descriptors based on charge distribution, 75 descriptors based on molecular surface area and net atomic charges, 41 descriptors based on molecular orbital wave functions and energies, 11 geometrical descriptors); logP (logarithm of the partition coefficient in 1-octanol/water, calculated using PALLAS v3.0); RMSE (root mean square error of the fit); r²_cv (cross-validated correlation coefficient, calculated using MINITAB v13.1); q² (cross-validated correlation coefficient for the PLS model).
Results: The electronic descriptors of the compounds in a large and chemically diverse set of organic compounds were calculated using semiempirical and ab initio methods for the development of toxicological quantitative structure-activity relationship models employing MLR and PLS analyses. The use of logP as a single parameter yielded the following regression equation (8 outliers were omitted) (Eq. 1).
$pLC_{50} = 0.700\,(\pm 0.022)\,\log P - 0.720\,(\pm 0.058)$ (1)
n = 560, r = 0.806, s = 0.803, F = 1034, r²_cv = 0.647
Fig. 1 shows the plot of pLC50 versus logP, where the outliers are denoted by empty circles. Statistically similar and significant models were obtained using AM1- and B3LYP-calculated descriptors together with calculated logP values. The quality of the models derived using the two sets of quantum chemical descriptors depended mainly on the type of descriptors employed. It was found that for modeling large data sets (n = 568) irrespective of the mechanism of toxic action, the use of precise but computationally demanding ab initio descriptors did not offer a considerable advantage over the semiempirical ones.
For the best MLR models the following statistical parameters were obtained (model, descriptor calculation method, number of parameters, r², r²_cv, s,

(the PETRA program package employs various physicochemical methods for the calculation of atomic physicochemical properties of organic compounds, e.g., total, σ- and π-charge distribution, atomic polarizability, lone-pair, σ- and π-electronegativities, and molecular properties such as heat of formation, molecular polarizability, etc.); T_i (Hotelling score calculated by a method used for the determination of the prediction space covered by the developed models, based on Hotelling's T² statistic including PCA).
Results: Structure descriptors were employed to discriminate between modes of toxic action (MOAs) of phenols utilizing two classification models based on a data set of 220 phenols with four associated MOAs. CPG NN and multinom were used as classification methods. The combination of topological autocorrelation of empirical π-charge and σ-electronegativity and of surface autocorrelation of hydrogen-bonding potential yielded a 21-dimensional model that significantly discriminated between the four MOAs. Fig. 1 shows the PCA-score plot of the training set with 21 descriptors, where the ellipse defines the 95% confidence region (full circles, full triangles, x signs, and open triangles denote polar narcotics, uncouplers, proelectrophiles, and soft electrophiles, respectively). The calculation of such descriptors was very fast, which made them ideal for the screening of databases. The predictive power of the approach was found to be equal to that of previously published quantum mechanics-based descriptors. The overall predictive power of the approach was estimated to be 92% using 5-fold CV. In the next step, a simple score value for the distance to the training data was used to determine the prediction space of the model and employed in a study of the phenols included in the NCI database.
The use of a prediction space metric proved to be an essential requisite for the screening of this very diverse database. The prediction space covered by the developed model was rather limited due either to the limited diversity and size of the training set or to the high dimensionality of the descriptors employed. (B

(sum of the squared deviation between the predicted and measured binding affinities for every molecule); LOO (Leave-One-Out cross-validation); q² (cross-validated correlation coefficient).
Results: Upon sunlight exposure, PAHs undergo rapid structural modification, generally via oxidation reactions. The modified products are in many cases more toxic than their parent compounds. The toxicity is due to the π-orbital system of PAHs, which strongly absorbs in the UV and visible regions of the solar spectrum. In this work a QSAR study of 67 PAHs has been performed and a prediction rule for the phototoxicity of these compounds was proposed. E_HOMO, E_LUMO and GAP were used as descriptors. The relationships between these molecular descriptors and the photo-induced toxicity were found to be non-linear, and Gaussian-type functions were used to linearize them. Statistically significant PLS models were calculated, and the models were validated by predicting the phototoxicity of a set of molecules not used in model development. Pentaphene of type I, benzo[b]chrysene of type II, and dibenz[a,j]anthracene were among the compounds predicted by the model to be potentially phototoxic. A new GAP range (7.2 ± 0.7 eV) was proposed for the classification of phototoxic compounds, and a larger cutoff was suggested for the normalized lethal time as log(1/ALT) = -2.95. An unscaled regression equation was obtained from the PLS model (Eq. 1) by using a routine for unscaling the regression vector obtained from autoscaled data.
(1) n = 13, r = 0.92, s not given, F not given, q² = 0.79
The photo-induced toxicity of 53 PAHs was estimated from the QSAR study.
Based on the results, a new scale for toxic compounds was proposed and the predicted values of phototoxicity allowed the classification of these molecules into toxic or non-toxic. (B. B.) Title: Quantitative structure-activity relationships of nitroaromatics toxicity to the algae (Scenedesmus obliguus). Authors 2-nitroanisole, 3-nitrophenol, 2, 2, 1, 1, 1, 2, 2, 2, 3, (charge of the nitro group). Results: A QSAR study of the toxicity of a set of 25 nitroaromatics against algae (S. obliguus) has been performed using MLR. The descriptors logP, ELUMO, and QNO2 were used as independent variables to develop the QSAR models. For 18 mononitro derivatives, the hydrophobicity parameter logP explained the toxic activity successfully (Eq. 1). The use of ELUMO and QNO2 yielded Eq. 2 for all the compounds. Omitting three outliers gave the statistically highly significant Eq. 3. Eqs. 1 and 3 indicate that there are two different mechanisms for the mono- and dinitro aromatic compounds. The toxicity of mononitro aromatics to the algae was found to be proportional to their hydrophobicity. However, for dinitro aromatic compounds, due to the electrophilic nature of the parent compounds, there is probably a mechanism of stepwise reduction of −QNO2. It was suggested that Eq. 3 can be used to predict the toxicity of nitro aromatic compounds against the algae studied here. (B. B.) Title: Toward an optimal procedure for PC-ANN model building. Prediction of the carcinogenic activity of a large set of drugs. Authors [root mean square error of the cross-validated model). Results: The performances of three novel QSAR algorithms, ER-PC-ANN, CR-PC-ANN, and PC-GA-ANN, were compared by applying these models to the prediction of the carcinogenic activity of a diverse set of 735 drugs, each described by a total of 1350 theoretical descriptors. The data matrix (735 × 1350 dimension) was subjected to PCA.
PCA explained 95% of the variance in the matrix by the first 137 principal components (PCs). ER, CR, and GA were employed to select the best PCs for PC-ANN modeling. In the ER-PC-ANN approach, the PCs were entered stepwise into the ANN in order of decreasing eigenvalue. In the CR-PC-ANN approach, the ANN was first employed to model the nonlinear relationship between each of the PCs and the carcinogenic activities separately. In the next step, the PCs were ranked by decreasing correlating ability and entered into the input layer of the ANN one after another. Finally, GA was used to find the best set of PCs. Both external and cross-validation methods were used to validate the predictive performances of the derived models (model, r²cv, RMSEcv): ER-PC-ANN, 0.714, 1.89; CR-PC-ANN, 0.911, 0.95; PC-GA-ANN, 0.922, 0.88. Fig. 1 shows the plot of calculated versus experimental activity for the test set compounds using the PC-GA-ANN model. It was found that the PC-GA-ANN and CR-PC-ANN procedures outperformed the ER-PC-ANN procedure. The PC-GA-ANN algorithm gave better results than CR-PC-ANN; however, the difference was not significant. Biological material: Kidney, brain, muscle, lung, liver, heart, and fat. Data taken from the literature: PCt (tissue-blood partition coefficient of 36 organic chemicals for human fat, kidney, muscle, lung, and heart, and of 10 compounds for rabbit fat, kidney, muscle, lung, and heart). Computational methods: MLR (Multivariate Nonlinear Regression analysis was performed using the standard regression analysis program GFA BASIC v4.38); LOO (Leave-One-Out cross-validation).
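Leave-one-out cross-validation, PRESS, and q² recur throughout these abstracts. The sketch below shows how they relate for the simplest case, a one-descriptor least-squares line, using the common definition q² = 1 − PRESS/SS, where SS is the sum of squared deviations of the observed values from their mean. The function names and the data are hypothetical, and none of the abstracted studies used this particular code.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def loo_q2(xs, ys):
    """Return (q2, PRESS) from leave-one-out cross-validation."""
    press = 0.0
    for i in range(len(xs)):
        # Refit without compound i, then predict compound i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss, press

# Hypothetical descriptor and activity values, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [1.1, 1.9, 3.2, 3.9, 5.1]
q2, press = loo_q2(descriptor, activity)
print(round(q2, 3), round(press, 3))
```

Because each compound is predicted by a model that never saw it, q² is always at or below the fitted r², which is why the abstracts report both.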
Data calculated: Descriptors (molecular descriptors were calculated using the HYBOT/HYBOT-PLUS-98 program package: molecular polarizability, α; maximum positive charge, q+max; sum of all positive partial atomic charges for all acceptor substructures in the molecule, ΣCa; sum of the H-bond factor values for all donor atoms in a molecule, ΣCd; maximum H-bond acceptor descriptor in a molecule, Camax); fui [fraction of unionized and ionized basic compounds, calculated using Eq. 1]; PRESS (sum of the squared deviation between the predicted and measured binding affinities for every molecule). Results: Recently the author has developed a nonlinear model for the tissue-blood partition coefficients of neutral compounds. In this study a new approach is presented for the tissue-blood partition coefficients of ionized compounds partitioning into kidney, brain, muscle, lung, liver, heart, and fat. A nonlinear model equation based on tissue composition (content of lipids, proteins, and water) for the tissue-blood partition coefficients of compounds was further developed to account for the neutral and ionized forms of the compounds (the complex equations of the nonlinear model are given). Using the developed model, a nonlinear regression analysis for neutral and ionized compounds partitioning into kidney, brain, muscle, lung, liver, heart, and fat yielded equations with high predictive power for the training set (n = 201, r = 0.905, s = 0.291, q = 0.890) and test set compounds (n = 64, r = 0.906, s = 0.247). Fig. 1 shows the plot of the calculated versus experimental PCt values for 201 data points in the training set. The results demonstrated that the equilibrium distribution of a compound in several tissues is essentially the equilibrium distribution of the compound in tissue (chemical) compositions.
It was also shown that the neutral and ionic forms of a compound, as well as different tissue (chemical) compositions, have different mechanisms of action in vivo. The nonlinear model equation may be considered as the expressive form of the Hansch equation in nonlinear spaces (or multiphase systems). (B. B.) [concentration of the test substance (dimension not given) required for 50% inhibition of GPa]. Computational methods: LUDI (program implemented in InsightII to determine possible binding geometries for a ligand that interacts with hydrogen bonding and hydrophobic sites of the receptor using statistical data from small-molecule crystal structures); CoMFA [Comparative Molecular Field Analysis of the molecules was carried out represented by their steric and electrostatic fields sampled at the intersections of a three-dimensional lattice (2 Å grid increment) using an sp3 carbon atom probe with a charge of +1, H-bonding fields, indicator fields, and parabolic fields were also used, and all regression analyses were done using PLS algorithms in SYBYL v6.9]; CoMSIA [Comparative Molecular Similarity Indices Analysis of the molecules was carried out as an alternative approach to CoMFA based on similarity indices calculated at the intersections of a three-dimensional lattice, the five physicochemical properties for CoMSIA (steric, electrostatic, hydrophobic, and hydrogen bond donor and acceptor) were evaluated using a common probe atom with 1 Å radius, +1.0 charge, and hydrophobicity and hydrogen bond property values of +1, the value of the attenuation factor α was 0.3 for the Gaussian-type distance dependence]; LOO (Leave-One-Out cross-validation); MLR (Multivariate Linear Regression analysis).
Data calculated: CFH (the common features generated by the Common Feature Hypothesis approach: A, hydrogen-bond acceptor; R, aromatic ring; H, hydrophobic); Cont (steric contact scores); Lipo Score (Ludi score function value); No.rot.bonds (number of rotatable bonds); (sum of all scores); sPRESS (standard deviation of cross-validated predictions); r²pred (predictive correlation coefficient); q² (cross-validated correlation coefficient); No.HB (number of hydrogen bonds). Results: Glycogen phosphorylase (GPa) is an attractive target for the design of inhibitors that may prevent glycogenolysis at high glucose levels in type II diabetes. The carboxamides represent one of the major classes of GP inhibitors other than glucose derivatives. In this study CoMFA methodology and docking of ligands into GPa were used to elucidate the essential structural and physicochemical requirements responsible for binding to the GPa enzyme and to develop predictive models for designing indole-2-carboxamide derivatives. The Common Feature Hypothesis approach generated 10 hypotheses containing the RHDA, RHDD, HHDA, or HHDD features. All 25 molecules belonging to the training and test sets mapped every feature of all hypotheses. Fig. 1 shows the structure of the template molecule of type II with R1 = Cl, R2 = F, and R3 = CO(1-piperidin-4-ol), onto which all the other 24 molecules were superimposed using the RMS fitting procedure. Compounds: a) 57 COX-2 inhibitors comprising the cyclopentene-, spiroheptene-, benzene-, pyrazole-, pyrrole-, imidazole-, pyrimidine-, isoxazole-, thiazolone-, thiadiazole-, and oxadiazole classes of compounds, e.g., 5 cyclopentene class compounds of type I, where R1 = diverse aromatic moieties, R2 = NH2, CH3; b) SC-558, a selective COX-2 inhibitor.
(Model 1), a second mode of alignment was carried out using docking strategies with the DOCK v4.0 program (Model 2), the ligand structures were refined using the BFGS method, the molecules aligned in this manner were subjected to CoMFA analysis); DOCK [program for finding potential docking sites on proteins of known structure by starting with the solvent accessible surface, and filling cavities with overlapping spheres to make binding pockets, ligands of known structure (e.g., found by searching a database) are then automatically docked into this site]; AFFINITY (program based on both Monte Carlo and simulated annealing strategies, to generate simultaneously the conformations and alignment needed for the CoMFA study); CoMFA [Comparative Molecular Field Analysis of the molecules was carried out represented by their steric and electrostatic fields sampled at the intersections of a three-dimensional lattice (2 Å grid increment) using an sp3 carbon atom probe with a charge of +1, all regression analyses were done using PLS algorithms in SYBYL v6.7]; PLS (Partial Least Squares projections to latent structures analysis); LOO (Leave-One-Out cross-validation). Data calculated: SDEP (Standard Deviation Error in Prediction); PRESS (sum of the squared deviation between the predicted and measured binding affinities for every molecule); q² (cross-validated correlation coefficient). Results: A CoMFA study of a diverse training set of 53 COX-2 inhibitors has been performed employing two different alignment methods. The first method of alignment of the molecules was based on the binding information obtained from a crystallographic study, which yielded CoMFA Model 1. The second mode of alignment was generated by docking the inhibitors in the binding pocket using the DOCK v4.0 and AFFINITY suite of programs, which yielded CoMFA Model 2.
Model 2 was slightly better than Model 1 in terms of the statistical parameters r² and q² (model, n, Nopt, q², PRESS, r², s, F, SDEP, steric, electrostatic molecular fields): Model 1, 33, 6, 0.624, 0.985, 0.971, 0.274, 145.1, 0.455, 0.327, 0.673; Model 2, 34, 6, 0.733, 0.804, 0.989, 0.160, 418.6, 0.509, 0.297, 0.703. Fig. 1 shows the plot of the predicted versus observed pIC50 values for the test set calculated with Model 2. Title: 3D-QSAR and preliminary evaluation of anti-inflammatory activity of series of N-pyrrolylcarboxylic acids. Authors Crystal structure (atomic coordinates of the structure of SC-558 bound to COX-2 were taken from the Brookhaven Protein Data Bank). Computational methods: Molecular modeling (modelling calculations were performed using Sybyl v6.6 software, geometry optimizations were performed employing the Tripos force field with the Powell method, the AM1 semiempirical method implemented in Mopac v6.0 was applied for quantum chemistry calculations, the crystal structure of SC-558 bound to COX-2 was used as a template structure for building SC-558 derivatives, the following structural features were used for the alignment of the compounds: (i) the centroid of the R1 substituted phenyl ring, (ii) the carbon atom of the two or three fluoro substituted methyl group and the sulfur atom of the phenylsulfonamide group); CoMFA [Comparative Molecular Field Analysis of the molecules was carried out represented by their steric and electrostatic fields sampled at the intersections of a three-dimensional lattice (2 Å grid increment) using an sp3 carbon atom probe with a charge of +1, all regression analyses were done using PLS algorithms in SYBYL v6.6]; CoMSIA [Comparative Molecular Similarity Indices Analysis of the molecules was carried out as an alternative approach to CoMFA based on similarity indices calculated at the intersections of a three-dimensional lattice, the five physicochemical properties for CoMSIA (steric,
electrostatic, hydrophobic, and hydrogen bond donor and acceptor) were evaluated using a common probe atom with 1 Å radius, +1.0 charge, and hydrophobicity and hydrogen bond property values of +1, the value of the attenuation factor α was 0.3 for the Gaussian-type distance dependence]. Data calculated: pKa (negative logarithm of the acidic dissociation constant, calculated with ACD/ChemSketch software); SEP (Standard Error of Prediction); q² (cross-validated correlation coefficient). Results: The study aimed at the development of new potential inhibitors of COX-2. 3D-QSARs of compounds of type I were investigated using CoMFA and CoMSIA methodologies. Statistically significant CoMFA and CoMSIA models were calculated using SC-558 as template structure (model, n, Nopt, molecular field, q², SEP): CoMFA, 14, 3, electrostatic, 0.761, 0.284; CoMFA, 14, 5, steric, 0.667, 0.374; CoMSIA, 14, 4, electrostatic, 0.828, 0.253; CoMSIA, 14, 5, steric, 0.625, 0.397; CoMSIA, 15, 3, electrostatic, 0.837, 0.564; CoMSIA, 15, 5, hydrophobic, 0.902, 0.483. Fig. 1 shows the predicted COX-2 activities of the investigated compounds calculated using the developed CoMSIA model (electrostatic field, q² = 0.837), where the compound predicted to be most active (16) is type I with R1 = 4-CO2H, R2 = CHF2. The derived new compounds, with high similarity to the template of already recognized selective COX-2 inhibitors, will be the object of a forthcoming evaluation of COX-2 selectivity in vitro. (B. B.) Title: Three-dimensional quantitative structure-activity and structure-selectivity relationships of dihydrofolate reductase inhibitors. Authors -(3,4) -dimethyl-4-(3-hydroxyphenyl)piperidines of type II, a class of opioid antagonists, recently provided selective antagonists for three subtypes opioid receptors (MOR and KOR).
Molecular modeling studies have been performed indicating strong structural similarity between the parent of this series of compounds and 2-amino-1,1-dimethyl-7-hydroxytetralin of type I. It has been established that type I represents a new scaffold for opioid receptor antagonism. In binding and in vitro functional assays, the aminotetralin derivatives showed some overlap in structure-activity relationships with that previously reported for the phenylpiperidine series, providing evidence for a common binding mode for the two series at these opioid receptors. Fig. 1 shows the overlay of type 4a with type II with R1 = cinnamyl, R2 = H (IIa). Introduction of a methoxy group in the 3-position of the skeleton increased potency at MOR and KOR receptors, suggesting that this aminotetralin derivative offers an alternative scaffold for the design of further receptor selective opioid ligands. The ligands designed and synthesized displayed opioid binding affinity comparable to that of their analogues of type II. (B. B.) Title: Molecular-modeling based design, synthesis, and activity of substituted piperidines as γ-secretase inhibitors. Authors Computational methods: Molecular modeling (a conformational search based on the Monte Carlo method in Macromodel was performed on type I, the MMFF94 force field was used in the energy minimization steps, the low-energy conformer of type I was used as the query, using a Tanimoto coefficient cut-off of 0.7, the search of an in-house database employing ROCS yielded about 500 hits); ROCS (program for finding new scaffolds for lead development, the ROCS program identifies molecules that have a similar 3D shape, the similarity search requires the input of a compound query to make comparisons against a compound database, in this study the degree of shape similarity was scored by a Tanimoto coefficient).
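The Tanimoto coefficient used for the cut-off above is defined in this abstract through the bit counts N(A), N(B), and N(AB). A minimal sketch of that bit-based form, with a 0.7 cut-off filter, is given below. Representing each molecule as a set of on-bit indices is an assumption made for illustration, as are the function names and the fingerprints; ROCS itself compares 3D shapes, and this is not its implementation.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient N(AB) / (N(A) + N(B) - N(AB)) for two sets of on-bits."""
    a, b = set(bits_a), set(bits_b)
    common = len(a & b)
    union = len(a) + len(b) - common
    # Two empty fingerprints are scored 0.0 by convention in this sketch.
    return common / union if union else 0.0

def filter_hits(query, database, cutoff=0.7):
    """Return (name, score) pairs at or above the cutoff, best first."""
    hits = []
    for name, bits in database.items():
        score = tanimoto(query, bits)
        if score >= cutoff:
            hits.append((name, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

# Hypothetical fingerprints, for illustration only.
query = {1, 2, 3, 4, 5, 6, 7, 8}
database = {"cpd_a": {1, 2, 3, 4, 5, 6, 7, 9}, "cpd_b": {1, 2, 20, 21}}
print(filter_hits(query, database))
```

The coefficient ranges from 0 (no bits in common) to 1 (identical bit sets), so a cut-off of 0.7 keeps only database entries that share most of their features with the query.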
Data calculated: Tanimoto coefficient [T = N(AB)/(N(A) + N(B) − N(AB)), where N(AB) is the number of bits set in common by A and B, N(A) is the total number set by A, and N(B) is the total number set by B]. Results: It has been hypothesized that inhibition of γ-secretase, one of the enzymes responsible for Aβ production, may be a useful strategy for the treatment of AD. In this study molecular-modeling based design, synthesis, and evaluation of activity of substituted piperidines of type I, as γ-secretase inhibitors, has been performed. A pharmacophore has been derived based on type I and used in a ROCS search of a corporate database yielding type II. A set of γ-secretase inhibitors has been designed based on the scaffold of type II. Fig. 1 shows the schematic view of the developed pharmacophore overlaid on the compound of type I, shown as the low-energy conformer. Three of the designed analogs of type III showed moderate γ-secretase inhibitory activity (EC50 = 3.5-20.9 mM; EC50 of type I ≈ 0.1 mM). The data suggested that the compounds acted via γ-secretase inhibition. The measured γ-secretase inhibitory activity within this small library validated the usefulness of the ROCS search design strategy. (B. B.) Title: Inhibitory effects of 2-substituted-1-naphthol derivatives on cyclooxygenase I and II. Authors (6), 2167 -2175. Compounds: a) 13 α-naphthol compounds of type I carrying diverse substituents in the β position; b) 4 known COX inhibitors: flurbiprofen, naproxen, aspirin, SC-558 of type II. Biological material: a) Two isoforms of the cyclooxygenase enzyme (COX-1 and COX-2); b) Immortalized mouse PGHS-1 and PGHS-2 cells; c) Vero cells (ATCC CCL-81). Data taken from the literature: Crystal structure [atomic coordinates of COX-1 bound to flurbiprofen and COX-2 complexed with SC-558 were taken from the Brookhaven Protein Data Bank (PDB codes: 1eqh and 1cx2, respectively)].
Data determined: IC50 [concentration of the test substance (mM) required for 50% inhibition of the production of PGHS-1 and PGHS-2 using immortalized mouse PGHS-1 and PGHS-2 cells]; IC50 [concentration of the test substance (mg/mL) required for eliciting 50% cytotoxic effect measured using a colorimetric method]. Computational methods: Molecular modeling [initial structures of eight naphthol compounds of type I were generated by the molecular modeling software Sybyl v6.8.21, geometry optimization of the structures was performed using the semiempirical AM1 method, binding conformations of naphthol derivatives with COX-2 and COX-1 were studied employing AutoDock v3.0.5 using LGA in conjunction with an empirical force field to calculate ligand free energy of binding, the Kollman-all-atom charges were assigned to enzyme electrostatic contributions whereas Gasteiger-Hückel charges were assigned to all ligands, the docking simulations reproduced the co-crystals of flurbiprofen and SC-558 bound to COX-1 and COX-2 enzymes with an RMSD value of 0.7 Å in both cases (references: 1eqh and 1cx2, respectively), the AutoDock method and the parameter set employed were extended to search the enzyme binding conformations for other inhibitors]; AutoDock (automated docking program using a Metropolis Monte Carlo algorithm of simulated annealing for positional and conformational searching in combination with a rapid energy evaluation through precalculated grids of molecular affinity potentials allowing the inclusion of van der Waals, electrostatic, and hydrogen bonding interactions); GA (Genetic Algorithm is a stochastic optimization method that mimics the process of evolution by manipulating a collection of data structures called chromosomes); LGA (Lamarckian Genetic Algorithm). Data calculated: rmsd [root mean square deviation (Å) of the position of the corresponding atoms of two superimposed molecular structures].
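The rmsd defined above can be computed directly once two structures share an atom ordering and have already been superimposed. The sketch below assumes exactly that: it performs no fitting or alignment, and the function name and coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered lists of (x, y, z) points.

    Assumes the structures are already superimposed; no alignment is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical coordinates in Angstrom: a docked pose shifted 1 A along z.
crystal_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
docked_pose = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(crystal_pose, docked_pose))
```

A redocking check of the kind described in the abstract compares such a value for the docked and crystallographic ligand poses against a tolerance chosen by the modeller.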
Results: Structure-activity relationships of the inhibitory effects of 2-substituted-1-naphthol derivatives against COX-1 and COX-2 were investigated using receptor docking methods. Five compounds showed preferential inhibition of COX-2 over COX-1, while three compounds lacked inhibitory effect on either the COX-1 or COX-2 isozyme. Results of the docking experiments with these naphthols indicated that the presence of a hydroxyl group at the C-1 position of the naphthalene nucleus enhanced the anti-inflammatory activity towards COX-2 via hydrogen bonding to the COX-2 Val 523 side chain. When this hydroxyl group was replaced by a methoxy group, there was no inhibition. Introduction of C-2' dimethyl substituents onto the propyl chain also increased the inhibitory activity. All active compounds possess the C-1 hydroxyl group aligned so as to form a hydrogen bond with Val 523. Fig. 1 shows the orientation of the docked compound of type I substituted with a (CH2)3OH group in the β position. The hydrogen bonding interaction with Val 523 is indicated by the arrow. The results provided a model for the binding of the naphthol derivatives to COX-2 which should be very useful to design more potent COX-2 and selective COX inhibitors.
(3D structures of the compounds were generated and subsequently refined by energy minimization using the Tripos force field and Gasteiger-Hückel charges with distance dependent dielectric and the conjugate gradient method as implemented in SYBYL 6.9); VolSurf [computer program that automatically converts 3D molecular fields into simpler molecular descriptors from numerous tesserae containing the same information, VolSurf builds a unique framework (a volume and/or a surface) related to specific molecular properties, each 3D molecular field map is made of a regular lattice of boxes called voxels which represents attractive and repulsive forces between an interacting partner and a molecule, each voxel is defined by a volume, a surface, and by an interaction energy value, by contouring the voxels at different energy levels, different images can be obtained]; PCA [Principal Component Analysis yielding principal components (PCs)]; PLS (Partial Least Squares projections to latent structures analysis); LOO (Leave-One-Out cross-validation). Data calculated: SDEP (Standard Deviation Error in Prediction); SDEC (Standard Deviation of Error of Calculations); q² (cross-validated correlation coefficient). Chemical descriptors: Descriptors (72 descriptors were calculated using VolSurf v3.0 employing the OH2, O, and DRY probes). Results: VolSurf analysis was applied to a set of 70 carbapenem compounds acting as antibacterials using S. aureus and E. coli representing Gram positive and Gram negative bacteria, respectively. PCA of the data matrix of 70 carbapenem analogs and 72 VolSurf descriptors yielded 5 PCs (PC, explained variance, cumulated value): PC1, 45.62, 45.62; PC2, 15.09, 60.72; PC3, 9.62, 70.33; PC4, 7.44, 77.79; PC5, 6.64, 84.41. The PC score plots showed clustering of compounds according to activity. The PC loading plots explained the VolSurf descriptors responsible for the separation and behaviour of the compounds.
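The cumulated values reported for the five PCs follow from a running sum of the per-component explained variance. The sketch below recomputes them from the five percentages given in the abstract; the function names are hypothetical. The recomputed totals for PC2 and PC4 differ from the reported 60.72 and 77.79 in the last digit, presumably because the published per-component values are themselves rounded.

```python
def cumulative_variance(percentages):
    """Running total of explained variance, rounded to two decimals."""
    running, totals = 0.0, []
    for value in percentages:
        running += value
        totals.append(round(running, 2))
    return totals

def components_needed(percentages, threshold):
    """Smallest number of leading PCs whose cumulative variance reaches the threshold."""
    for count, total in enumerate(cumulative_variance(percentages), start=1):
        if total >= threshold:
            return count
    return None  # threshold not reached with the components given

# Explained variance (%) of PC1 to PC5 as reported in the abstract.
explained = [45.62, 15.09, 9.62, 7.44, 6.64]
print(cumulative_variance(explained))
print(components_needed(explained, 70.0))
```

The same bookkeeping underlies statements such as "the first k PCs explain 95% of the variance": components are added in order of decreasing variance until the chosen threshold is met.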
PLS analysis yielded two significant models (species, q², r², Nopt, SDEP, SDEC, % of variance): S. aureus, 0.684, 0.883, 7, 0.276, 0.168, 79.13; E. coli, 0.514, 0.756, 6, 0.271, 0.192, 81.22. Both the PCA and PLS models were validated by an external test set of 15 carbapenem analogs. All the compounds of the test set were predicted fairly well, showing residual values of less than one log unit. Fig. 1 shows the PLS coefficient plot for the correlation of VolSurf descriptors for S. aureus. The MIC activity data of S. aureus (Gram positive) were better explained than those of E. coli (Gram negative) by the PLS models. Fig. 2 shows the plot of experimental versus calculated activities of S. aureus obtained by PLS analysis, where the triangles show the predictions of the training set and stars show the data for the test set compounds. It was concluded that the VolSurf approach is highly efficient in predicting the biological activities and pharmacokinetic behaviour of these carbapenem antibiotics. (B. B.) Fig. 2 University of Medicine and Dentistry of New Jersey, Robert Wood Johnson Medical School (UMDNJ-RWJMS), 675 Hoes Lane. Title: Evaluation and application of multiple scoring functions for a virtual screening experiment. E-mail: li.xing@pfizer.com; Fax: 1-636-247-7607. DX-9065a (factor Xa ligand) Fig.
1 similarity indices calculated at the intersections of a three-dimensional lattice, the five physicochemical properties for CoMSIA (steric, electrostatic, hydrophobic, and hydrogen bond donor and acceptor) were evaluated using a common probe atom with 1 Å radius, +1.0 charge, and hydrophobicity and hydrogen bond property values of +1, the value of the attenuation factor α was 0.3 for the Gaussian-type distance dependence, the X-ray crystal structures of three inhibitors bound to pcDHFR were used for defining the alignment rule, scaled MNDO ESP-fit partial charges were calculated with MOPAC v6.0 using atomic coordinates obtained by energy minimizing the aligned molecules with the MMFF94S force field and MAXIMIN2 routine in Sybyl, the regression analyses were done using PLS algorithms in SYBYL v6.81]; PLS (Partial Least Squares projections to latent structures analysis); CV [Leave-One-Out (LOO), Leave-Ten-Out (LTO), and Leave-Several-Out (LSO) cross-validation]. Data calculated: rmsd [root mean square deviation (Å) of the position of the corresponding atoms of two superimposed molecular structures]; MAE (Mean Absolute Error); sPRESS (standard deviation of cross-validated predictions); q² (cross-validated correlation coefficient). Results: 3D-QSAR CoMSIA modelling has been applied to a set of 406 structurally diverse pcDHFR and rlDHFR inhibitors. A QSAR model containing 6 components was developed for pcDHFR employing LTO cross-validation (n = 240, q² = 0.65), and a 4-component model was calculated for rlDHFR (n = 237, q² = 0.63), both including steric, electrostatic and hydrophobic contributions (DHFR, r², s, F, q², sPRESS, Nopt, MAEtest, steric, electrostatic, hydrophobic fields): pcDHFR, 0.80, 0.52, 157.5, 0.65, 0.69, 6, 0.63, 0.18, 0.43, 0.39; rlDHFR, 0.63, 186.9, 0.63, 0.77, 4, 0.75, 0.19, 0.37, 4, 0.19, 0.37, 0.44. Fig.
1 shows the plot of the measured versus predicted pIC50 values for pcDHFR, where training and test set compounds are shown using filled and empty circles, respectively; CoMSIA contour maps of the contributions for the significant molecular fields were used to identify important ligand-receptor interactions in 3D. Classification models were also developed predicting selectivity for pcDHFR over rlDHFR using SIMCA methodology, with a selectivity ratio of 2 (IC50 rlDHFR/IC50 pcDHFR) for delimiting classes. A 5-component model including steric and electrostatic molecular field contributions displayed cross-validated and test set classification rates of 0.67 and 0.68 for selective inhibitors, and 0.85 and 0.72 for unselective inhibitors. It was concluded that the predictive power of the CoMSIA and SIMCA classification models, together with the structural insights derived from them, might aid in the design of novel inhibitors used against P. carinii infections in immunocompromised states. (B. B.)
Title: Identification of a new scaffold for opioid receptor antagonism based on the 2-amino-1,1-dimethyl-7-hydroxytetralin pharmacophore. Authors Compounds: 48 Compounds of type I, where R1, R2 are diverse substituents, R3 = Me, Bn; X = −O−, −S−, −NH−, −N(Me)−, or missing; Y = −C(=O)O−, −CH2O−, −C(=O)NH−. Biological material: Sigma-2 receptors, which are present in the CNS as well as in various peripheral tissues and are involved in several physiological effects. Data taken from the literature: [all molecular modeling calculations were performed on a Silicon Graphics O2 R10000 workstation, 3D structures of the ligands were generated using fragment libraries and/or the builder module of the InsightII 2000 package, energy minimizations were carried out using the conjugate gradient method with the AMBER force field parameterized in vacuum and the Discover module of InsightII 2000, compounds were generated in the non-protonated form, the conformational search was performed using a simulated annealing procedure, followed by cluster analysis yielding subsets of conformers for a specified molecule based on a defined rmsd value, the combination of FILO and GA techniques provided sets of chromosomes formed by the combination of conformers (one for each molecule of the dataset) for which the R² value was maximized, PCA and the PLS algorithms were used as implemented in the Almond program (47)]. Results: A GRIND-derived pharmacophore model has been developed for a set of α-tropanyl derivative ligands of type I of the sigma-2 receptor using GRIND descriptors. Statistically significant PLS models for sigma-2 affinity (r² = 0.83, q² = 0.63) and for S, the sigma-1/sigma-2 selectivity (r² = 0.72, q² = 0.46), were derived using a training set of α-tropanyl derivatives. The models provide pictures of the virtual receptor site (VRS), giving a qualitative pharmacophoric representation of the sigma receptor.
The analysis performed using the GRIND descriptors has confirmed the presence of two hydrophobic areas and an H-bond donor moiety in the binding site of the sigma receptor, interacting with the lipophilic groups and the electron-rich center of the molecules. They modeled the internal geometrical relationships within two hydrophobic areas (hydrophobic-1 and -2) and an H-bond donor receptor region with which ligands establish non-covalent bonds. Fig. 1 shows the proposed geometrical relationship and maps of the main interaction areas for the σ-receptor. The obtained PLS model predicts the sigma-2 activity of the α-tropanyl derivatives involved in the study, while results from the selectivity analysis highlight the distance within the two hydrophobic areas as the major sensitive element for sigma selectivity. [program for finding potential docking sites on proteins of known structure by starting with the solvent accessible surface, and filling cavities with overlapping spheres to make binding pockets, ligands of known structure (e.g., found by searching a database) are then automatically docked into this site]. Data calculated: Shape Signature (the shape signature is a histogram representation of the ray segment lengths obtained from the ray tracing within the SAV of a molecule, a shape signature which contains only the length information is termed a 1D signature, the 2D-MEP signatures encode both segment length and electrostatic potential information associated with the point of incidence inside the SAV). Results: Enrichment of ligands has been performed for the serotonin receptor using the Shape Signatures approach, a new 3-dimensional molecular comparison method adapted here to rank ligands of the serotonin receptors. The approach was exemplified using a variety of test databases including a mixture of agonists and antagonists together with approximately 10 000 randomly chosen compounds from the NCI database.
Both 1D and 2D Shape Signature databases were compiled for the enrichment studies, and key parameters for searching and matching the molecules were determined. It was found that the 1D Shape Signature approach is highly efficient in separating agonists from a mixture of molecules which includes compounds randomly selected from the NCI database taken as inactives. The method was equally effective at separating agonists and antagonists from a pool of active ligands for the serotonin receptor. The influence of conformational variation of the shape signature on enrichment was studied by docking a subset of ligands into the crystal structure of serotonin N-acetyltransferase (code: 1IB1). Enrichment studies using the resulting "docked" conformations yielded only slightly improved results compared with the CORINA-generated conformations. Fig. 1 shows the comparison between CORINA-generated and docked conformations, where black circles and gray squares denote 2D and 1D scores. Parallel enrichment studies were carried out using 2D shape signatures showing high selectivity with more restricted coverage due to the high specificity of 2D {Spearman's rank correlation coefficient rs = 1 − [6 · Σi di² / n(n² − 1)], where di is the difference between two ranks at the point i and n is the total number of points}; rmsd [root mean square deviation (Å) of the position of the corresponding atoms of two superimposed molecular structures]. Results: In order to identify novel chemical classes of factor Xa inhibitors, five scoring functions were employed to evaluate the docking poses generated by FlexX. The compound collection was composed of 549 confirmed potent factor Xa inhibitors and a subset of the LeadQuest screening compound library. All scoring functions except PMF successfully reproduced the crystal complex (PDB code: 1FAX). This was unexpected since PMF was parametrized on crystal complexes. Fig.
1 shows the representative poses generated by FlexX in reference to co-crystallized DX-9065a. After docking and scoring by different methods, FlexX exhibited the highest hit rate enrichment in the entire screening process, followed by D-SCORE and ChemScore. Hit rate enrichments by G-SCORE and PMF were comparatively moderate. A hit rate of 80% was achieved by FlexX at an energy cutoff of −40 kJ/mol, which is about 40-fold over random screening (2.06%). The study suggested that presenting more poses of a single molecule to the scoring functions could deteriorate their enrichment factors. A series of potential factor Xa inhibitors was identified from LeadQuest with a potential capability of replacing the benzamidine moiety, yielding compounds with improved pharmacokinetic properties. Several promising scaffolds with favorable binding scores were identified from LeadQuest. Consensus scoring by pair-wise intersection failed to improve on the hit rate yielded by single scorings (i.e., FlexX). It was cautioned that reported successes of consensus scoring in hit rate enrichment could be artificial because their comparisons were based on a selected subset of single scoring and a reduced subset of double or triple scoring. The findings obtained in this study were based upon a single biological system. (B. B.)
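The "about 40-fold" enrichment quoted above is the ratio of the hit rate in the scored selection to the hit rate expected from random screening. A minimal sketch of that arithmetic follows, using the two percentages from the abstract; the function names are hypothetical, and the compound counts behind the percentages are not given in the abstract.

```python
def hit_rate(n_hits, n_selected):
    """Fraction of selected compounds that are confirmed actives."""
    if n_selected <= 0:
        raise ValueError("n_selected must be positive")
    return n_hits / n_selected

def enrichment_factor(selected_rate, random_rate):
    """Ratio of the hit rate after scoring to the random-screening hit rate."""
    if random_rate <= 0:
        raise ValueError("random_rate must be positive")
    return selected_rate / random_rate

# Hit rates from the abstract: 80% after FlexX scoring, 2.06% for random screening.
factor = enrichment_factor(0.80, 0.0206)
print(round(factor, 1))
```

With these inputs the ratio comes out just under 39, consistent with the rounded "about 40-fold" in the text.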