key: cord-0965277-eqtbvvvd
authors: De, Priyanka; Bhayye, Sagar; Kumar, Vinay; Roy, Kunal
title: In silico modeling for quick prediction of inhibitory activity against 3CL(pro) enzyme in SARS CoV diseases
date: 2020-09-21
journal: J Biomol Struct Dyn
DOI: 10.1080/07391102.2020.1821779
sha: 114b69b4265660209798d6c09fa45044ca8c5ae9
doc_id: 965277
cord_uid: eqtbvvvd

As of 2 September 2020, the 2019 novel coronavirus, SARS-CoV-2, has been responsible for more than 25,602,665 infections and 852,768 deaths worldwide. There is an urgent need for new drug discovery to tackle the situation. The severe acute respiratory syndrome-associated coronavirus 3C-like protease (3CLpro) is a potential target for anti-SARS agents as it plays a vital role in the viral life cycle. This study aims at developing a quantitative structure–activity relationship (QSAR) model for a group of 3CLpro inhibitors to study the structural requirements for their inhibitory activity. Further, molecular docking studies were carried out, which helped to justify the QSAR findings. Moreover, a molecular dynamics simulation study was performed for selected compounds to check the stability of the interactions suggested by the docking analysis. The QSAR model was further used for the prediction and screening of large databases within a short time. Communicated by Ramaswamy H. Sarma

Since late fall 2019, there has been an outbreak of the novel acute respiratory disease known as coronavirus disease 2019 (COVID-19), which has spread rapidly around the globe (Del Rio & Malani, 2020). The causative virus has been officially designated severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), and the outbreak has been declared a pandemic by the World Health Organization (WHO) (https://www.who.int/news-room/detail/27-04-2020-who-timeline-covid-19). SARS-CoV-2 has had a far greater impact in terms of infections, deaths and economic challenges than SARS-CoV in 2002 (Lee et al., 2003; Peiris et al., 2003). Although SARS-CoV-2 (mortality rate 3%) is less pathogenic than SARS-CoV (mortality rate = 9.5%-11%), the transmission rate of the former is rather high (>2% in case of SARS-CoV-2) (Dömling & Gao, 2020). Both SARS-CoV and SARS-CoV-2 share a similar structure, having a single-stranded, enveloped, positive-sense RNA genome that infects the host cell for transmission (Fung & Liu, 2019). They have a sequence similarity of about 76% to 78% for the whole protein and around 73% to 76% for the receptor binding domain (RBD) (Wan et al., 2020). Also, the SARS-CoV main protease (or 3C-like protease, 3CLpro) has 96.1% similarity with the 2019-nCoV main protease. The sequence of the main protease (3CLpro) of SARS-CoV-2 differs from that of SARS-CoV in only 12 out of 306 residues, and thus it can be used as a homologous target for drug screening and repurposing (Chen et al., 2020; Zhavoronkov et al., 2020). The present work targets the C30 endopeptidase, commonly known as 3C-like proteinase, coronavirus 3C-like protease (3CLpro) or coronavirus main protease (Mpro), which cleaves the polyproteins into individual polypeptides essential for viral replication and transcription (Goetz et al., 2007; Thiel et al., 2003).
3CLpro is a homodimeric cysteine protease and is predicted to cleave the viral polyproteins at 11 sites required for replication and transcription (Fan et al., 2004; Goetz et al., 2007).

Computational approaches are effective tools for finding new drug targets and repurposing existing drugs. Molecular modeling approaches such as quantitative structure-activity relationship (QSAR) analysis (Gramatica, 2020; Roy, 2018) are effective for predicting the activity of compounds when data and proper experimental facilities are lacking. The method allows virtual screening of drug libraries to find suitable drug candidates for a particular disease. A large number of candidate molecules in the drug discovery pipeline fail at the later stages of drug development. This makes computational approaches indispensable for the early prediction of pharmacokinetic and pharmacodynamic end points, thus supporting the screening process and reducing the cost and time of high-end experiments (Toropova, 2017).

In the present work, we have developed a 2D-QSAR model to determine the chemical features contributing to inhibition of SARS-CoV 3CLpro. Since, as discussed earlier, the 3CLpro enzymes of SARS-CoV and the novel SARS-CoV-2 have about 96% structural similarity, it is reasonable to expect that compounds inhibiting SARS-CoV 3CLpro may also inhibit the SARS-CoV-2 protein. We have taken a dataset of 104 compounds from the literature, as cited in the Materials and Methods section, and determined the physicochemical features essential for their inhibitory activity (pIC50). Further, we have carried out molecular docking and molecular dynamics (MD) simulation studies to understand the molecular interactions between the small molecules and the protein. We have also screened large databases and predicted the likelihood and characteristics of inhibition shown by the database molecules.

The experimental IC50 values of 104 SARS coronavirus 3CL protease inhibitors were taken from previously published literature (Chen et al., 2005; Liu et al., 2014; Lu et al., 2006; Niu et al., 2008; Park et al., 2012; Tsai et al., 2006) and used for 2D-QSAR studies to recognize the basic structural features of those molecules that are essential for inhibition of the SARS coronavirus main protease (3CLpro). The experimental IC50 values were converted into negative logarithmic form (pIC50), and the converted values were used for QSAR modelling. The structures were prepared in MarvinSketch software (version 14.10.27) (http://www.chemaxon.com/) with proper aromatization and hydrogen bond addition, and were then used for descriptor calculation.
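The conversion to pIC50 can be sketched as follows. This is a minimal illustration, not the authors' script: the text does not state the concentration unit of the source IC50 values, so the default unit (micromolar) and the function name `pic50_from_ic50` are assumptions.

```python
import math

# Conversion factors to mol/L. Which unit the source IC50 data used is an
# assumption here; micromolar is taken as the default for illustration only.
TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9}

def pic50_from_ic50(ic50_value, unit="uM"):
    """Return pIC50 = -log10(IC50 expressed in mol/L)."""
    if ic50_value <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_value * TO_MOLAR[unit])

# Hypothetical IC50 values (not taken from the dataset).
for ic50 in (0.95, 10.0, 125.0):
    print(f"IC50 = {ic50} uM -> pIC50 = {pic50_from_ic50(ic50):.3f}")
```

A lower IC50 gives a higher pIC50, so more potent inhibitors have larger response values in the model.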

Molecular descriptors are numerical values that describe the structure or shape of molecules, helping to predict their activity and properties without complex experiments. They encode structural information derived from the structural representation. In the present study, QSAR models were developed using selected classes of two-dimensional (2D) molecular descriptors. These comprise E-state indices, connectivity, constitutional, functional, 2D atom pairs, ring, atom-centered fragments and molecular property descriptors calculated with the OCHEM platform (https://ochem.eu/home/show.do), and extended topochemical atom (ETA) indices (Roy & Ghosh, 2010) calculated with the PaDEL-Descriptor software (Yap, 2011). Constant descriptors (variance < 0.0001), intercorrelated descriptors (|r| > 0.95) and other uninformative data were removed before model development using an in-house software tool available at http://dtclab.webs.com/software-tools. The final dataset comprised 562 descriptors before data division and model development.
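The pretreatment step can be illustrated with a short sketch that applies the two thresholds quoted above. It is not the in-house tool itself: the use of the population variance and the rule for which member of a correlated pair is kept (here, the first one encountered) are assumptions, and `prefilter` is a hypothetical name.

```python
from statistics import pvariance

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def prefilter(descriptors, var_cut=0.0001, corr_cut=0.95):
    """descriptors: dict mapping descriptor name -> list of values per compound."""
    # Step 1: drop (near-)constant descriptors.
    kept = {k: v for k, v in descriptors.items() if pvariance(v) >= var_cut}
    # Step 2: greedily drop a descriptor if |r| with an already selected one
    # exceeds the cutoff (keeping the first of each pair is an assumption).
    selected = []
    for name, values in kept.items():
        if all(abs(pearson_r(values, kept[s])) <= corr_cut for s in selected):
            selected.append(name)
    return selected

# Toy data with hypothetical descriptor names.
toy = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8.1], "c": [5, 5, 5, 5], "d": [4, 1, 3, 2]}
print(prefilter(toy))
```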

Division of the dataset into training and test sets is one of the most important steps in QSAR modeling for generating a well-validated model (Roy et al., 2008). The division should ensure that the points representing both the training and the test set are well distributed within the whole descriptor space occupied by the entire dataset. In the present model, we have used the Modified k-Medoids (version 1.3) (http://teqip.jdvu.ac.in/QSAR_Tools/DTCLab) method of dataset division, where 75% of the dataset compounds were placed in the training set and the remaining 25% in the test set. The k-medoids algorithm is a local heuristic method that, when updating the medoids, runs just like k-means clustering (where centroids are taken into consideration). The method is designed to select the k most central objects as initial medoids. The process classifies a set of objects into clusters so that the objects within a cluster are similar to each other but dissimilar to objects in other clusters (Park & Jun, 2009). After rearranging the whole dataset according to cluster number with the corresponding activity values, the 75:25 ratio of training and test sets is obtained for further model development and validation.
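The clustering-based division can be sketched as below. This is a simplified, hedged illustration in the spirit of the simple k-medoids scheme of Park & Jun (2009), not the Modified k-Medoids tool itself. The Euclidean distance, the function names, and in particular the rule of sending every fourth compound (sorted by cluster and activity) to the test set are assumptions made to obtain a 75:25 split.

```python
def euclid(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def assign(dist, medoids):
    # Label each object with the index of its nearest medoid.
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]

def k_medoids(points, k, max_iter=100):
    n = len(points)
    dist = [[euclid(p, q) for q in points] for p in points]
    # Initial medoids: the k "most central" objects (smallest normalized
    # distance sums).
    row_sum = [sum(row) or 1.0 for row in dist]
    v = [sum(dist[i][j] / row_sum[i] for i in range(n)) for j in range(n)]
    medoids = sorted(range(n), key=lambda j: v[j])[:k]
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster is empty
                updated.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to its cluster.
            updated.append(min(members, key=lambda m: sum(dist[m][i] for i in members)))
        if updated == medoids:
            break
        medoids = updated
    return assign(dist, medoids), medoids

def split_by_cluster(labels, activity, test_every=4):
    # Sort by (cluster, activity); every fourth compound goes to the test set.
    order = sorted(range(len(labels)), key=lambda i: (labels[i], activity[i]))
    train = [i for r, i in enumerate(order) if r % test_every != test_every - 1]
    test = [i for r, i in enumerate(order) if r % test_every == test_every - 1]
    return train, test

# Hypothetical two-descriptor data and activities, for illustration only.
pts = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
act = [5.1, 4.2, 6.3, 5.5, 4.8, 6.9, 5.0, 4.4]
labels, medoids = k_medoids(pts, k=2)
print(split_by_cluster(labels, act))
```

Sampling within clusters rather than at random is what keeps both sets spread over the descriptor space.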

Variable selection is a crucial step in QSAR model development. It ensures the extraction of the most important and influential molecular, physical or chemical features, and the generation of a model with good statistical significance in both internal and external validation metrics. In the present case, we first employed the Genetic Algorithm (GA) (Devillers, 1996) method in the Double Cross Validation (DCV) platform to generate a reduced pool of 29 descriptors. We then employed the Best Subset Selection (BSS) method to generate a series of Multiple Linear Regression (MLR) models. Finally, the final model was generated by Partial Least Squares (PLS) regression (Wold et al., 2001) using the descriptors selected by BSS.

Validation of a QSAR model is essential to understand its predictive ability. The developed models were critically evaluated using internationally accepted internal and external validation parameters to examine their robustness in terms of fitness, stability and predictivity. Statistical parameters such as the determination coefficient (R²), explained variance (R²a), variance ratio (F) and standard error of estimate (s) were calculated. Internal predictivity parameters such as the predicted residual sum of squares (PRESS) and the leave-one-out cross-validated correlation coefficient (Q²(LOO)) were also calculated, along with external predictivity parameters such as R²pred or Q²(F1), Q²(F2) and the concordance correlation coefficient (CCC). Further, we have calculated the r²m metrics (i.e. r²m and Δr²m) for both training and test set compounds (Ojha et al., 2011). Validation using mean absolute error (MAE) based criteria was also carried out for both external and internal validation. This was done because the Q²ext based criteria do not always reflect the correct prediction quality, owing to the influence of the response range and of the distribution of the response values in both the training and test set compounds.
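For concreteness, the external validation metrics named above can be computed as in the following sketch. The formulas are the commonly used definitions of Q²(F1), Q²(F2), CCC and MAE, stated here as an assumption since the text does not reproduce them; the function name and the example values are hypothetical.

```python
def external_metrics(y_obs, y_pred, y_train_mean):
    """External validation metrics for test-set observations and predictions."""
    n = len(y_obs)
    mean_obs = sum(y_obs) / n
    mean_pred = sum(y_pred) / n
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_train = sum((o - y_train_mean) ** 2 for o in y_obs)  # about training mean
    ss_test = sum((o - mean_obs) ** 2 for o in y_obs)       # about test mean
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(y_obs, y_pred))
    return {
        "Q2F1": 1 - press / ss_train,
        "Q2F2": 1 - press / ss_test,
        "CCC": 2 * cov / (ss_test + ss_pred + n * (mean_obs - mean_pred) ** 2),
        "MAE": sum(abs(o - p) for o, p in zip(y_obs, y_pred)) / n,
    }

# Hypothetical test-set pIC50 values and predictions.
print(external_metrics([5.0, 6.0, 7.0], [5.5, 6.0, 6.5], y_train_mean=5.5))
```

Q²(F1) and Q²(F2) differ only in the reference mean (training versus test), which is why they diverge when the two sets have different response distributions.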

According to OECD principle 3, any QSAR model should possess a defined applicability domain (AD). The AD is a chemical space defined by the structural information or molecular properties of the chemicals used for model development (Gadaleta et al., 2016). Only compounds lying within the region of chemical space defined by the internal set of the model can be properly predicted. In this work, we have used the distance to model X (DModX) approach at the 99% confidence level in SIMCA software (https://landing.umetrics.com/downloads-simca) to check whether the test set compounds are within the AD.

In the current analysis, we have carried out molecular docking studies to explore the interaction pattern of molecules (the most and least active compounds from the dataset) with the relevant enzyme (3C-like protease). The crystal structure of the enzyme was retrieved from the Protein Data Bank with the PDB ID 6LU7 (crystal structure of COVID-19 main protease in complex with an inhibitor N3) (Jin et al., 2020). The molecular docking study was performed using the Autodock tool 1.5.6 (http://autodock.scripps.edu/resources/adt) platform following the protocol described by Rizvi et al. (Rizvi et al., 2013; Kumar & Roy, 2020). Prior to docking, we prepared the target enzyme and the selected inhibitors using the protein and ligand preparation protocol available in Autodock tool 1.5.6. The active site of the enzyme was defined by providing explicit coordinates of the active amino acid residues, obtained from the co-crystallized ligand in the enzyme using the PDBsum web server (http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/pdbsum/GetPage.pl?pdbcode=2zu4&template=ligands.html&l=1.1). The size and exact position of the grid were adjusted by providing the coordinates in the 'Grid preparation' protocol available in Autodock tool 1.5.6. After receptor preparation, ligand preparation and binding site definition, molecular docking runs were launched from the command line using cmd. In the docking analysis, we sorted the generated poses by binding interaction energy, and the top scoring poses (most negative) were kept for further analysis. The obtained poses were validated using the bound ligand present in the crystal structure of the enzyme. On the basis of the number of interactions and the active residues interacting with the bound ligand, we selected the final pose for further study.
The ligplot (Figure 1) shows the number of interactions and the active residues responsible for the significant interactions between the COVID-19 main protease and its bound ligand in the crystal structure.

MD simulation of the protein-ligand complexes was performed in Gromacs software 2018.1 (Van Der Spoel et al., 2005). The protein topology was prepared using the CHARMM36 (March 2019) force field (Huang & MacKerell, 2013). The ligand topology was generated with the CHARMM General Force Field (CGenFF) server (Soteras Gutiérrez et al., 2016). A dodecahedron box was used to add explicit water molecules, keeping the protein-ligand complex at the center. The TIP3P water model was used (Mark & Nilsson, 2001). An appropriate number of sodium ions was added to neutralize the charge of the system. The system was then energy minimized using the steepest descent algorithm to optimize the hydrogen bond network. This was followed by equilibration in the NVT and NPT ensembles, in that order, for 100 ps to avoid distortion of the protein-ligand complex. The final production MD simulation of the protein complexes of the two most active compounds (57 and 66) and two least active compounds (16 and 27) was performed for 100 ns at a temperature of 300 K. In addition, protein complexes of another three most active and three least active compounds (56, 58, 67, 21, 23 and 25) were chosen for MD simulation of 20 ns. Periodic boundary conditions were applied (Makov & Payne, 1995). The Particle Mesh Ewald method was used for long-range electrostatic interactions (Petersen, 1995). The energy and coordinates of the system were recorded every 10 ps. Hydrogen bond interaction analysis between protein and ligand during MD was performed in the Visual Molecular Dynamics (VMD) tool with a distance cutoff of 3 Å and an angle cutoff of 20° (Humphrey et al., 1996). The binding free energy (ΔGbind) of the ligands during MD simulation was calculated by the MMPBSA method (Kumari et al., 2014).

The prime objective of the work was to develop a well-validated QSAR model using simple descriptors obtained from the PaDEL-Descriptor and OCHEM platforms, and to use it for the prediction of an external set of compounds when adequate experimental data are not easily available.

The six-descriptor PLS model (Model 1) developed for the dataset of 104 compounds was statistically significant and could explain the essential features of the compounds required for good inhibition of 3CL protease. Acceptable values of the determination coefficient (R² = 0.756) and the cross-validated determination coefficient (Q²(LOO) = 0.708) were obtained for the developed model. The predictivity of the model was analysed by the predictive R², or Q²(F1) (Q²(F1) = 0.752), which shows acceptable predictivity for the test set compounds. The values of the descriptors appearing in the model for both the training and test sets, together with the predicted pIC50 values, are given in the supporting information. The plot of observed versus predicted pIC50 is given in Figure 2. The descriptors, ranked from higher to lower contribution, are given in Figure 3 (Akarachantachote et al., 2014). The model consists of four 2D atom pair descriptors, one ETA descriptor and one functional group descriptor, as elaborated in Table 1. The regression coefficient plot (Wold et al., 2001) and the score plot (Jackson, 2005) are given in the supporting information (Figures S1 and S2, respectively).

The different descriptors and their contributions to the modelled response give information about the structural and physicochemical features of the dataset compounds that are useful for the inhibition of 3CLpro. The 2D atom pair descriptors F01[C-N], B05[C-N] and B06[N-N] help in understanding the structures of the compounds, indicating that a heteroaromatic ring containing a single nitrogen, such as pyridine or piperidine (e.g., compounds 58, 59 and 76), is more beneficial than a nucleus containing multiple heteroatoms, such as pyridazine, pyrimidine, thiazole and pyrazole (e.g., compounds 2 and 5). Further, the descriptor B04[O-Cl] provides information on hydrogen bonding, which is discussed later in the Molecular Docking Analysis section. The presence of this fragment is advantageous, as seen in compounds 56 and 58. Unsaturation in these inhibitors is beneficial, as expressed by the ETA_dBeta descriptor and observed in compounds 58, 59 and 60. The presence of a secondary amide, as depicted by nRCONHR, is detrimental for good inhibition (e.g., compounds 23, 24 and 25). Figures 4 and 5 show the features increasing or decreasing the inhibitory activity of the compounds towards the 3CLpro enzyme.

A loading plot gives the relationship between the X-variables (descriptors) and the Y-variable (pIC50) (De et al., 2018) (Figure 6). The plot was developed using the first and second PLS components. In evaluating the plot, the distance from the origin is taken into consideration: descriptors situated far from the plot origin are considered to have a greater impact on the Y-response. The descriptors B04[O-Cl] and nRCONHR are furthest from the plot origin and can thus be considered to have a higher impact, which is further confirmed by the VIP plot and their VIP scores (VIP > 1).

The AD 'represents a chemical space from which a model is derived and where a prediction is considered to be reliable' (Gadaleta et al., 2016). AD evaluation was done using DModX (distance to model in the X-space) in SIMCA 16.0.2 software (https://landing.umetrics.com/downloads-simca). The AD plots are given in Figures 7 and 8 for the training and test sets, respectively. There is no outlier in the training set, and none of the test set compounds is outside the AD at the 99% confidence level (D-crit = 0.009999).

Model randomization is used to confirm the significance of the model. The randomization plot is developed in order to verify that the model is not the result of a chance correlation (Topliss & Edwards, 1979). Development of a randomized model involves generation of multiple models by shuffling different combinations of the X or Y variables (here the Y variable only), based on the fit of the reordered model. In the current method we have used 100 permutations, although the number of permutations can be changed according to the user's choice. A model that is not the result of chance correlation should have poor statistics for its randomized models (the R²y intercept should not exceed 0.3 and the Q²y intercept should not exceed 0.05). The plot of the correlation between the original Y-vector and the permuted Y-vector versus cumulative R²y and cumulative Q²y is given in Figure 9. This shows that the model developed (Equation (1)) is nonrandom and robust (since R²y intercept = 0.0156 and Q²y intercept = −0.497) and is appropriate for the prediction of pIC50 of 3CLpro inhibitors within the AD of the model.

Note on ETA_dBeta (Table 1): a measure of relative unsaturation content. It can be expressed using the following equation: $\Delta\beta = \sum \beta_{ns} - \sum \beta_{s}$, where $\beta_{ns}$ represents the VEM non-sigma contribution of a non-hydrogen vertex and $\beta_{s}$ represents the VEM sigma contribution of a non-hydrogen vertex (Roy, 2015).
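The idea behind the permutation test can be sketched as follows. This is an illustration under stated assumptions rather than the SIMCA procedure: a single-descriptor least-squares model (for which R² equals the squared Pearson correlation) is swapped in for the PLS model, only the R² intercept is computed (the Q² intercept would additionally need cross-validation), and the intercept is taken from an ordinary least-squares line through all points of the permutation plot. All data are synthetic.

```python
import random

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def r2_single_descriptor(x, y):
    # For a one-descriptor least-squares fit, R^2 is the squared correlation.
    return pearson_r(x, y) ** 2

def y_randomization_intercept(x, y, n_perm=100, seed=0):
    rng = random.Random(seed)
    corr = [1.0]                         # original model: corr(y, y) = 1
    r2 = [r2_single_descriptor(x, y)]
    for _ in range(n_perm):
        y_perm = list(y)
        rng.shuffle(y_perm)              # shuffle the response only
        corr.append(abs(pearson_r(y, y_perm)))
        r2.append(r2_single_descriptor(x, y_perm))
    # Least-squares line r2 = intercept + slope * corr; report the intercept.
    n = len(corr)
    mc, mr = sum(corr) / n, sum(r2) / n
    slope = (sum((c - mc) * (r - mr) for c, r in zip(corr, r2))
             / sum((c - mc) ** 2 for c in corr))
    return mr - slope * mc

# Synthetic descriptor and response with a genuine linear relationship.
x = [float(i) for i in range(30)]
y = [2.0 * v + ((i * 7) % 5 - 2) * 0.5 for i, v in enumerate(x)]
print(y_randomization_intercept(x, y))
```

A low intercept means that models built on scrambled responses fit poorly, which is the behaviour expected of a model that is not a chance correlation.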

In the current exploration, we have performed the molecular docking studies using the most and least active compounds from the dataset. One of the most active compounds from the dataset, compound 56 (supporting information Figure S3 The next most active compound from dataset, compound 66 ( Figure 168 through interacting forces like hydrogen bonding (conventional and carbon hydrogen bonds), π-bonding (π-alkyl, π-sulfur, π-donor hydrogen bond, amide π-stacked, π-sigma, π-anion, π-π T-shaped), halogen (fluorine) and alkyl hydrophobic bonding.

One of the least active compounds from the dataset, compound 16 (Figure 11), interacts with amino acid residues such as CYS A: 145 through hydrogen bonding and π-sulfur interactions and ALA A: 191 through hydrophobic alkyl bonds. Figure S6 in the supporting information shows that compound 21, another least active compound from the dataset, interacts with the active amino acid residues of the enzyme such as THR

Correlation of docking analysis results with the developed 2D-QSAR model

From the above investigations, we have concluded that the formation of hydrogen bonds (conventional, carbon and π-donor hydrogen bonds) and π-interactions (π-π T-shaped, π-π stacked, π-alkyl, π-cation, π-sigma and π-sulfur) between the ligand and the target enzyme may play an essential role in the interactions. Hydrogen bonding (conventional, carbon and π-donor hydrogen bonds) may be associated with descriptors such as B04[O-Cl] and B05[C-N] of the developed 2D-QSAR model. The descriptors ETA_dBeta and B04[O-Cl] are well corroborated by the π-interactions (π-π T-shaped, π-π stacked, π-alkyl, π-cation, π-sigma and π-sulfur) between the receptor and the ligand. All these descriptors contributed positively in the developed model and represent essential features for inhibitory activity against the 3CLpro enzyme. These features are observed in the most active compounds from the dataset, such as 56, 57, 58, 66 and 67. In contrast, the descriptor nRCONHR contributed negatively in the 2D-QSAR model and thus might be detrimental for the inhibitory activity; this has been observed in the least active compounds 23, 25, 27, 16 and 21. Thus, we can conclude that the features obtained from the molecular docking studies corroborate well the features obtained from the 2D-QSAR model, and that these are crucial for inhibitory activity against the 3CLpro enzyme.

After completion of the MD simulation, root mean square deviations (RMSD) of the protein Cα atoms and of the ligand were calculated to study the stability of the protein-ligand complexes for compounds 16, 27, 57 and 66, as depicted in Figure 12 (and for compounds 21, 23, 25, 56, 58 and 67, as shown in Figure S9 of the supporting information). In all systems, the Cα atoms show a stable RMSD at around 1 nm, while the ligands show RMSD values below 0.4 nm during their respective MD simulation times. All ligands move from their initial position and try to fit best into the binding site of the protein, as shown in Figure 13 and supporting information Figure S10. During the 100 ns MD run, Compound 27 is observed to detach from the binding site. However, after 80 ns it binds to the protein again, near the edge of the ligand-binding site. Because of this, Compound 27 shows a high RMSD and a low average binding affinity (ΔGbind = −50.87 kJ/mol). Similarly, Compound 57 detaches from the binding site at 24 ns and binds to a completely different pocket of the protein located near the original ligand-binding site. This caused a decrease in the average ΔGbind (−54.80 kJ/mol). Compounds 23, 27 and 57 show more fluctuation in RMSD than the other compounds and are not able to settle into a cavity as the simulation progresses. Compounds 16 and 66 remain bound to the binding site throughout the 100 ns simulation and show high average ΔGbind values of −72.52 and −77.37 kJ/mol, respectively. Compound 56 flips and reorients itself in the binding site cavity, stabilizing in a position completely different from the initial one (supporting information Figure S10). The root mean square fluctuation (RMSF) was calculated to study the change in the position of protein atoms during MD simulation, as depicted in Figure 14 and supporting information Figure S11.
The loop regions SER 1 - PRO 9, LEU 50 - ASN 53, ASP 153 - ASP 155, PRO 168 - THR 169, ALA 191 - ILE 200, ASP 216 - PHE 223 and GLY 302 - GLN 306 show high RMSF values. The RMSF of these loop residues is highest when the protein is complexed with Compounds 66, 25 and 27. Hydrogen bond analysis between ligands and protein suggests (Figure 15 and supporting information Figure S12; Table 3 and supporting information) that CYS 145, GLU 166, GLN 189 and GLN 192 are the amino acid residues most commonly involved in H-bonding interactions with the ligands. The binding free energy (ΔGbind) was calculated from the contributing energy components, namely the van der Waals (vdW), electrostatic, polar solvation and solvent accessible surface area (SASA) energies (Table 4 and supporting information Table S2). In the case of Compound 66 complexed with the protein, the least negative contribution of the polar solvation energy contributes to an increase in the average ΔGbind. Compound 23 shows the highest affinity towards the protein, with an average ΔGbind of −88.14 kJ/mol, followed by compound 21 with an average ΔGbind of −82.24 kJ/mol. In both cases, the vdW and electrostatic energies contributed more than for the other ligands, resulting in a high binding affinity towards the protein. The per-residue energy contribution during MD simulation is depicted in Figure 16 and supporting information Figure S13.
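For orientation, the way the listed energy components combine can be written out as below. This follows the decomposition commonly used in MM-PBSA tools such as g_mmpbsa (Kumari et al., 2014); the symbols are notation introduced here, and the omission of an entropic term is an assumption based on how that tool is usually applied.

```latex
\Delta G_{\mathrm{bind}} \approx \Delta E_{\mathrm{vdW}} + \Delta E_{\mathrm{elec}}
  + \Delta G_{\mathrm{polar}} + \Delta G_{\mathrm{SASA}},
\qquad
\Delta X = X_{\mathrm{complex}} - \left( X_{\mathrm{protein}} + X_{\mathrm{ligand}} \right)
```

A more negative sum indicates more favourable binding, so strongly negative vdW and electrostatic terms increase the affinity, while the polar solvation term typically opposes it.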

In silico virtual screening and computer-aided drug design methodologies allow an initial screening of large databases based on molecular properties and/or substructures, thereby saving both the time and the money involved in synthesising and analysing each of the molecules in a database. This, in turn, reduces the number of molecules to be synthesized and analyzed by identifying only the hit compounds. In the present work, we have used three databases of 8722 antivirals, 11,309 peptidomimetics and 6968 proteases obtained from Asinex (http://www.asinex.com/) to predict their pIC50 values using our developed model (Model 1). Furthermore, the domain of applicability and the reliability of the predictions were analyzed using the Prediction Reliability Indicator tool. According to the prediction scores obtained from the Prediction Reliability Indicator tool, many compounds showed 'Good' to 'Moderate' prediction quality. The trend of the composite score and the corresponding prediction quality goes like:

The SARS-CoV 3C-like protease (3CLpro or Mpro) is an attractive target for the development of anti-SARS drugs because of its critical role in viral replication and transcription. Owing to the high structural closeness between the enzymes of the old strain, SARS-CoV, and the novel SARS-CoV-2, compounds inhibiting the former enzyme could be expected to show similar interactions with the latter. The present study aims at developing a 2D-QSAR model for a series of compounds acting as 3CLpro inhibitors and at studying the structural features of those molecules that control their 3CLpro inhibition (pIC50). The basic features found to favour better inhibition were: (i) presence of heteroaromatic rings containing a single nitrogen; (ii) unsaturation; and (iii) hydrogen bonding. These findings were further corroborated by the docking analysis. Further, we have made predictions for three large databases and reported the top 25 compounds from each database, which can be subjected to experimental testing. Thus, it can be inferred that in silico methods such as QSAR provide a basic understanding of the physicochemical features of small molecules required for interaction with a specific target, and also help in the prediction of a large database in a very short period, thus reducing high experimentation costs.

No potential conflict of interest was reported by the authors.

Asinex. 

Cutoff threshold of variable importance in projection for variable selection

Synthesis and evaluation of isatin derivatives as effective SARS coronavirus 3CL protease inhibitors

Prediction of the SARS-CoV-2 (2019-nCoV) 3C-like protease (3CL pro) structure: Virtual screening reveals velpatasvir, ledipasvir, and other drug repurposing candidates

Chemometric modeling of larvicidal activity of plant derived compounds against zika virus vector Aedes aegypti: Application of ETA indices

COVID-19-new insights on a rapidly changing epidemic

Genetic algorithms in molecular modeling

Chemistry and Biology of SARS-CoV-2. Chem

Biosynthesis, purification, and substrate specificity of severe acute respiratory syndrome coronavirus 3C-like proteinase

Human coronavirus: Host-pathogen interaction

Applicability domain for QSAR models: Where theory meets reality

Substrate specificity profiling and identification of a new class of inhibitor for the major protease of the SARS coronavirus

Principles of QSAR Modeling: Comments and Suggestions from Personal Experience

CHARMM36 all-atom additive protein force field: Validation based on comparison to NMR data

VMD: Visual molecular dynamics

A user's guide to principal components

Structure of M pro from SARS-CoV-2 and discovery of its inhibitors

Development of a simple, interpretable and easily transferable QSAR model for quick screening antiviral databases in search of novel 3C-like protease (3CLpro) enzyme inhibitors against SARS-CoV diseases

g_mmpbsa-a GROMACS tool for high-throughput MM-PBSA calculations

A major outbreak of severe acute respiratory syndrome in Hong Kong

Synthesis, modification and docking studies of 5-sulfonyl isatin derivatives as SARS-CoV 3C-like protease inhibitors

Structure-based drug design and structural biology study of novel nonpeptide inhibitors of severe acute respiratory syndrome coronavirus main protease

Periodic boundary conditions in ab initio calculations

Structure and dynamics of the TIP3P, SPC, and SPC/E water models at 298 K

Molecular docking identifies the binding of 3-chloropyridine moieties specifically to the S1 pocket of SARS-CoV Mpro

Further exploring rm2 metrics for validation of QSPR models. Chemometrics and Intelligent Laboratory Systems

A simple and fast algorithm for K-medoids clustering

Tanshinones as selective and slow-binding inhibitors for SARS-CoV cysteine proteases

Coronavirus as a possible cause of severe acute respiratory syndrome

Accuracy and efficiency of the particle mesh Ewald method

A simple click by click protocol to perform docking: AutoDock 4.2 made easy for non-bioinformaticians

Quantitative structure-activity relationships in drug design, predictive toxicology, and risk assessment

Quantitative structure-activity relationships (QSARs): A few validation methods and software tools developed at the DTC laboratory

The "double cross-validation" software tool for MLR QSAR model development

How precise are our quantitative Structure-Activity Relationship Derived Predictions for New Query Chemicals?

Be aware of error measures. Further studies on validation of predictive QSAR models

Exploring QSARs with extended topochemical atom (ETA) indices for modeling chemical and drug toxicity

On various metrics used for validation of predictive QSAR models with applications in virtual screening and focused library design

Exploring the impact of size of training sets for the development of predictive QSAR models. Chemometrics and Intelligent Laboratory Systems

Parametrization of halogen bonds in the CHARMM general force field: Improved treatment of ligand-protein interactions

Mechanisms and enzymes involved in SARS coronavirus genome expression

Chance factors in studies of quantitative structure-activity relationships

Drug metabolism as an object of computational analysis by the Monte Carlo method

Discovery of a novel family of SARS-CoV protease inhibitors by virtual screening and 3D-QSAR studies

GROMACS: Fast, flexible, and free

Receptor recognition by the novel coronavirus from Wuhan: An analysis based on decade-long structural studies of SARS coronavirus

PLS-regression: A basic tool of chemometrics

PaDEL-descriptor: An open source software to calculate molecular descriptors and fingerprints

Potential non-covalent SARS-CoV-2 3C-like protease inhibitors designed using generative deep learning approaches and reviewed by human medicinal chemist in virtual reality