key: cord-0895226-figkxnnk
authors: Płonka, Wojciech; Paneth, Agata; Paneth, Piotr
title: Docking and QSAR of Aminothioureas at the SARS-CoV-2 S-Protein–Human ACE2 Receptor Interface
date: 2020-10-12
journal: Molecules
DOI: 10.3390/molecules25204645
sha: c21e050a7215d1123c3ebf6b442a308aef1dd3d3
doc_id: 895226
cord_uid: figkxnnk

Docking of over 160 aminothiourea derivatives at the SARS-CoV-2 S-protein–human ACE2 receptor interface, whose structure became available recently, has been evaluated for complex-stabilizing potency and subsequently subjected to quantitative structure–activity relationship (QSAR) analysis. The structural variety of the studied compounds, which include 3 different forms of the N–N–C(S)–N skeleton and combinations of 13 different substituents, together with the extensive length of the interface, resulted in the failure of the QSAR analysis, since different molecules were binding to different parts of the interface. Subsequently, absorption, distribution, metabolism, and excretion (ADME) analysis on all studied compounds, followed by a toxicity analysis using statistical models for selected compounds, was carried out to evaluate their potential use as lead compounds for drug design. Combined, these studies highlighted two molecules among the studied compounds, i.e., 5-(pyrrol-2-yl)-2-(2-methoxyphenylamino)-1,3,4-thiadiazole and 1-(cyclopentanoyl)-4-(3-iodophenyl)-thiosemicarbazide, as the best candidates for the development of future drugs.

With the outbreak of the devastating severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1] pandemic, which has already claimed the lives of over one million people worldwide [2], therapeutics are urgently needed. For future prevention, a few vaccine strategies are being tested. At the same time, medicines that can be used in fighting the infection are being sought. It is thus not surprising that vigorous research is being carried out worldwide. One of the targets of both types of studies is the interface between the virus S-protein and the ACE2 receptor, with a focus on complex stabilizers and protein-protein binding inhibitors. Recent studies in this area include experimental [3, 4] and theoretical [5] [6] [7] [8] structural analyses, as well as mutagenic studies [9]. Advances from all these studies have been summarized [10].

The family of coronaviruses is relatively well known due to the previous severe human epidemics of SARS [11] and Middle East respiratory syndrome (MERS) [12] . In the search for medicines instantly available, repurposing of already approved drugs is studied as the first line of drug

The main focus of the present study was on the binding of selected molecules to the SARS-CoV-2 S-protein-human ACE2 receptor interface, although binding to the individual S-protein and ACE2 receptor was also analyzed. Three classes of compounds containing an N-N-C(S)-N skeleton, in a linear or cyclic topology, used in our laboratory in recent years for their inhibitory activity against several enzymes, were studied. Their structures contain three main cores: a linear carbonylthiosemicarbazide or its two cyclic derivatives, 1,3,4-thiadiazole and 1,2,4-triazole, each decorated with two substituents. As the C-substituent, one of four five-membered rings was used, while the N-substituent was a benzene ring or its ortho, meta, or para mono-substituted derivatives. These components are listed in Table 1, where partial codes for the different moieties are also provided (in bold). Thus, for example, the compound code FSoOH corresponds to 5-(2-methylfuran-3-yl)-2-(2-hydroxyphenylamino)-1,3,4-thiadiazole, illustrated in Figure 1 (this turned out to be the best ligand from among the studied compounds, as shown below). In total, 166 of the above compounds were considered.

The docking results for all compounds are available in the Supplementary Material for the interface between the S-protein of the virus and the human ACE2 receptor (as well as for these two proteins separately). In Table 2, compounds with the best docking scores for both the interface and the individual proteins are listed. Note that the algorithms implemented in docking programs use different mathematical formulas. In the case of the Gold program (and some other programs as well), the more favorable the interactions, the higher the score. The values in bold indicate the best result for the interface (or each individual protein); the subsequent two values are written in italics.
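The score-direction convention noted above matters whenever results from different docking programs are ranked. The following minimal sketch, which is not part of the original workflow, selects the best-scoring ligand and the next two for either convention. The function name `rank_ligands`, the ligand labels, and all numerical values are illustrative placeholders and are not taken from Table 2.

```python
from typing import Dict, List, Tuple


def rank_ligands(
    scores: Dict[str, float],
    higher_is_better: bool = True,
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_n (ligand, score) pairs, best first.

    higher_is_better=True suits fitness-type scores (as described for Gold);
    higher_is_better=False suits energy-type scores, where more negative
    values indicate more favorable binding.
    """
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=higher_is_better)
    return ordered[:top_n]


# Placeholder values for illustration only (not data from this study).
fitness_scores = {"ligand_a": 61.0, "ligand_b": 58.5, "ligand_c": 63.2, "ligand_d": 55.1}
energy_scores = {"ligand_a": -7.1, "ligand_b": -6.4, "ligand_c": -7.8, "ligand_d": -5.9}

top_fitness = rank_ligands(fitness_scores, higher_is_better=True)
top_energy = rank_ligands(energy_scores, higher_is_better=False)
print(top_fitness)
print(top_energy)
```

The first entry of each list would correspond to the value shown in bold, and the following two to the values shown in italics.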

The QSAR results are shown in Figure 2. The best values of R^2 are on the order of 0.9; however, Q^2 values for the leave-one-out validation are 0.43 or lower. The analysis of residuals, presented in Figure 3, suggested that no particular molecule can be treated as an outlier. The Random Forest's "min_samples_leaf" parameter controls whether decisions of a given tree can be made on the basis of a single sample (min_samples_leaf = 1, the default) or of a specified number of samples. Increasing min_samples_leaf usually lowers the fit of the model for the training set but improves predictive capabilities. Due to the small number of samples available, we investigated Random Forests with min_samples_leaf = 1 and 2 only.
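The text does not spell out the formulas behind R^2 and Q^2. Assuming the standard definitions (an assumption, not a statement taken from the source), with y_i the docking score of compound i, \bar{y} the mean score, \hat{y}_i the prediction of the model trained on all compounds, and \hat{y}_{(-i)} the prediction for compound i by a model trained with that compound left out, the two metrics read:

```latex
R^2 = 1 - \frac{\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2},
\qquad
Q^2_{\mathrm{LOO}} = 1 - \frac{\sum_{i=1}^{N} \left( y_i - \hat{y}_{(-i)} \right)^2}{\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2}
```

Under these definitions, a large gap between the two values indicates a model that reproduces the training data well but generalizes poorly to unseen compounds.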

Table 1. Lists of structural fragments of the compounds used in the current study.

As can be seen from the docking results presented in Table 2, FSoOH yielded the best results for the interface as well as for the human ACE2 receptor. The position of its best binding pose is illustrated in Figure 4. This molecule binds strongly to the Leu29-His34 fragment of the terminal helix of the ACE2 receptor and, to a lesser extent, to the Pro49-Tyr495 fragment of the S-protein. Among the N-N-C(S)-N motifs, the thiadiazole ring is the moiety most frequently present in top binding compounds. In this group, substituents attached to the nitrogen atom are most frequently substituted in the ortho position by fluorine or a hydroxyl group. Furan is the substituent of choice on the carbon side. It is worth noting that the compound CCmI (together with ICpNO2 and CCpNO2) was shown not to be cytotoxic in our recent studies on anti-T. gondii activity (unpublished results). For this ligand, the molecular interactions in the binding groove are illustrated in Figure 5.

In an attempt to extract the knowledge required for the design and synthesis of new compounds similar to those in our set but with improved activity, we used Machine Learning and performed a QSAR analysis of the S-protein-ACE2 interface docking scores. We wanted to investigate the relationships between activity and general physicochemical properties of the compounds, general structural features, as well as features specific to our molecules. We employed techniques similar to those used in the past [41, 42], which proved successful. We chose the Random Forest Regressor as a modeling algorithm, due to its ability to handle non-linear relations and the possibility of learning from a small dataset with a large number of features.

It can be observed that Random Forests with one molecule per leaf described the training set better than those with two molecules per leaf. General physicochemical and topological descriptors (MOE All, MOE 15) worked better than fingerprints, while counts of fragments characteristic of the set performed poorly. This suggests that general shape-related and physicochemical properties of molecules are more important for activity than particular substituents present in our set of compounds. It should also be noted that there was almost no difference in the behavior of the models built with 354 MOE 2D descriptors and a limited set of 15, confirming the Random Forests' capability to handle feature selection issues automatically. The leave-one-out cross-validation R^2 of about 0.4 of the models, however, did not encourage their specific use for judging the potency of new compounds; evidently, for this purpose, more training data are needed.

This somewhat disappointing performance of the QSAR analysis may have several causes. Firstly, the number of studied molecules and their structural variety could have been too limited. However, since our previous experience with a similar technique using much smaller sets was quite positive [24, 25], we think the most important reason was the size of the interface. It spans nearly 45 Å, in contrast to the quite confined size of a typical active site of an enzyme. This has several consequences. One of them is the fact that different molecules can bind to different parts of the interface groove, steered by forces (van der Waals, hydrogen bonding, electrostatic interactions, etc.) in different combinations. This is illustrated in Figure 6, which presents an overlay of the best 10 binding poses for each of the studied molecules.

In the absence of clearly indicative QSAR results that would allow us to scrutinize compounds outside the studied set, we analyzed the ADMET properties of the compounds used in the docking and QSAR studies, in search of the best lead candidates within this set. The ADME analysis comprised the evaluation of physicochemical parameters, druglikeness, lipophilicity, pharmacokinetics, and leadlikeness. A typical collection of consensus parameters for a given compound is illustrated in the left panel of Figure 7 for CCmI, while all individual values are presented in the Supporting Information (Table S5). The right panel of Figure 7 illustrates a comparison of the results obtained for all compounds. As can be seen, with a few exceptions, all points lie in the white area of the graph, relating logP to polar surface area (TPSA), indicating good oral bioavailability.


Figure 6. Overlay of the 10 best binding poses of all studied compounds in the groove of the S-protein-ACE2 receptor interface (for color code, see Figure 4).

Figure 7. Results of the absorption, distribution, metabolism, and excretion (ADME) assessment. Left panel: consensus values for a single compound (CCmI). Right panel: BOILED-Egg graph [43] illustrating good gastrointestinal absorption (white area).


Since the ADME analysis also turned out not to be significantly discriminative for the studied compounds, we carried out toxicity studies using statistical models. These calculations were performed only for the compounds listed in Table 2. The results of the tests are given in the last three columns of this table. The last column reports values relative to those obtained for chloroquine, a medicine considered for repurposed use against coronavirus disease 2019 (COVID-19), which, with a ChemPLP score of 66.29, exhibits strong complex-stabilization properties. As can be seen, the toxicity results argue against considering the compounds PSoF and FCpF, while favoring PSoMe and CCmI in further drug development studies.

We used the published model [19] of the interface between the viral S-protein and the human ACE2 receptor that was constructed using SARS-CoV-2 S-protein (NCBI: YP_009724390.1) and human ACE2 receptor (PDB: 2AJF) as templates. The structure of this model has recently been confirmed experimentally [44]. A few docking algorithms were tested. Initially, following the literature data, the Vina [45] docking program was used. However, the scores obtained for the strongest inhibitors were very close and not discriminating. We therefore switched to the FlexX algorithm [46], as implemented in the LeadIT platform [47]. The scoring function of this program did a better job of differentiating the binding affinities of the studied compounds but showed some constraints, the most serious being the limit on the size of the acceptor, which precluded, in some cases, searches of the whole enzyme. SwissDock [48] also proved impractical in our hands, as only a single ligand per submission to the server was possible. We therefore finally decided to continue with the ChemPLP algorithm [49] as implemented in the Gold program [50]. It is also noteworthy that this algorithm has been evaluated as one of the best in the most recent benchmark studies [51].

For the preparation and visualization of proteins and ligands, apart from the tools embedded in the docking programs, Hyperchem [52], Gaussview [53], Chimera [54], and Mercury [55] were used.

Four sets of different molecular descriptors were used:

• All MOE [56] descriptors, as a measure of general properties (see Table S3)
• 15 MOE 2D descriptors selected by Particle Swarm Optimization using Fujitsu ADMEWORKS ModelBuilder software [57] ("ASA_H", "a_acc", "a_nH", "a_nN", "E_ele", "E_oop", "GCUT_PEOE_3", "GCUT_SLOGP_1", "opr_violation", "PEOE_VSA_PPOS", "rsynth", "SlogP_VSA6", "SMR_VSA1", "vsurf_CW5", "vsurf_W5"), which were found to correlate best with activity in a linear combination, as a low-count set of general properties
• Morgan fingerprints [58] with radius = 3 and bit length of 4096, as general structural features
• Counts of fragments frequently appearing in our compound set, obtained by the RECAP algorithm using Fujitsu ADMEWORKS ModelBuilder software, as features most specific to our set (see Table S4)

Random Forest Regressor was used as the modeling algorithm. The R^2 on the whole training set and the R^2 of the leave-one-out cross-validation were used as metrics for learning and prediction performance, respectively. Calculations were done using Python scripts in the Anaconda environment [59] and the Fujitsu ADMEWORKS ModelBuilder software. Models were built using the scikit-learn [60] implementation of Random Forest Regressor with default parameters, except min_samples_leaf, as described in the Results section, and a fixed random seed for reproducibility.
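The modeling protocol described above can be sketched with scikit-learn as follows. This is a minimal illustration and not the authors' script: the descriptor matrix and the docking scores are replaced by synthetic random data, and the helper name `evaluate_forest`, the seed value, and the data dimensions are assumptions made only for this example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import LeaveOneOut, cross_val_predict


def evaluate_forest(X, y, min_samples_leaf, seed=0):
    """Return (training R^2, leave-one-out R^2) for one Random Forest setting."""
    # Fit on the whole set to measure how well the model learns the data.
    model = RandomForestRegressor(min_samples_leaf=min_samples_leaf, random_state=seed)
    model.fit(X, y)
    r2_train = r2_score(y, model.predict(X))

    # Leave-one-out: each sample is predicted by a model that never saw it.
    loo_model = RandomForestRegressor(min_samples_leaf=min_samples_leaf, random_state=seed)
    loo_pred = cross_val_predict(loo_model, X, y, cv=LeaveOneOut())
    r2_loo = r2_score(y, loo_pred)
    return r2_train, r2_loo


# Synthetic stand-in for a descriptor table (40 "compounds", 15 "descriptors").
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 15))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=40)

results = {leaf: evaluate_forest(X, y, leaf) for leaf in (1, 2)}
for leaf, (r2_train, r2_loo) in results.items():
    print(f"min_samples_leaf={leaf}: R^2(train)={r2_train:.2f}, R^2(LOO)={r2_loo:.2f}")
```

Fixing `random_state` makes repeated runs give identical forests, which mirrors the fixed random seed mentioned in the text.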

The SwissADME program [61] implemented online [62] was used for the assessment of the ADME properties of all the studied compounds. The BOILED-Egg graph, which uses logP calculated by the Wildman and Crippen method [63] and TPSA [64] , was generated at the same site.

Toxicology studies were carried out using the online implementation [65] of the PreADMET program [66]. From among the available statistical models, Ames TA100_10RLI/TA100_NA/TA1535_10RLI/TA1535_NA tests on Salmonella typhimurium bacterial strain [67] and mouse [68] carcinogenicity, and Daphnia fish [69] toxicity were carried out.

The presented docking studies identified 5-(2-methylfuran-3-yl)-2-(2-hydroxyphenylamino)-1,3,4-thiadiazole as the molecule with the best docking score among the studied set of nearly 170 compounds containing the -N-N-C(S)-N- motif. The resulting scores, when subjected to QSAR analysis, did not, however, yield a model useful for a rational search of other related molecules. The most probable reason for this is the length of the interface binding groove, resulting in different binding forces depending on the exact site of the best interactions between proteins and ligands.

As a by-product of our calculations, a further caveat about studies of this type can be raised. Attempts are being made to use massive docking analyses of large ligand libraries in rational drug design (see, for example, reference 11 for studies on the S-protein-ACE2 receptor interface). Such studies are heavily automated and yield thousands of poses that can hardly be reviewed manually, as was done in this contribution. However, as depicted in Figure 8 for the compound PSmMe, a ligand can bind at several neighboring sites. In fact, in this case, binding inside the ACE2 receptor (two poses on the left side) exhibited much higher scores (57.22 and 59.22) than binding at the S-protein-ACE2 receptor interface. This example illustrates possible pitfalls of automatic docking and argues for a rectangular definition of the docking space (available, for example, in Vina) rather than the more popular docking within a given radius from a molecule, residue, or point.
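A rectangular docking space of the kind advocated above can be derived from the coordinates of atoms lining the groove. The following is a minimal sketch under stated assumptions: the helper names `box_from_coords` and `vina_box_lines`, the 5 Å padding, and the three coordinates are hypothetical and are not taken from this study; only the `center_*` and `size_*` keys are those used in Vina configuration files.

```python
def box_from_coords(coords, padding=5.0):
    """Axis-aligned box enclosing coords, grown by padding (angstroms) per side."""
    xs, ys, zs = zip(*coords)
    lo = (min(xs), min(ys), min(zs))
    hi = (max(xs), max(ys), max(zs))
    center = tuple((a + b) / 2.0 for a, b in zip(lo, hi))
    size = tuple((b - a) + 2.0 * padding for a, b in zip(lo, hi))
    return center, size


def vina_box_lines(center, size):
    """Format the box as center_*/size_* lines of a Vina configuration file."""
    lines = [f"center_{ax} = {c:.3f}" for ax, c in zip("xyz", center)]
    lines += [f"size_{ax} = {s:.3f}" for ax, s in zip("xyz", size)]
    return "\n".join(lines)


# Hypothetical coordinates of atoms lining the interface groove (angstroms).
groove_atoms = [(-40.0, 25.0, 5.0), (-20.0, 35.0, 10.0), (-30.0, 30.0, 0.0)]
center, size = box_from_coords(groove_atoms, padding=5.0)
config_text = vina_box_lines(center, size)
print(config_text)
```

Because the box edges can differ along each axis, an elongated groove can be covered without also enclosing adjacent cavities, which a sphere of sufficient radius would inevitably include.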

Figure 8. Overlay of the 10 best binding poses of the PSmMe compound in the groove of the S-protein-ACE2 receptor interface (for color code see Figure 4).

Overall, our combined docking, QSAR, and ADMET studies led us to the conclusion that two compounds among the studied aminothioureas, with high, although not the highest, docking scores, are suitable for a further search for a drug against COVID-19 due to their predicted low toxicity.

Three Emerging Coronaviruses in Two Decades

Coronavirus COVID-19 Global Cases by the

Structural basis for the recognition of SARS-CoV-2 by full-length human ACE2

Structural and Functional Basis of SARS-CoV-2 Entry by Using Human ACE2

Dynamics of the ACE2-SARS-CoV-2/SARS-CoV spike protein interface reveal unique mechanisms

Protein structure analysis of the interactions between SARS-CoV-2 spike protein and the human ACE2 receptor: From conformational changes to novel neutralizing antibodies

Spike protein recognition of mammalian ACE2 predicts the host range and an optimized ACE2 for SARS-CoV-2 infection

Enhanced receptor binding of SARS-CoV-2 through networks of hydrogen-bonding and hydrophobic interactions

Engineering human ACE2 to optimize binding to the spike protein of SARS coronavirus 2

SARS-CoV-2 pandemic and research gaps: Understanding SARS-CoV-2 interaction with the ACE2 receptor and implications for therapy

A Novel Coronavirus Associated with Severe Acute Respiratory Syndrome

Middle East Respiratory Syndrome coronavirus (MERS CoV): Update

A screen of the NIH Clinical Collection small molecule library identifies potential anti-coronavirus drugs

Binding repurposed drugs and aminothioureas derivatives to SARS-CoV-2 enzymes-A docking perspective. Sci. Rep. 2020, unpublished work

Rapid repurposing of drugs for COVID-19

A rational roadmap for SARS-CoV-2/COVID-19 pharmacotherapeutic research and development

Remdesivir, lopinavir, emetine, and homoharringtonine inhibit SARS-CoV-2 replication in vitro

Potential Therapeutic Agents for COVID-19 Based on the Analysis of Protease and RNA Polymerase Docking. Preprints 2020, unpublished work

Repurposing therapeutics for COVID-19: Supercomputer-based docking to the SARS CoV-2 viral spike protein and viral spike protein-human ACE2 interface. ChemRxiv 2020, unpublished work

An In Silico Database of Approved Drugs, Regulated Chemicals, and Herbal Isolates for Computer-Aided Drug Discovery

Critical Differences between the Binding Features of the Spike Proteins of SARS-CoV-2 and SARS-CoV

Discovery of Potent and Selective Halogen-Substituted Imidazole-Thiosemicarbazides for Inhibition of Toxoplasma gondii Growth In Vitro via Structure-Based Design

Systematic Identification of Thiosemicarbazides for Inhibition of Toxoplasma gondii Growth In Vitro

Triazole-Based Compound as a Candidate To Develop Novel Medicines To Treat Toxoplasmosis

Paneth, P. 1,4-Disubstituted Thiosemicarbazide Derivatives are Potent Inhibitors of Toxoplasma gondii Proliferation. Molecules

Searching for novel scaffold of triazole non-nucleoside inhibitors of HIV-1 reverse transcriptase

Dual antibacterial and anticancer activity of 4-benzoyl-1-dichlorobenzoylthiosemicarbazide derivatives

Synthesis and antibacterial activity of 1,4-dibenzoylthiosemicarbazide derivatives

Biological evaluation and molecular modelling study of thiosemicarbazide derivatives as bacterial type IIA topoisomerases inhibitors

Design, synthesis and biological evaluation of 4-benzoyl-1-dichlorobenzoylthiosemicarbazides as potent Gram-positive antibacterial agents

Search for factors affecting antibacterial activity and toxicity of 1,2,4-triazole-ciprofloxacin hybrids

Structure-activity Relationship Studies of Microbiologically Active Thiosemicarbazides Derived from Hydroxybenzoic Acid Hydrazides

Search for human DNA topoisomerase II poisons in the group of 2,5-disubstituted-1,3,4-thiadiazoles

Cytotoxicity and topoisomerase I/II inhibition activity of novel 4-aryl/alkyl-1-(piperidin-4-yl)-carbonylthiosemicarbazides and 4-benzoylthiosemicarbazides

Cytotoxic effect and molecular docking of 4-ethoxycarbonylmethyl-1-(piperidin-4-ylcarbonyl)-thiosemicarbazide-A novel topoisomerase II inhibitor

Molecular mechanism of action and safety of 5-(3-chlorophenyl)-4-hexyl-2,4-dihydro-3H-1,2,4-triazole-3-thione-A novel anticonvulsant drug candidate

Studies on the anticonvulsant activity of 4-alkyl-1,2,4-triazole-3-thiones and their effect on GABAergic system

Studies on the Anticonvulsant Activity and Influence on GABA-ergic Neurotransmission of 1,2,4-Triazole-3-thione-Based Compounds

Preliminary Pharmacological Screening of Some Thiosemicarbazide, s-triazole, and Thiadiazole Derivatives

Pharmacological and Structure-Activity Relationship Evaluation of 4-aryl-1-Diphenylacetyl(thio)semicarbazides

What do docking and QSAR tell us about the design of HIV-1 reverse transcriptase nonnucleoside inhibitors?

Assessment of Nonnucleoside Inhibitors Binding to HIV-1 Reverse Transcriptase Using HYDE Scoring

A BOILED-Egg to Predict Gastrointestinal Absorption and Brain Penetration of Small Molecules

Crystal structure of SARS-CoV-2 spike receptor-binding domain bound with ACE2. Protein Data Bank 2020

AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading

Evaluation of the FLEXX incremental construction algorithm for protein-ligand docking

LeadIT 2.3.2 Program

SwissDock, a protein-small molecule docking web service based on EADock DSS

Flexible docking using tabu search and an empirical estimate of binding affinity

Development and validation of a genetic algorithm for flexible docking

Comparative Assessment of Scoring Functions on an Updated Benchmark: 2. Evaluation Methods and General Results

Version 6.1

UCSF Chimera-A visualization system for exploratory research and analysis

0: From visualization to analysis, design and prediction

Medicinal Chemistry and the Molecular Operating Environment (MOE): Application of QSAR and Molecular Docking to Drug Discovery

Anaconda Software Distribution. Computer Software. Version 2020.02. Available online

Scikit-learn: Machine Learning in Python

SwissADME: A free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules

Prediction of Physicochemical Parameters by Atomic Contributions

Fast Calculation of Molecular Polar Surface Area as a Sum of Fragment-Based Contributions and Its Application to the Prediction of Drug Transport Properties

The PreADME Approach: Web-based program for rapid prediction of physico-chemical, drug absorption and drug-like properties

Designing Drugs and Crop Protectants: Processes, Problems and Solutions

Models and Methods for in Vitro Toxicology

Prediction of binding potential of natural leads against the prioritized drug targets of chikungunya and dengue viruses by computational screening

Comparative Assessment of the Sensitivity of Fish Early-Life Stage, Daphnia, and Algae Tests to the Chronic Ecotoxicity of Xenobiotics: Perspectives for Alternatives to Animal Testing

This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution