key: cord-0003191-q69f57el
authors: Farhadi, Tayebeh; Hashemian, Seyed MohammadReza
title: Computer-aided design of amino acid-based therapeutics: a review
date: 2018-05-14
journal: Drug Des Devel Ther
DOI: 10.2147/dddt.s159767
sha: 17c135abc3c6a212ba22ae3d3750e4a396933022
doc_id: 3191
cord_uid: q69f57el

During the last two decades, the pharmaceutical industry has progressed from detecting small molecules to designing biologic-based therapeutics. Amino acid-based drugs are a group of biologic-based therapeutics that can effectively combat diseases caused by drug resistance or molecular deficiency. Computational techniques play a key role in the design and development of amino acid-based therapeutics such as proteins, peptides and peptidomimetics. This review discusses the various elements of the computational design of amino acid-based therapeutics. Protein design seeks to identify the properties of amino acid sequences that fold to predetermined structures with desirable structural and functional characteristics. Peptide drugs occupy a middle space between proteins and small molecules, and it is hoped that they can target “undruggable” intracellular protein–protein interactions. Peptidomimetics, compounds that mimic the biologic characteristics of peptides, present refined pharmacokinetic properties compared to the original peptides. Here, the techniques developed to characterize amino acid sequences consistent with a specific structure, and thereby enable protein design, are discussed. Moreover, the key principles and recent advances in recently introduced computational techniques for rational peptide design are highlighted. The most advanced computational techniques developed to design novel peptidomimetics are also summarized.

Different diseases may be caused by pathogens or malfunctioning organs, and the use of therapeutic agents to treat them has a long recorded history.
Small molecules are conventional therapeutic candidates that can be easily synthesized and administered. However, many of these small molecules are not specific to their targets and may lead to side effects. 1 Moreover, a number of diseases are caused by a deficiency in a specific protein or enzyme. Thus, they can be treated using biologically based therapies that are able to recognize a specific target within crowded cells. 2 Under biologic conditions, some macromolecules such as proteins and peptides are optimized to recognize specific targets. 3 Therefore, they can overcome the shortcomings of small molecules. 3 Recently, pharmaceutical scientists have shown interest in engineering amino acid-based therapeutics such as proteins, peptides and peptidomimetics. [4] [5] [6]

Theoretical and experimental techniques can predict the structure and folding of amino acid sequences and provide insight into how structure and function are encoded in the sequence. Such predictions may be valuable for interpreting genomic information and many life processes. Moreover, engineering novel proteins or redesigning existing proteins has opened the way to novel biologic macromolecules with desirable therapeutic functions. 7

Protein sequences comprise tens to thousands of amino acids. In addition, the backbone and side chain degrees of freedom lead to a large number of configurations for a single amino acid sequence. Protein design techniques aim for minimal frustration through precise identification of sequences and their characteristics. [8] [9] [10] [11] According to energy landscape theory, natural proteins are minimally frustrated when their native state is sufficiently low in energy. 7 The de novo design of a sequence is difficult because the number of possible sequences is huge: 20^N for an N-residue protein built from only the 20 natural amino acids. 12

Peptide design should incorporate computational approaches.
It can benefit from drawing on the more advanced methods used for small-molecule and protein design. 13 However, the straightforward adoption of computational approaches employed for small-molecule and protein design has not been accepted as a reasonable solution to the peptide design problem. [14] [15] [16] In peptide drug design, the conformational space accessible to peptides challenges small-molecule computational approaches. In addition, the need for nonstandard amino acids and various cyclization chemistries challenges the available tools for protein modeling. 13 Furthermore, the aggregation of peptide drugs during production or storage can be an unavoidable problem in the peptide design procedure. Rational design of a peptide ligand is also challenging because of the elusive affinity and intrinsic flexibility of peptides. 17 Peptide-focused in silico methods have been increasingly developed to make testable predictions and refine design hypotheses. Consequently, peptide-focused approaches narrow the chemical space of theoretical peptides to more tractable, focused "drug-like" spaces and reduce the problems associated with aggregation and flexibility. 13, 18 For the discussions that follow, peptides are defined as relatively small (2-30 residues) polymers of amino acids. 18

Under physiological conditions, several problems, such as degradation by specific or nonspecific peptidases, may limit the clinical application of natural peptides. 19 Moreover, the promiscuity of peptides for their receptors arises from a high degree of conformational flexibility and can cause undesirable side effects. 20 In addition, some properties of therapeutic peptides, such as high molecular mass and low chemical stability, can result in a weak pharmacokinetic profile. Therefore, peptidomimetic design can be a valuable way to circumvent some of the undesirable properties of therapeutic peptides.
21, 22 In the biologic environment, peptidomimetics can mimic the biologic activity of parent peptides while improving both pharmacokinetic and pharmacodynamic properties, including bioavailability, selectivity, efficacy and stability. A wide range of peptidomimetics have been introduced, such as those isolated as natural products, 23 synthesized from novel scaffolds, 24 designed based on X-ray crystallographic data 25 and predicted to mimic the biologic behavior of natural peptides. 26 Using hierarchical strategies, it is possible to convert a peptide into mimetic derivatives with fewer of the undesirable properties of the original peptide. 27 Over the past 10 years, computational methods have been developed to discover peptidomimetics. 28 Part of this review summarizes novel computational methods introduced for peptidomimetic design.

Peptidomimetics can be categorized as follows: peptide backbone mimetics (Type 1), functional mimetics (Type 2) and topographical mimetics (Type 3). 29 The first generation of peptidomimetics (Type 1) mimics the local topography of the amide bond. It includes amide bond isosteres, 30 pyrrolinones 31 and short fragments of secondary structure, such as beta-turns. 32 Such mimetics generally match the peptide backbone atom-for-atom and comprise chemical groups that also mimic the functionality of the natural side chains of amino acids. A number of successful examples of Type 1 peptidomimetics have been reported. 33 The second type of peptidomimetics is described as functional mimetics or Type 2 mimetics, which include small, non-peptide compounds that are able to recognize the biologic targets of their parent peptide. 34 At first, they were assumed to be conservative structural analogs of parent peptides. However, using site-directed mutagenesis, their binding sites on biologic targets were investigated.
The results indicated that Type 2 peptidomimetics routinely bind to protein sites that are different from those selected by the original peptide. 35 Therefore, Type 2 mimetics retain the ability to interfere with the peptide-protein interaction process without needing to mimic the structure of the natural peptide. 28 Type 3 peptidomimetics represent the ideal concept of peptidomimetics. They consist of the necessary chemical groups that act as topographical mimetics and contain novel chemical scaffolds that are unrelated to natural peptides. 36

Here, theoretical and computational techniques to design proteins, peptides and peptidomimetics are reviewed. However, the current review does not examine the computational details of amino acid-based therapeutic design in depth; it only discusses the methods used to design these therapeutics. Figure 1 summarizes the key concepts presented in this study. As examples, the structures of Aldesleukin, Leuprolide and Spaglumic acid, important amino acid-based therapeutics approved by the US Food and Drug Administration (FDA), are shown in Figure 2. The structures of Aldesleukin (Figure 2A) and Leuprolide (PDB ID: 1YY2; Figure 2B) were obtained from the Protein Data Bank (PDB; http://www.rcsb.org/) and visualized with the PyMOL tool. The structure of Spaglumic acid was retrieved (in MOL format) from the PubChem database (https://pubchem.ncbi.nlm.nih.gov/) with the PubChem ID 188803 (Figure 2C) and visualized using PyMOL. Aldesleukin, a lymphokine, is a recombinant protein used to treat adults with metastatic renal cell carcinoma (https://www.drugbank.ca/drugs/DB00041). Leuprolide, a synthetic nine-residue peptide analog of gonadotropin releasing hormone, is used to treat advanced prostate cancer (https://www.drugbank.ca/drugs/DB00007). Spaglumic acid is used in allergic conditions such as allergic conjunctivitis. The drug belongs to a class of peptidomimetics known as hybrid peptides.
Hybrid peptides contain at least two dissimilar types of amino acids (alpha, beta, gamma or delta) linked to each other via a peptide bond (https://www.drugbank.ca/drugs/DB08835).

In the current study, all FDA-approved therapeutics (in 2018) were retrieved from DrugBank (https://www.drugbank.ca/biotech_drugs) and an analysis was conducted to compare their percentages. Protein-based therapies, gene or nucleic acid-based therapies, vaccines, allergenics and cell transplant therapies made up 8.05%, 0.17%, 2.64%, 16.20% and 0.14% of total approved therapeutics, respectively. Small-molecule drugs made up 72.76% of the approved therapeutics (Figure 3).

Computational design of proteins can be classified as follows: 1) template-based design, in which the three-dimensional (3D) structure of a predefined template is adapted to design a sequence, and 2) de novo design, in which the arrangement of amino acids is changed to generate both the sequence and the 3D structure of a completely novel protein. 3

The problem of predicting the fold of an unknown sequence can be avoided by utilizing templates. Since the fold is unaltered, the backbone atoms are placed directly on this framework. 3 Moreover, to generate a functional protein, side chains that can effectively stabilize the structure are added to the backbone. 37, 38 Routine concerns and methods for template-based protein design are reviewed below.

Selecting the template (scaffold) protein

The template (also called the scaffold protein) contains a set of backbone atom coordinates. The coordinates can be retrieved from an available X-ray crystal structure or, with caution, from a nuclear magnetic resonance (NMR) structure. 39 Fixing the backbone reduces the computational complexity, but it may prevent the main-chain adjustments needed to accommodate sequence alterations.
7 Backbone flexibility can yield designed functionalities beyond the protein's normal function. Backbone flexibility is introduced by incorporating other closely related conformations alongside an existing structure. [40] [41] [42] Recently, new functionalities were effectively introduced into the TIM-barrel topology. 43 This fold is one of the most widely shared structures and has been detected in 21 distinct protein superfamilies. 44

Sequence search and characterization

In a design procedure, a protein sequence is selected such that it meets the energetic and geometric constraints established by the chosen fold. Sequence search techniques sample different sequences and estimate their energies to find the one with the minimum energy. 3 To identify sequences subject to an objective function or a specific energy, diverse strategies including optimization and probabilistic approaches have been developed. 45 Optimization processes may identify candidate sequences using stochastic or deterministic methods. 45 Probabilistic approaches focus on characterizing the sequence space probabilistically.

Deterministic methods: To obtain a sequence that folds into a global minimum energy conformation, deterministic methods search the whole sequence space and identify the global optima. 3, 7 These methods include dead-end elimination (DEE), 46 self-consistent mean field, 47 graph decomposition and linear programming. 48 Stochastic algorithms search the sequence space in an exploratory manner. 3 These algorithms include Monte Carlo algorithms (simulated annealing), 49 graph search methods 50 and genetic algorithms. 51 Some of the most commonly used methods are discussed below.

DEE is considered a rigorous search algorithm. To find and remove sequence-rotamer states that cannot be part of the global minimum energy conformation, DEE compares two amino acid rotamers and removes the one with the greater interaction energy.
52 Interaction energies are computed for each rotamer of the test amino acid, along with all rotamers of every other amino acid. 3 The elimination criterion is applied repeatedly over all amino acid states and their rotamers until it no longer holds true for any of them. 52, 53 Increasing the sequence length increases the combinatorial complexity of DEE exponentially. Therefore, the application of DEE may be restricted when designing sequences of 30 amino acids or more. 54 Details of the theorem are explained elsewhere. 3, 7

Stochastic search algorithms: As mentioned before, deterministic approaches are well suited to designing small proteins but become impractical as the sequence length grows. Stochastic or heuristic methods are valuable for designing large proteins. 3 The most widely used method for protein design is Monte Carlo sampling. 3, 7 The Monte Carlo method samples configurations of complex proteins according to a chosen probability distribution such as the Boltzmann distribution, which preferentially weights low-energy configurations. The Monte Carlo algorithm performs an iterative series of calculations. At the initial step of each search, a partially random trial sequence is generated, and its energy is calculated via a physical potential. During this step, both the rotamer state and the amino acid identity are modified, and an effective temperature controls which energy changes are likely to be accepted. In the next step, called simulated annealing, the temperature is gradually decreased, which favors sampling of lower-energy configurations. 55 Multiple independent calculations are carried out to converge the system to a global minimum. 3, 7 For further explanation of the theorems and details of the formulation of the probability distribution and weights, readers are referred to previous reports. 3, 7

Probabilistic approach: Probabilistic approaches are frequently employed when thorough information is not available for protein design.
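The Monte Carlo search with simulated annealing described above can be illustrated with a minimal Python sketch. It is not the procedure of any cited work: the energy function `toy_energy`, the burial pattern, the geometric cooling schedule and all parameter values are assumptions made for illustration, and only the amino acid identity is changed at each step (a real design calculation would also sample rotamer states and use a physical potential).

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AFILMVWY")  # coarse, illustrative classification


def toy_energy(sequence, buried):
    """Hypothetical energy: -1 for each site whose residue type matches its
    burial state (hydrophobic if buried, polar if exposed), +1 otherwise."""
    energy = 0.0
    for residue, is_buried in zip(sequence, buried):
        matches = (residue in HYDROPHOBIC) == is_buried
        energy += -1.0 if matches else 1.0
    return energy


def anneal_sequence(buried, steps=5000, t_start=2.0, t_end=0.05, seed=0):
    """Metropolis Monte Carlo over sequences with a decreasing temperature."""
    rng = random.Random(seed)
    current = [rng.choice(AMINO_ACIDS) for _ in buried]  # random start
    current_e = toy_energy(current, buried)
    best, best_e = current[:], current_e
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        fraction = step / max(steps - 1, 1)
        temperature = t_start * (t_end / t_start) ** fraction
        # Trial move: mutate one randomly chosen position.
        pos = rng.randrange(len(current))
        old = current[pos]
        current[pos] = rng.choice(AMINO_ACIDS)
        trial_e = toy_energy(current, buried)
        delta = trial_e - current_e
        # Metropolis criterion: always accept downhill moves, accept uphill
        # moves with Boltzmann probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current_e = trial_e
            if current_e < best_e:
                best, best_e = current[:], current_e
        else:
            current[pos] = old  # reject: restore the previous residue
    return "".join(best), best_e
```

Calling `anneal_sequence([True, False] * 5)` returns the lowest-energy sequence visited and its toy energy. Running several searches with different seeds mirrors the multiple independent calculations mentioned in the text.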
In a probabilistic approach, site-specific amino acid probabilities may be used rather than particular sequences. The approach is partly motivated by the uncertainties in finding sequences consistent with a specific structure. Briefly, the backbone atoms are fixed or greatly constrained, side chain conformations are handled discretely, energy functions are approximated and solvation is handled by simple models. 7 However, to offer valuable sequence information for design experiments and to find structurally significant amino acids, probabilistic techniques leverage structural characteristics of interatomic interactions. 7 Generally, Monte Carlo methods give a probabilistic sampling of sequences. 49, 55 In addition, an entropy-based formalism has been defined to predict amino acid probabilities for a given backbone structure. 56, 57 The method employs concepts from statistical thermodynamics to assess the site-specific probabilities. Because the theory addresses the whole space of available compositions, it is not restricted by computational enumeration and sampling. Large protein structures with >100 variable residues can be handled readily. 7

Sampling sequence space to generate conformations

The chemical variability of a sequence and the number of different amino acids permitted at each position are defined as the "degrees of freedom for each amino acid". If each of the 20 natural residues is permitted at every position, the whole sequence space must be searched. 58

Drug Design, Development and Therapy 2018:12 submit your manuscript | www.dovepress.com

To decrease the degrees of freedom for each amino acid and the sequence space to be searched, diverse approaches such as hydrophobic patterning have been proposed. 58 Monomers can be used to probe a protein structure 59 and improve its function, 60 other than the naturally occurring amino acids.
61

Sampling of side chain conformational space to form conformations

Side chain conformations are typically consistent with the energy minima of molecular potentials and can be obtained from a structural database. 62 Rotamer states correspond to the frequently observed values of the dihedral angles in the side chain of each amino acid. For example, the simplest amino acids, alanine and glycine, have only one rotamer state, while the larger amino acids have >80 different rotamer states. 62 A variety of rotamer libraries, including backbone-dependent, secondary structure-dependent and backbone-independent libraries, have been developed for protein design. 62, 63 By using a rotamer library, one can discretize a meaningful state space to reduce the computational difficulty. Rotamer libraries can be extended beyond the 20 natural amino acids. Such rotamers can model cofactors, ligands, water and posttranslational modifications. For example, to improve the modeling of protein-protein interactions and to model water within protein interiors, structurally defined water molecules can be included as a solvated rotamer library. 61

Energy functions have been employed to quantify sequence-structure compatibilities. 64 They include linear combinations of terms for hydrogen bonds made by backbone atoms, repulsion among atoms, hydrophobic attraction among non-polar groups and electrostatic interactions among sequential neighbors. 65 The sequence of a protein is selected so that it satisfies the energetic and geometric constraints imposed by the desired fold. Constraints typically include several intramolecular interactions such as van der Waals, hydrophobic, polar and electrostatic interactions, as well as hydrogen bonds. Generally, a scoring function makes it possible to take the energetic contributions of these terms into account.
3, 7, 65

De novo design: designing the sequence and 3D structure

Through the assembly of protein fragments 66, 67 or secondary-structure elements, 68, 69 novel structures can be modeled de novo. In the design procedures, the backbone coordinates are generally constrained. A summary and the important findings for some proteins designed using computational approaches, including a retroaldol enzyme, 43 the Kemp elimination enzyme, 70 a novel βαβ protein, 71 a redesigned procarboxypeptidase, 72 a novel α/β protein structure and TOP7, 73 are shown in Table 1.

Peptide design methods have been categorized as ligand- and target-based design methods. In the ligand-based design procedure, information derived from peptides is used to design novel therapeutic peptides. In the target-based method, information derived from target proteins is specifically utilized. Typically, a hybrid approach including both ligand- and target-based design is used. 13

Ligand-based peptide design

Ligand-based design has been classified as follows: 1) sequence-based, 2) property-based and 3) conformation-based design. The sequence-based approach uses information on conserved regions and analyzes multiple sequence alignments. This method is guided by the hypothesis that conserved regions are functionally and structurally significant. 13 Computational tools support ligand-based peptide design, although they lag behind the bioinformatics strategies developed for protein design. 13 Recently, using a method based on a PAM250 matrix, the relationship between a series of 35 collagen peptides and antiangiogenic activity, including proliferation, migration and adhesion, was analyzed. 74 The PAM250 matrix captures information on mutation rates among all pairs of amino acids. Based on the results, regions at the C and N termini of the peptides were found to be significant for ideal activity and were suggested as two distinct binding sites.
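The hypothesis behind sequence-based design, that conserved alignment columns mark functionally or structurally significant positions, can be illustrated with a minimal Python sketch. It is not the PAM250-based method of the cited study: it scores each column of a multiple sequence alignment by Shannon entropy, and the function names and the `max_entropy` threshold are assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column.
    Gap characters, if present, are simply counted as another symbol."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conserved_positions(alignment, max_entropy=0.5):
    """Return the 0-based indices of columns whose entropy is at most
    max_entropy (a hypothetical cutoff; lower entropy = more conserved)."""
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    columns = zip(*alignment)  # transpose rows (sequences) into columns
    return [i for i, col in enumerate(columns)
            if column_entropy(col) <= max_entropy]
```

For a made-up alignment such as `["ACDK", "ACEK", "ACFK", "AGHK"]`, only the first and last columns are fully conserved, so they would be flagged as candidate key positions for further analysis.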
The approach showed the potential value of sequence-based peptide design. 74 In another report, a computational platform called SARvision was developed to support sequence-based design. SARvision represents an important step for peptide sequence/activity relationship (SAR) analysis. Moreover, it combines improved visualization capabilities with advanced sequence/activity analysis. 75

Compared to those for small molecules, property-based design methods for peptides are in the early stages of development. In a recent study, the per-residue ΔG decomposition and the physicochemical characteristics of amino acids, such as hydrophilicity, hydrophobicity and volume, were used to model peptide binding to targets of interest. 76, 77 Finally, a model was built to estimate peptide ΔG values for binding to the class I major histocompatibility complex (MHC) protein HLA-A*0201. 78 Furthermore, in a wide range of studies, antimicrobial peptides were successfully analyzed using the property-based approach. 79 For example, a machine-learning method was employed to design novel antimicrobial peptides. 80 The success of property-based methods with antimicrobial peptides may be explained by the fact that the desired biologic activity, membrane disruption, is relatively nonspecific. 13

In the case of conformation-based peptide design, computational techniques were developed to predict the conformational ensembles or structures of peptides and to analyze the SARs. 81, 82 PEP-FOLD is an online tool used to predict the 3D structures of peptides of 9-36 residues in length. 81 A notable suggestion from the data is that PEP-FOLD seems to solve the conformational sampling problem. 13, 81 To search the conformational space of a peptide, long-timescale molecular dynamics simulations have been employed. 83, 84 In addition, quantum mechanical calculations are promising for addressing the scoring deficiency in peptide conformational analysis.
85 Apparently, major theoretical and technical issues must be resolved before such computationally sophisticated and costly procedures can positively affect peptide design. The conformation of a peptide may be modeled to generate a 3D pharmacophore hypothesis. A given pharmacophore hypothesis is useful for determining the ADME/Tox activities or particular potencies of a peptide. 86 For example, screening of a peptide library was combined with the generation of a pharmacophore hypothesis to identify potent agonists of melanocortin-4 receptor isoforms. A combinatorial tetrapeptide library was screened, and SAR and ligand-derived pharmacophore templates were generated. The pharmacophore hypothesis was proposed to support continuing efforts in the rational design of melanocortin receptor molecules. 86

Target-based peptide design

Compared to ligand-based peptide design, target-based design appears to be at a more advanced stage. 13 Target-based design begins with a computer-aided survey of a ligand-bound or unbound protein target to identify its potential binding sites, prospective specificity surfaces and other elements of pharmacologic activity. This phase is generally followed by an in silico design phase in which computational methods generate, refine and evaluate peptide design ideas. Some recently developed computational methods for target-based peptide design are reviewed below.

Recently, an increase in the number of protein-peptide 3D structures deposited in the PDB has helped to elucidate the molecular mechanism and structural basis of peptide recognition and binding. 87 Information from crystal structures of protein-peptide complexes can improve our knowledge of the chemical forces involved in binding and of the specific modes of binding. Dynamic data on the complexes can be partially extracted from the solution NMR structures deposited in the PDB.
To record the structures and functions of various protein-peptide complexes, the experimentally resolved structure data were gathered, annotated and analyzed, and several dedicated databases such as PepX, 88 PepBind 89 and PeptiDB were generated. 90 The PepX database, derived from the PDB, comprises unique protein-peptide interface collections. 88 The PepBind database contains 4,986 protein-peptide complex structures from the PDB. 89 PeptiDB is a curated database of 103 protein-peptide complexes. 90 The abundance of structural information, specifically on monomeric proteins, could be exploited to design protein-peptide interactions with no requirement for sequence homology. 91

Protein-peptide docking

Precise docking of a highly flexible peptide is a major challenge. 18 Traditional docking protocols, such as AutoDock, Vina 92, 93 and MOE-Dock, 94 developed for docking small molecules, have also been used to dock a peptide to a protein receptor. However, comparative studies revealed that these techniques fail if the docked peptides are >3 residues long. 95 Therefore, the development of peptide-focused docking protocols is very important. 96 Protein-protein docking tools such as ZDOCK and Hex have also been used for computational peptide design in some studies. 96 Below, details of recently developed peptide-focused docking approaches are discussed.

First, heuristic evolution procedures were applied to search the large conformational space of linear peptides before binding. 97 However, these procedures were not efficient and their use was limited. 18 Then, a scheme based on conformational sampling became common in peptide docking. In addition, several representative approaches were proposed to balance the accuracy and efficiency of flexible peptide docking. In this respect, DocScheme, 98 DynaDock 99 and pepspec 100 were introduced and integrated into user-friendly online interfaces.
Recently, PepCrawler 101 and FlexPepDock 102 were developed as peptide docking tools. 18 FlexPepDock 102 is reported to have sub-angstrom accuracy in reproducing the crystal structures of protein-peptide complexes. 103 All of the FlexPepDock-based methods assume prior information about the peptide-binding site. 13 AnchorDock, a recently described algorithm, allows powerful blind docking calculations by relaxing this constraint. 104 The program predicts anchoring sites on a protein surface. Following identification of the anchoring sites, an assumed peptide conformation is refined using an anchor-constrained molecular dynamics process. 105

HADDOCK, a well-known protein-protein docking tool, has recently been extended to run flexible peptide-protein docking. 105 To drive a docking procedure, HADDOCK uses ambiguous interaction restraints based on experimental information about intermolecular interactions. Rigid-body peptide docking is followed by a flexible simulated annealing process. The new HADDOCK strategy starts docking computations from an ensemble of three dissimilar peptide conformations (eg, α-helix, extended and polyproline-II) that are highly informative inputs. 105

CABS-dock is a recently introduced protein-peptide docking tool that runs a primary docking procedure whose outcomes can be refined by other tools such as FlexPepDock. 106 In the first phase of the procedure, random conformations of a peptide are generated and placed around the protein target of interest. The process is followed by replica exchange Monte Carlo dynamics. Subsequently, 10 models are selected for final optimization using the Modeller tool to obtain accurate scoring and ranking of poses. 13, 106 GalaxyPepDock was developed to use experimentally resolved protein-peptide structures for template-based docking combined with flexible energy-based optimization.
107

Atomistic simulation

Atomistic Monte Carlo and molecular dynamics simulations are accurate but laborious techniques for investigating peptide-protein binding interactions. These techniques can also reveal the thermodynamic profile and trajectory involved in protein-peptide recognition. These methods predict the relationship between the conformations of a peptide in solution and when bound to a protein. 108 In one study, to describe the binding of a decapeptide to its cognate SH3 receptor, a long-timescale molecular dynamics simulation was used and a two-state model was built. 109 In the first step, a relatively quick diffusion phase, nonspecific encounter complexes were generated and stabilized by electrostatic energy. The second step was a slow modification phase, in which water molecules were expelled from the space between the peptide ligand and the receptor. 109 In another report, using the Monte Carlo method, the two-state model was verified by tracing the routes of some oligopeptides binding to various PDZ (Post synaptic density protein, Drosophila disc large tumor suppressor, and Zonula occludens-1 protein) domains. 110

The affinity of BH3 peptides for the Bcl-2 protein was investigated, and the results showed that higher affinity of bound peptides occurred when the corresponding peptides had a lower degree of disorder in their unbound states, and vice versa. 111 These results showed that highly structured peptides could increase their affinity by reducing the entropic loss associated with binding. Overall, in addition to electrostatic and hydrophobic forces, protein-peptide interactions can be affected by entropic effects and conformational flexibility, which can be readily examined with atomistic simulations.
111 Very recently, using a fast molecular dynamics simulation, the energetic and dynamic features of protein-peptide interactions were studied. In most cases, the native binding sites and native-like poses of protein-peptide complexes were recapitulated. Additional investigation showed that including mobility and flexibility in the simulation could meaningfully improve the accuracy of protein-peptide binding prediction. 112

Peptide affinity prediction

Most aspects of computational peptide design depend on the accuracy and efficiency of affinity prediction. Hence, the fast and reliable prediction of peptide-protein affinity is important for rational peptide design. 18 In this respect, two categories of prediction algorithms, sequence- and structure-based approaches, were developed. The sequence-based method uses information derived from primary polypeptide sequences to approximate and evaluate the binding affinity. The structure-based method uses information derived from the 3D structures of protein-peptide complexes to predict the binding affinity. 113

At the sequence level, quantitative structure-activity relationships (QSARs) have been widely used to predict the binding affinity of peptides and infer their biologic function. 114 To model the statistical correlation between sequence patterns and the biologic activities of experimentally assessed peptides, machine-learning methods such as partial least squares (PLS), artificial neural networks (ANN) and support vector machines (SVM) have been used. The obtained correlations have been used to make inferences about experimentally undetermined peptides. 115 The relationship between biologic activity and molecular structure is an important issue in biology and biochemistry. QSAR is a well-established method employed in pharmaceutical chemistry and has become a standard tool for drug discovery.
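The sequence-level QSAR idea described above, correlating a sequence-derived descriptor with measured activity and then predicting untested peptides, can be illustrated with a minimal Python sketch. Ordinary least squares on a single descriptor is used here as a deliberately simple stand-in for the PLS, ANN and SVM models named in the text. The choice of mean Kyte-Doolittle hydropathy as the descriptor and all function names are assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy index for the 20 natural amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide):
    """Descriptor: average hydropathy over all residues of the peptide."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


def fit_qsar(peptides, activities):
    """Fit activity = slope * descriptor + intercept by least squares."""
    xs = [mean_hydropathy(p) for p in peptides]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(activities) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, activities))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_activity(peptide, model):
    """Apply a fitted (slope, intercept) model to an untested peptide."""
    slope, intercept = model
    return slope * mean_hydropathy(peptide) + intercept
```

In use, `fit_qsar` would be given peptides with experimentally measured activities, and `predict_activity` would then rank untested candidates. A practical model would use many descriptors per residue position and be validated on held-out peptides.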
However, the predictive capacity of QSAR techniques is generally weaker than that of statistics-based approaches. Therefore, combining the QSAR method with a statistics-based technique may bring out the best in each and could be a trend in future developments of drug discovery. 114

At the structural level, numerous reports on affinity prediction have addressed MHC-binding peptides. Many MHC-peptide complex structures have been deposited in the PDB. 116 The significance of domain-peptide recognition in metabolic pathways and cell signaling has recently been illustrated. 117 To predict protein-peptide binding potency, a number of rigorous theories based on free energy perturbation were proposed. These theories were used to compute the change in free energy upon interaction between a phosphotyrosine tetrapeptide (pYEEI) and the human Lck SH2 domain. 118 Furthermore, to gain deeper insight into the structural and energetic aspects of peptide recognition by the SH3 domain, several molecular modeling techniques such as homology modeling, molecular docking and molecular dynamics were used. 119 Peptide array strategies confirmed that some peptide candidates may be potent binders of the Abl SH3 domain. 120 Very recently, an approach combining quantum mechanics/molecular mechanics, semiempirical Poisson-Boltzmann/surface area and empirical conformational free energy analysis was developed to quantify the energetic contributions involved in the affinity loss of the PDZ domain and OppA protein for their peptide ligands. 121, 122

De novo peptide design

Recently, two notable methodologies for de novo target-based peptide design were introduced: the VitAL method and an approach developed by Bhattacherjee and Wallin. The VitAL method combines the Viterbi algorithm with AutoDock to design peptides for the binding sites of a target.
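To make the role of the Viterbi algorithm in such a design scheme more concrete, the following Python sketch applies Viterbi-style dynamic programming to the choice of one residue per peptide position so that the summed score is maximal. It is only a minimal sketch and not the published VitAL implementation: the per-position scores in `SCORES` and the neighbor term `pair_score` are hypothetical placeholders, where a docking program might supply the scores in practice.

```python
# Viterbi-style dynamic programming over peptide positions.
# States are residue types; all scores are hypothetical placeholders,
# not docking output. Higher scores are treated as better.

def viterbi_design(position_scores, pair_score):
    """Return (best_sequence, best_total_score).

    position_scores: list of {residue: score} dicts, one per peptide position
    pair_score: function(prev_residue, residue) -> score for adjacent residues
    """
    # best[r] = best total score of any partial sequence ending in residue r
    best = dict(position_scores[0])
    back = []  # back[i][r] = best predecessor of residue r at position i + 1
    for scores in position_scores[1:]:
        new_best, pointers = {}, {}
        for res, s in scores.items():
            prev = max(best, key=lambda p: best[p] + pair_score(p, res))
            new_best[res] = best[prev] + pair_score(prev, res) + s
            pointers[res] = prev
        best = new_best
        back.append(pointers)

    # Trace back from the best final residue to recover the full sequence.
    last = max(best, key=best.get)
    total = best[last]
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return "".join(reversed(path)), total


# Hypothetical scores for a three-residue peptide over a three-letter alphabet.
SCORES = [
    {"A": 0.2, "K": 1.1, "W": 0.7},
    {"A": 0.9, "K": 1.0, "W": 0.3},
    {"A": 0.4, "K": 0.6, "W": 1.5},
]


def pair_score(prev, res):
    """Hypothetical neighbor term: penalize repeating the same residue."""
    return -0.5 if prev == res else 0.0


sequence, total = viterbi_design(SCORES, pair_score)
print(sequence, total)
```

The dynamic program examines each pair of adjacent residue choices once per position, so its cost grows linearly with peptide length instead of exponentially, which is the main reason to prefer it over enumerating every sequence.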
123 The Bhattacherjee and Wallin approach explores the peptide sequence and the conformational space around a protein target simultaneously. 124 This approach was tested on three different peptide-protein domains to assess its ability. 13 A brief list of the existing computational resources employed in peptide design is presented in Table 2.

In recent years, some computational methods have been proposed to design peptidomimetics. These methods can be classified by how specifically they translate peptides into peptidomimetics. 28 To select the best method, knowledge of the structure of peptide-protein complexes is important. 28, 96 Herein, recently introduced methods for computer-aided design of peptidomimetics are presented.

GrowMol is a combinatorial algorithm used in peptidomimetic design. GrowMol searches a variety of probable ligands for the binding sites of a target protein 125 and produces molecules with chemical and steric complementarity to the 3D structure of the binding sites. This method was used to generate peptidomimetic inhibitors of thermolysin, HIV protease and pepsin. Using the X-ray crystal structures of pepstatin-pepsin complexes, GrowMol predicted therapeutic peptidomimetics against the aspartic proteases. The algorithm created some cyclic inhibitors bridging the side chains of cysteine residues in the P1 and P3 inhibitor subsites. The binding modes were checked using X-ray crystallography. 125, 126

LUDI is another notable program that follows the de novo methodology. 127 Using natural and non-natural amino acids as building blocks, the software designed peptidomimetics against renin, thermolysin and elastase. 127 The conformational flexibility of each novel peptidomimetic was explored by sampling multiple conformers of each amino acid.
127 The peptide-driven pharmacophoric hypothesis is the most intuitive computational technique in peptidomimetic design. The method is especially useful when X-ray structures of protein-protein complexes are available. 28 The main idea is to translate the hot spot concept into the associated pharmacophoric feature concept. With a pharmacophore-based virtual screening process, this strategy can identify novel type 3 mimetics. 128 In fact, the side chain of each amino acid can be simply categorized according to conventional pharmacophoric features, such as hydrogen bond donors and acceptors, aromatic rings, and charged and hydrophobic centers. For example, in one report, pharmacophore model-directed synthesis of non-peptide analogs of a cationic antimicrobial peptide identified anti-staphylococcal activity. 129 To build a pharmacophore hypothesis, a model of RNA III-inhibiting peptide (RIP), a well-known heptapeptide inhibitor of staphylococcal pathogenesis, was used. Through virtual screening of 300,000 commercially available small molecules against the RIP-based pharmacophore, hamamelitannin was discovered as a non-peptide mimetic of RIP. Hamamelitannin is a tannin derivative extracted from Hamamelis virginiana. 28, 129 In another study, two rounds of in silico screening were performed to discover potential peptidomimetics able to mimic a cyclic peptide (cyclo-[CPFVKTQLC]) that is known to bind the αvβ3 integrin receptor. 130 At the end of the process, the most potent representatives were at least 2,000 times more potent than the original cyclopeptide (around 2 mM). 130 In a successful example, virtual screening was performed on multiconformational forms of a large commercial library. A target-based pharmacophoric model mapped the CD4-binding site on HIV-1 gp120. The pharmacophore hypothesis was built from a homology model of the protein cavity.
In a cell-based assay, two of the top-scoring molecules were identified as micromolar inhibitors of HIV-1 replication. 131 Pharmacophore-based screening was also used to find novel Alzheimer's therapeutics as mimetics of neurotrophins. 132 The therapeutic use of neurotrophins may be restricted by several deficiencies, such as their poor central nervous system penetration, low stability and potential to enhance neuronal death through interaction with the p75NTR receptor. Mimicking particular nerve growth factor domains could inhibit neuronal death. Peptidomimetics of the loop 1 and loop 4 domains of nerve growth factor can prevent neuronal death induced by p75NTR-dependent and Trk-related signaling. 132 In another study, a fully computational pharmacophore-based approach assessed FDA-approved drugs as candidates to inhibit protein-protein interactions. 133 Peptide structures were described in terms of pharmacophores and searched against the FDA-approved drugs to detect similar molecules. The top-ranking drug matches included several nuclear receptor ligands and matched allosterically to the binding site on the target protein. The top-ranking drug matches were docked to the peptide-binding site. The majority of them showed a negative free energy change upon binding that was comparable to that of the standard peptide. 133

Geometry similarity method

Geometry similarity methods establish a geometric similarity between non-peptide templates and peptide patches. In one study, the SuperMimic tool was developed to identify peptide mimetics. 134 The program stores a complex library of peptidomimetics together with several protein structure libraries. Moreover, SuperMimic includes D-peptides, synthetic components (reported as beta-turn or gamma-turn mimetics) and peptidomimetic ligands obtained from the PDB.
134 In the program, the search process allows scanning a library of small molecules for those that mimic the tertiary structure of a query peptide, and also scanning a protein library for sites where a query small molecule can be fitted into the backbone. 28, 134

Sequence-based method

Recently, a method was developed to rank peptide-compound matches; it is limited to short linear motifs in proteins and to compounds with amino acid substituents. 135 The algorithm maps side chain-like substituents onto every compound of a large chemical library. The complete molecule can be represented by a short sequence, with each fragment in the molecule represented by a distinct letter abbreviation. 28 A cross-search between the PubChem database (about 5.4 million molecules) and a non-redundant collection of 11,488 peptides obtained from the PDB demonstrated that the algorithm can be useful for high-throughput searches. 28 To recognize true positives, the method searched known protein motifs against the National Cancer Institute Developmental Therapeutics Program compound database. 135 In another study, the Similarity of Amino Acid Motifs to Compounds (SAAMCO) web server was developed to ease screening of identified motif structures against bioactive compound databases. 136 The methodology was reported to be efficient because the compound databases were preprocessed to maximize the accessible data and the required input was minimal. 136 In SAAMCO, motif matching can be full or partial, which may decrease or increase the number of potential mimetics, respectively. Using a novel search algorithm, the web service can rapidly screen known or putative motifs against prepared compound libraries. The ranked results can be examined through links to appropriate databases. 28, 136

Fragment-based method

Replacement with Partial Ligand Alternatives through Computational Enrichment (REPLACE) is a fragment-based approach.
137 Using structures of peptide-bound proteins as design anchors, the program can computationally find non-peptide mimetics for specific determinants of known peptide ligands. 137

Hybrid peptide-driven shape and pharmacophoric method

The development and application of pharmacophore modeling strategies indicate that the medicinal chemistry community has broadly accepted the intuitive nature of the pharmacophore concept. In addition, shape complementarity has been identified as a significant element in molecular recognition between ligands and their targets. 28 In virtual screening efforts, using pharmacophore-based and shape-based techniques separately may increase the rate of false-positive results. 128 Therefore, incorporating both pharmacophore-matching and shape-matching techniques into one program can potentially reduce the rate of false positives. 128 Recently, to discover novel peptidomimetics, a web-oriented virtual screening tool named pepMMsMIMIC 138 was developed to combine conventional pharmacophore matching with shape complementarity. A library of 17 million conformers was generated from 3.9 million commercially available chemicals and collected in the MMsINC database. The database was used as the backbone for developing pepMMsMIMIC. 139 In the pepMMsMIMIC interface, the 3D structure of a protein-bound peptide is used as input. Then, chemical structures able to mimic the pharmacophore and shape of the original peptide are proposed as candidates for involvement in protein-protein recognition. 139 A list of in silico methods used to design potential peptidomimetics, along with their strengths and weaknesses, is presented in Table 3.

Overall, the design and development of therapeutics are tedious, expensive and time-consuming. Therefore, modern approaches, including computer-aided design methods, can reduce the testing phase, cost and failure rate of therapeutics discovery.
Computational methods used to design amino acid-based therapeutics can increase the range of available biotherapeutics. Benefiting from the dramatic advances in bioinformatics, computational tools can be used to find and develop therapeutic proteins, peptides and peptidomimetics. 140, 141 Moreover, using computational tools can decrease the cost of therapeutics development, from concept to market, by up to 50%. 140 However, computational protein design faces some challenges, such as our inadequate knowledge of folding and of the physical forces that stabilize protein structures. Moreover, sequences and local structures have many degrees of freedom, which can complicate the sequence search. Therefore, effective methods are required to find sequences compatible with a particular structure and to evaluate essential protein folding criteria. Overall, in silico design of amino acid-based therapeutics involves many challenges that should be overcome to improve the overall performance of the design process. For example, although determining the structures of all disease-related proteins by crystallography and NMR is a laborious task, it is necessary to gather more structural information on peptide-protein interactions. In addition, the development of robust algorithms to calculate protein-protein binding energies is essential. Estimating the binding constant between two macromolecules with an appropriate speed-accuracy tradeoff requires millisecond-scale molecular dynamics. Moreover, understanding of both protein-protein and protein-peptidomimetic recognition at the molecular level can be improved by using more accurate force fields, such as quantum mechanical polarizable force fields. In recent years, a growing number of monoclonal antibodies (therapeutic antibodies) have been approved by the FDA for the treatment of various diseases. This important area of amino acid-based therapeutics has been covered in more depth elsewhere.
142, 143 For further details on the theory of antibody informatics for drug discovery and on computer-aided antibody design, readers are referred to previous reports. 142, 143

The authors report no conflicts of interest in this work.

What is the future of targeted therapy in rheumatology: biologics or small molecules Protein therapeutics: a summary and pharmacological classification In silico methods for design of biological therapeutics Constructing novel chimeric DNA vaccine against Salmonella enterica based on SopB and GroEL proteins: an in silico approach In silico phylogenetic analysis of Vibrio cholera isolates based on three housekeeping genes Designing of complex multi-epitope peptide vaccine based on Omps of Klebsiella pneumoniae: an in silico approach Theoretical and computational protein design Evaluation of in silico protein secondary structure prediction methods by employing statistical techniques Inhibition of mycobacterial CYP125 enzyme by sesamin and β-sitosterol: an in silico and in vitro study Theory of protein folding: the energy landscape perspective Toward an outline of the topography of a realistic protein-folding funnel Artificial diiron enzymes with a de novo designed four-helix bundle structure Computer-enabled peptide drug design: principles, methods, applications and future directions Docking small peptides remains a great challenge: an assessment using AutoDock Vina Empirical estimation of local dielectric constants: toward atomistic design of collagen mimetic peptides Recent work in the development and application of protein-peptide docking Rational design of peptide drugs: avoiding aggregation Computational peptidology: a new and promising approach to therapeutic peptide design Strategies employed in the design and optimization of synthetic antimicrobial peptide amphiphiles with enhanced therapeutic potentials Multifaceted roles of disulfide
bonds. Peptides as therapeutics Peptidomimetics, a synthetic tool of drug discovery An in silico pipeline for the design of peptidomimetic protein-protein interaction inhibitors (Order No. 10188557) Natural products as sources of new drugs over the last 25 years Diversity-oriented synthesis of macrocyclic peptidomimetics Structure-based design, synthesis, and biological evaluation of peptidomimetic SARS-CoV 3CLpro inhibitors Advances in Amino Acid Mimetics and Peptidomimetics A hierarchical approach to peptidomimetic design Mimicking Peptides… In Silico Peptidomimetic design Rational design for peptide drugs Peptidomimetics as a cutting edge tool for advanced healthcare An unusual functional group interaction and its potential to reproduce steric and electrostatic features of the transition states of peptidolysis Low molecular weight, non-peptide fibrinogen receptor antagonists Neurotrophin small molecule mimetics: candidate therapeutic agents for neurological disorders Design of peptides, proteins, and peptidomimetics in chi space Molecular technology. 
Designing proteins and peptides Molecular engineering: an approach to the development of general capabilities for molecular manipulation X-ray versus NMR structures as templates for computational protein design High-resolution protein design with backbone freedom Prediction of protein-protein interface sequence diversity using flexible backbone computational protein design Backbone flexibility in computational protein design De novo computational design of retro-aldol enzymes One fold with many functions: the evolutionary relationships between TIM barrel families based on their sequences, structures and functions Search and sampling in structural bioinformatics The dead-end elimination theorem and its use in protein side-chain positioning Application of a self-consistent mean field theory to predict protein sidechains conformation and estimate their conformational entropy Design of protein-interaction specificity gives selective bZIP-binding peptides Computational methods for protein design and protein sequence variability: biased Monte Carlo and replica exchange Exploring the conformational space of protein side chains using dead-end elimination and the A* algorithm Side-chain and backbone flexibility in protein core design Dead-end elimination with a polarizable force field repacks PCNA structures Improved prediction of protein side-chain conformations with SCWRL4 Trading accuracy for speed: a quantitative comparison of search algorithms in protein sequence design Using self-consistent fields to bias Monte Carlo methods with applications to designing and sampling protein sequences Computational design and characterization of a monomeric helical dinuclear metalloprotein Statistical theory of combinatorial libraries of folding proteins: energetic discrimination of a target structure Achieving stability and conformational specificity in designed proteins via binary patterning Photophysics of a fluorescent non-natural amino acid: p-cyanophenylalanine An expanded 
eukaryotic genetic code A "solvated rotamer" approach to modeling watermediated hydrogen bonds at protein-protein interfaces Rotamer libraries in the 21st century Improved side-chain prediction accuracy using an ab initio potential energy function and a very large rotamer library Potential energy functions for protein design De novo design of foldable proteins with smooth folding funnel: automated negative design and experimental verification Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions Structure by design: from single proteins and their building blocks to nanostructures Computational de novo design and characterization of a four-helix bundle protein that selectively binds a nonbiological cofactor Using α-helical coiled coils to design nanostructured metalloporphyrin arrays Kemp elimination catalysts by computational enzyme design De novo design of a βαβ motif High-resolution structural and thermodynamic analysis of extreme stabilization of human procarboxypeptidase by computational protein design Design of a novel globular protein fold with atomic-level accuracy Novel peptide-specific quantitative structure activity relationship (QSAR) analysis applied to collagen IV peptides with antiangiogenic activity Development of an informatics platform for therapeutic protein and peptide analytics Two-level QSAR network (2L-QSAR) for peptide inhibitor design based on amino acid properties and sequence positions Recent development of peptide drugs and advance on theory and methodology of peptide inhibitor design Predicting the affinity of epitope-peptides with class I MHC molecule HLA-A*0201: an application of amino acid-based peptide prediction A brief overview of antimicrobial peptides containing unnatural amino acids and ligand-based approaches for peptide ligands Machine learning assisted design of highly active peptides for drug discovery PEP-FOLD: an updated de novo 
structure prediction server for both linear and disulfide bonded cyclic peptides In silico predictions of 3D structures of linear and cyclic peptides with natural and nonproteinogenic residues Long-timescale molecular dynamics simulations of protein structure and function How fast-folding proteins fold Bond distances in polypeptide backbones depend on the local conformation Identification of tetrapeptides from a mixture based positional scanning library that can restore nM full agonist function of the L106P, I69T, I102S, A219V, C271Y, and C271R human melanocortin-4 polymorphic receptors (hMC4Rs) The protein data bank Protein design with fragment databases PepBind: a comprehensive database and computational tool for analysis of protein-peptide interactions The structural basis of peptide-protein binding strategies Protein-peptide interactions adopt the same structural motifs as monomeric protein folds Highly Flexible Protein-Peptide Docking Using CABS-Dock Virtual screening for potential inhibitors of CTX-M-15 protein of Klebsiella pneumoniae In silico panning for a non-competitive peptide inhibitor Comparative evaluation of eight docking tools for docking and virtual screening accuracy In silico designing of peptide inhibitors against pregnane X receptor: the novel candidates to control drug metabolism Computation of the binding of fully flexible peptides to proteins with flexible side chains A flexible docking procedure for the exploration of peptide binding selectivity to known structures and homology models of PDZ domains DynaDock: a new molecular dynamics-based algorithm for protein-peptide docking including receptor flexibility Structure-based prediction of protein-peptide specificity in Rosetta PepCrawler: a fast RRT-based algorithm for high-resolution refinement and binding-affinity estimation of peptide inhibitors Rosetta FlexPepDock ab-initio: simultaneous folding, docking and refinement of peptides onto
their receptors Sub-angstrom modeling of complexes between flexible peptides and globular proteins AnchorDock: blind and flexible anchor-driven peptide docking A unified conformational selection and induced fit approach to protein-peptide docking CABS-dock web server for the flexible docking of peptides to proteins without prior knowledge of the binding site GalaxyPepDock: a protein-peptide docking tool based on interaction similarity and energy optimization Predicting peptide structures in native proteins from physical simulations of fragments Mechanism of fast peptide recognition by SH3 domains Binding free energy landscape of domain peptide interactions Molecular dynamics simulations of pro-apoptotic BH3 peptide helices in aqueous medium: relationship between helix stability and their binding affinities to the anti-apoptotic protein Bcl-XL Structural and dynamic determinants of protein-peptide recognition Quantitative sequence-activity model (QSAM): applying QSAR strategy to model and predict bioactivity and function of peptides, proteins and nucleic acids Recent advances in QSAR and their applications in predicting the activities of chemical molecules, peptides and proteins for drug design Comprehensive comparison of eight statistical modelling methods used in quantitative structure retention relationship studies for liquid chromatographic retention times of peptides generated by protease digestion of the Escherichia coli proteome Prediction of MHC-peptide binding: a systematic and comprehensive overview Domain mediated protein interaction prediction: from genome to network Calculation of absolute protein-ligand binding free energy from computer simulations Prediction of binding affinities between the human amphiphysin-1 SH3 domain and its peptide ligands using homology modeling, molecular dynamics and molecular field analysis Characterization of domain-peptide interaction interface: a generic structure-based model to decipher the binding specificity of SH3 domains 
Why OppA protein can bind sequence-independent peptides? A combination of QM/MM, PB/SA, and structure-based QSAR analyses Characterization of PDZ domain-peptide interactions using an integrated protocol of QM/MM, PB/SA, and CFEA analyses Computational design of peptide ligands Exploring protein-peptide binding specificity through computational peptide screening Multiple highly diverse structures complementary to enzyme binding sites: results of extensive application of a de novo design method incorporating combinatorial growth Transformation of peptides into non-peptides. Synthesis of computer-generated enzyme inhibitors Towards the automatic design of synthetically accessible protein ligands: peptides, amides and peptidomimetics Structure-based pharmacophores for virtual screening Antimicrobial activity of small β-peptidomimetics based on the pharmacophore model of short cationic antimicrobial peptides Small molecule inhibitors of hantavirus infection A dynamic target-based pharmacophoric model mapping the CD4 binding site on HIV-1 gp120 to identify new inhibitors of gp120-CD4 protein-protein interactions Alzheimer's therapeutics Approved drug mimics of short peptide ligands from protein interaction motifs SuperMimic-Fitting peptide mimetics into protein structures Identification of potential small molecule peptidomimetics similar to motifs in proteins Web server to identify similarity of amino acid motifs to compounds (SAAMCO) REPLACE: a strategy for iterative design of cyclin-binding groove inhibitors Swimming into peptidomimetic chemical space using pepMMsMIMIC MMsINC: a large-scale chemoinformatics database Computational drug discovery Developability assessment as an early de-risking tool for biopharmaceutical development Antibody informatics for drug discovery Computer-aided antibody design TumorHoPe: a database of tumor homing peptides Drug-permeability and transporter assays in Caco-2 and MDCK cell lines PepX: a 
structural database of non-redundant protein-peptide complexes Rosetta FlexPepDock web server - high resolution modeling of peptide-protein interactions pDOCK: a new technique for rapid and accurate docking of peptide ligands to major histocompatibility complexes Predicting peptide binding sites on protein surfaces by clustering chemical interactions Protein-peptide complex prediction through fragment interaction patterns PEP-SiteFinder: a tool for the blind identification of peptide binding sites on protein surfaces VitAL: Viterbi algorithm for de novo peptide design

Drug Design, Development and Therapy 2018:12