key: cord-0276662-hm59rp3l
authors: Jain, Shipra; Dhall, Anjali; Patiyal, Sumeet; Raghava, Gajendra P. S.
title: IL13Pred: A method for predicting immunoregulatory cytokine IL-13 inducing peptides for managing COVID-19 severity
date: 2021-09-21
journal: bioRxiv
DOI: 10.1101/2021.09.19.460950
sha: 48ea61d07a3e0ebb06879e5bdf16907e62f64e72
doc_id: 276662
cord_uid: hm59rp3l

Interleukin 13 (IL-13) is an immunoregulatory cytokine that is primarily released by activated T-helper 2 cells. It induces the pathogenesis of many allergic diseases, such as airway hyperresponsiveness, glycoprotein hypersecretion and goblet cell hyperplasia. IL-13 also inhibits tumor immunosurveillance, which leads to carcinogenesis. In recent studies, elevated IL-13 serum levels have been shown in severe COVID-19 patients. Thus, it is important to predict IL-13 inducing peptides or regions in a protein for designing safe protein therapeutics, particularly immunotherapeutics. This paper describes a method developed for predicting, designing and scanning IL-13 inducing peptides. The dataset used in this study contains 313 experimentally validated IL-13 inducing peptides and 2908 non-inducing Homo sapiens peptides extracted from the Immune Epitope Database (IEDB). We extracted 95 key features, using the SVC-L1 technique, from the 9165 features originally generated with Pfeature. These key features were then ranked by their prediction ability, and the top 10 features were used for building machine learning prediction models. In this study, we have deployed various machine learning techniques to develop models for predicting IL-13 inducing peptides. These models were trained, tested and evaluated using five-fold cross-validation; the best models were evaluated on an independent dataset. Our best model, based on XGBoost, achieves a maximum AUC of 0.83 and 0.80 on the training and independent dataset, respectively.
Our analysis indicates that certain SARS-CoV-2 variants are more prone to induce IL-13 in COVID-19 patients. A standalone package and a web server named ‘IL-13Pred’ have been developed for predicting IL-13 inducing peptides (https://webs.iiitd.edu.in/raghava/il13pred/).

Key Points
Interleukin-13, an immunoregulatory cytokine, plays an important role in increasing the severity of COVID-19 and other diseases.
IL-13Pred is a highly accurate in-silico method developed for predicting IL-13 inducing peptides/epitopes.
IL-13 inducing peptides are reported in proteins of various SARS-CoV-2 strains/variants.
This method can be used to detect IL-13 inducing peptides in vaccine candidates.
A user-friendly web server and standalone software are freely available for IL-13Pred.

Author’s Biography
Shipra Jain is currently pursuing a Ph.D. in Computational Biology at the Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India.
Anjali Dhall is currently pursuing a Ph.D. in Computational Biology at the Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India.
Sumeet Patiyal is currently pursuing a Ph.D. in Computational Biology at the Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India.
Gajendra P. S. Raghava is currently working as Professor and Head of the Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India.

IL-13 is an immune-regulatory cytokine primarily secreted by activated T helper type 2 (Th2) cells that inhibits inflammatory cytokine production [1, 2]. Many studies in the literature have shown that IL-13 is also produced by diverse cell types, including eosinophils, mast cells, basophils, smooth muscle cells, natural killer cells, and fibroblasts, with varied biological functions [3, 4]. Transcription of the IL-13 cytokine is mainly regulated by the GATA3 transcription factor. IL-13 has approx.
25% sequence homology with IL-4 and is located on human chromosome 5q31 [4]. IL-4 and IL-13 are functionally related, yet IL-13 appears to be a more promising target for designing therapeutics than IL-4 [3]. As shown in Figure 1, IL-13 mediates several vital functions in diverse biological pathways, including regulation of airway hyperresponsiveness, allergic inflammation, mastocytosis, goblet cell hyperplasia, tissue eosinophilia, IgE Ab production, tumor cell growth, tissue remodeling, intracellular parasitism, and fibrosis. Many studies have shown that alterations in IL-13 effector functions can be targeted to treat certain cancers, such as B-cell chronic lymphocytic leukemia and Hodgkin's disease [5, 6]. IL-13 inhibits tumor immunosurveillance in humans; therefore, IL-13 inhibitors can act as effective cancer immunotherapeutic candidates by activating type-1 anti-cancer defense mechanisms [7, 8]. Moreover, studies have reported that IL-13/IL-4 receptors have a crucial role in the prognosis of cancers such as pancreatic, gastric, and colon cancer [9-12]. They interact with the tumor microenvironment by activating tumor-associated macrophages and myeloid-derived suppressor cells [13, 14]. IL-13 receptors were found to be overexpressed in glioblastoma multiforme human samples in situ, whereas normal specimens expressed few IL-13 binding receptor sites [15]. Moreover, increased IL-13 levels were observed in pulmonary artery hypertension (PAH) patients as compared to non-PAH controls [16]. The involvement of IL-13 receptor α1 in myocardial homeostasis also points to a role in cardiovascular diseases [17]. IL-13 has emerged as a central regulator in airway hyperresponsiveness, fibrosis, mucus hypersecretion and switching of B cell antibody production from IgM to IgE. Research highlights that the IL-13 pathway could be a promising target in the treatment of diverse asthma phenotypes [18, 19].
It has been shown that anti-IL-13 drugs play a major role in controlling the 'Th2 high' asthmatic phenotype [20]. Therefore, anti-IL-13 drugs (anrukinzumab, lebrikizumab and tralokinumab) have become popular for controlling the severity of asthma [21]. IL-13 also has crucial consequences on non-hematopoietic cells, including endothelial cells, smooth muscle cells, fibroblasts, epithelial cells, and sensory neurons [22-25].

Additionally, several studies have shown that the plasma levels of IL-13 were significantly higher in COVID-19 patients [26-28]. A recent study has shown that COVID-19 patients with elevated IL-13 serum levels were more likely to require ventilation support than other patients. Also, COVID-19 patients prescribed the IL-13 inhibiting drug dupilumab showed less severe symptoms [26]. In another study, researchers reported differential expression levels of 14 cytokines, including IL-13, in healthy controls and in moderate and severe COVID-19 patients, and observed that higher IL-13 expression is associated with greater COVID-19 severity [28]. Therefore, there is a pressing need for a prediction method dedicated to classifying IL-13 inducing peptides. To the best of our knowledge, IL-13Pred is the first method to predict IL-13 inducing and non-inducing peptides from their amino acid sequences.

In this study, we present an in-silico method to classify IL-13 inducing and non-inducing peptides/epitopes. We have used experimentally reported IL-13 inducing and non-inducing human peptides from IEDB. Using this dataset, we applied various state-of-the-art machine learning classifiers, and model performance was evaluated on an independent dataset. We provide a web server and a standalone version of 'IL-13Pred' for use by the scientific community.

Initially, we extracted 343 experimentally validated IL-13 inducing human peptides from IEDB [29].
During pre-processing, we removed peptides shorter than 8 or longer than 35 amino acids [30]. Also, to avoid any redundancy in the dataset, we removed duplicate copies of the peptides (if any). This left us with a positive dataset of 313 IL-13 inducing peptides. One of the major challenges was to compile a negative dataset of experimentally validated non-inducing peptides. To overcome this issue, we acquired the negative dataset from the recently published article IL6Pred [30], and after pre-processing we obtained 2908 non-inducing peptides. Finally, we proceeded with a positive dataset of 313 IL-13 inducing and a negative dataset of 2908 non-inducing unique peptides.

In the present method, we used Pfeature to compute various types of descriptors from the sequence information of the peptides. It generates a wide range of descriptors, as a fixed-size vector, for a given protein or peptide sequence [31]. We used the composition-based module of Pfeature on our dataset and generated a vector of 9165 features for each sequence. For our dataset, we computed 15 types of composition-based features, such as AAC, DPC, TPC, ATC and others listed in Table 1.

In order to extract the crucial features from the larger pool of features generated using Pfeature, we used the SVC-L1-based feature selection technique from the Scikit-learn package. The SVC-L1-based method implements the support vector classifier (SVC) with a linear kernel, penalized with L1 regularization [32]. This technique identifies the important features in a high-dimensional feature set. SVC-L1 is also fast compared with other available feature selection techniques. Using this method, we selected 95 important features from the pool of 9165 features. Next, the feature-selector tool was used to rank these 95 key features by their performance in classifying IL-13 inducing and non-inducing peptides.
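The pre-processing rules described above (keep peptides of 8 to 35 residues, drop duplicates) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' actual pipeline: the function name `preprocess`, the upper-casing of sequences, and the toy peptides are assumptions made for the example.

```python
def preprocess(peptides, min_len=8, max_len=35):
    """Keep unique peptides whose length lies in [min_len, max_len].

    The 8 and 35 bounds come from the text; normalising case before
    comparing sequences is an assumption of this sketch.
    """
    seen = set()
    kept = []
    for raw in peptides:
        seq = raw.strip().upper()
        # Drop peptides shorter than min_len or longer than max_len.
        if not (min_len <= len(seq) <= max_len):
            continue
        # Drop exact duplicates, keeping the first occurrence.
        if seq in seen:
            continue
        seen.add(seq)
        kept.append(seq)
    return kept


# Hypothetical toy input: one duplicate, one too short, one too long.
raw_peptides = ["ACDEFGHIK", "ACDEFGHIK", "ACD", "acdefghiklmnp", "A" * 36]
clean = preprocess(raw_peptides)
print(clean)  # ['ACDEFGHIK', 'ACDEFGHIKLMNP']
```

The same function could be applied separately to the positive and the negative peptide lists before feature generation.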
The feature-selector tool implements the Light Gradient Boosting Machine, a decision tree-based algorithm, and ranks each feature by the number of times it is used to split the data across all trees [33]. This tool generates top-ranked features (Supplementary Table S1) that were used to build machine learning prediction models for IL-13 inducing peptides.

We have implemented standard protocols to build, test and evaluate our prediction models. In order to develop a prediction method for classifying IL-13 inducing peptides, we have implemented diverse machine learning techniques. In this study, we have used various classifiers such as eXtreme Gradient Boosting (XGB), K-nearest neighbor (KNN), support vector machine (SVM), Gaussian naive Bayes (GNB), decision tree (DT), linear regression (LR), and random forest (RF). We used the Scikit-learn package for Python to build these machine learning prediction models [35]. The standard five-fold cross-validation protocol was followed: four folds of the data were used for training and the remaining fold for testing. In the performance measures, FP denotes false positives, FN false negatives, TP true positives and TN true negatives.

We have provided a user-friendly web interface named 'IL-13Pred' to predict IL-13 inducing and non-inducing peptides (https://webs.iiitd.edu.in/raghava/il13pred). The web server is easy to use and has four major modules named "Predict", "Design", "Protein Scan", and "Blast Scan". The front-end of the web interface was created using HTML5, JAVA, CSS3 and PHP scripts. This platform is compatible with most modern devices, including mobile phones, tablets, desktops and laptops. The standalone version of IL-13Pred can be downloaded from https://webs.iiitd.edu.in/raghava/il13pred/stand.html.

We have computed the amino acid composition for the positive and negative datasets independently.
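As a rough illustration of this comparison, the sketch below computes the percent composition of each residue per peptide and averages it over a set of peptides. It is not the Pfeature implementation; the helper names `aac` and `mean_aac` and the toy sequences are hypothetical, and only the 20 standard amino acids are counted.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(seq):
    """Percent composition of each standard residue in one peptide."""
    seq = seq.upper()
    n = len(seq)  # assumes a non-empty peptide
    return {aa: 100.0 * seq.count(aa) / n for aa in AMINO_ACIDS}


def mean_aac(peptides):
    """Average the per-peptide compositions over a set of peptides."""
    totals = {aa: 0.0 for aa in AMINO_ACIDS}
    for seq in peptides:
        comp = aac(seq)
        for aa in AMINO_ACIDS:
            totals[aa] += comp[aa]
    return {aa: totals[aa] / len(peptides) for aa in AMINO_ACIDS}


# Hypothetical toy sequences, not taken from the IEDB dataset.
inducers = ["LLKKMNLLKK", "KLMNKLMNKL"]
non_inducers = ["GGPPYYTTGG", "GPYTGPYTGP"]

pos = mean_aac(inducers)
neg = mean_aac(non_inducers)

# Residues whose average composition is higher in the positive set.
enriched = [aa for aa in AMINO_ACIDS if pos[aa] > neg[aa]]
print(enriched)  # ['K', 'L', 'M', 'N']
```

Averaging per-peptide percentages (rather than pooling all residues) gives every peptide the same weight regardless of its length; which convention a given tool uses is worth checking before comparing numbers.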
The average amino acid composition for IL-13 inducing and non-inducing peptides is shown in Figure 3. The average composition of L, K, M and N residues is higher in IL-13 inducing peptides, whereas G, P, Y and T residues are abundant in non-inducing peptides.

In order to check the robustness of the developed models, we reshuffled the dataset 10 times and calculated the performance of the models on the top 10 features. Table 4 reports the mean and standard deviation of the performance obtained after reshuffling the data ten times. We observed that the performance of the models is maintained even after reshuffling, signifying the robustness of the developed models. The XGB-based model shows a trend similar to that in Table 3, with an AUC of 0.82 ± 0.03.

Recent studies have reported the role of elevated IL-13 levels in COVID-19 patients [26-28]. Coronavirus proteins massively induce the release of this immunoregulatory cytokine. In this study, we have made a systematic attempt to highlight contrast among IL-13 inducing

However, a similar trend for these amino acids is depicted in Figure 3, where Arginine and Asparagine are preferred in IL-13 inducing peptides as compared to non-inducing peptides. In a given sequence, the composition and position of amino acids play a vital role in changing the properties and role of the proteins in variant strains, which is also reflected in Table 6. The complete results for the three variant spike protein strains are provided in the Supplementary Tables (S6-S8).

and a negative dataset of 2908 IL-13 non-inducing unique peptides, were used in this study. Amino acid composition and positional analyses were performed to show the contrast between IL-13 inducing and non-inducing peptides. From the compositional analysis, we observe that leucine, lysine, methionine and asparagine residues are preferred in IL-13 inducing peptides.
However, glycine, proline, tyrosine and threonine residues are abundantly found in non-inducing peptides. On the other hand, the positional analysis shows that leucine is preferred at the 2nd, 5th, 7th, 9th, 11th and 16th positions, whereas asparagine is mostly preferred at

The efficacy of an epitope/peptide to induce IL-13 and alter the immune response towards a disease state makes it of great importance in immunotherapy and vaccine design. Although inducing an IL-13 response in patients is a very complex process that depends on various factors, an epitope/peptide is still a promising alternative when designing a vaccine or immunotherapy against any disease. Studies in the literature have shown that antibodies that block IL-13 receptors could be used effectively in vaccine design [36, 37]. Thus, IL-13 induced immunosuppression could serve as a crucial step in vaccine subunit design. Although diverse in silico methods are available for T cell epitope prediction, no computational method existed for predicting IL-13 inducing epitopes/peptides. The present study is an organized attempt in this direction, providing a user-friendly platform/web server. We encourage the scientific community to use IL-13Pred for developing efficient immunotherapy and vaccine candidates by identifying them a priori as IL-13 inducing epitopes.
[1] Interleukin 13, a T-cell-derived cytokine that regulates human monocyte and B-cell function
[2] Interleukin-13 is a new human lymphokine regulating inflammatory and immune responses
[3] IL-13 effector functions
[4] Interleukin 13 and its role in gut defence and inflammation
[5] Interleukin 13: a growth factor in hodgkin lymphoma
[6] Suppression of an IL-13 autocrine growth loop in a human Hodgkin/Reed-Sternberg tumor cell line by a novel IL-13 antagonist
[7] Role of IL-13 in regulation of anti-tumor immunity and tumor growth
[8] Association between IL13 gene polymorphisms and susceptibility to cancer: a meta-analysis
[9] Endogenously Expressed IL-4Rα Promotes the Malignant Phenotype of Human Pancreatic Cancer In Vitro and In Vivo
[10] Possible Roles of Interleukin-4 and -13 and Their Receptors in Gastric and Colon Cancer
[11] Interleukin-4 and interleukin-13 increase NADPH oxidase 1-related proliferation of human colon cancer cells
[12] Association of IL4, IL13, and IL4R polymorphisms with gastrointestinal cancer risk: A meta-analysis
[13] Immune surveillance: a balance between protumor and antitumor immunity
[14] Alternative activation of tumor-associated macrophages by IL-4: priming for protumoral functions
[15] Receptor for interleukin 13 is a marker and therapeutic target for human high-grade gliomas
[16] Interleukin-13 in the pathogenesis of pulmonary artery hypertension
[17] New Role for Interleukin-13 Receptor α1 in Myocardial Homeostasis and Heart Failure
[18] Interleukin-13 signaling and its role in asthma
[19] Role of interleukin-13 in asthma
[20] Lebrikizumab treatment in adults with asthma
[21] A Critical Evaluation of Anti-IL-13 and Anti-IL-4 Strategies in Severe Asthma
[22] IL-13 selectively induces vascular cell adhesion molecule-1 expression in human endothelial cells
[23] IL-12/IL-13 axis in allergic asthma
[24] Effects of Th2 cytokines on chemokine expression in the lung: IL-13 potently induces eotaxin expression by airway epithelial cells
[25] Sensory Neurons Co-opt Classical Immune Signaling Pathways to Mediate Chronic Itch
[26] IL-13 is a driver of COVID-19 severity
[27] Clinical features of patients infected with 2019 novel coronavirus in Wuhan
[28] Plasma IP-10 and MCP-3 levels are highly associated with disease severity and predict the progression of COVID-19
[29] The Immune Epitope Database (IEDB): 2018 update
[30] Computer-aided prediction and design of IL-6 inducing peptides: IL-6 plays a crucial role in COVID-19
[31] Computing wide range of protein/peptide features from their sequence and structure
[32] Feature Selection for Classification: A Review
[33] LightGBM: A Highly Efficient Gradient Boosting Decision Tree
[34] Computer-aided designing of immunosuppressive peptides based on IL-10 inducing potential
[35] Scikit-learn: Machine Learning in Python
[36] Double-Blind, Placebo-Controlled, Dose-Escalation First-in-Human Study
[37] Blockade of interleukin-13-mediated cell activation by a novel inhibitory antibody to human IL-13 receptor alpha1

AD is thankful to the Department of Science and Technology (DST-INSPIRE) and SP is thankful to the Department of Biotechnology (DBT) for providing senior research fellowships. SJ, AD and SP are thankful to the Department of Computational Biology, IIITD New Delhi for infrastructure and facilities.

All the datasets generated for this study are either included in this article or available at https://webs.iiitd.edu.in/raghava/il13pred/dataset.php.

The authors declare no competing financial or non-financial interests.

SJ, AD, and SP collected and processed the datasets. SJ, AD, SP and GPSR implemented the algorithms. SJ, AD, and SP developed the prediction models. SJ, AD, SP and GPSR analysed the results. SP, SJ and AD created the back-end of the web server and front-end user interface. SJ, AD, SJ and GPSR penned the manuscript. GPSR conceived and coordinated the project, and gave overall supervision to the project. All authors have read and approved the final manuscript.