key: cord-0011417-flyno6vj authors: Sayed, Sifat Bin; Nain, Zulkar; Khan, Md. Shakil Ahmed; Abdulla, Faruq; Tasmin, Rubaia; Adhikari, Utpal Kumar title: Exploring Lassa Virus Proteome to Design a Multi-epitope Vaccine Through Immunoinformatics and Immune Simulation Analyses date: 2020-01-02 journal: Int J Pept Res Ther DOI: 10.1007/s10989-019-10003-8 sha: 690c5014566c7780c32cf5d558782ffe43b97f12 doc_id: 11417 cord_uid: flyno6vj Lassa virus (LASV) is responsible for a type of acute viral haemorrhagic fever referred to as Lassa fever. Lack of adequate treatment and preventive measures against LASV resulted in a high mortality rate in its endemic regions. In this study, a multi-epitope vaccine was designed using immunoinformatics as a prophylactic agent against the virus. Following a rigorous assessment, the vaccine was built using T-cell (N(CTL) = 8 and N(HTL) = 6) and B-cell (N(LBL) = 4) epitopes from each LASV-derived protein in addition with suitable linkers and adjuvant. The physicochemistry, immunogenic potency and safeness of the designed vaccine (~ 68 kDa) were assessed. In addition, chosen CTL and HTL epitopes of our vaccine showed 97.37% worldwide population coverage. Besides, disulphide engineering also improved the stability of the chimeric vaccine. Molecular docking of our vaccine protein with toll-like receptor 2 (TLR2) showed binding efficiency followed by dynamics simulation for stable interaction. Furthermore, higher levels of cell-mediated immunity and rapid antigen clearance were suggested by immune simulation and repeated-exposure simulation, respectively. Finally, the optimized codons were used in in silico cloning to ensure higher expression within E. coli K12 bacterium. With further assessment both in vitro and in vivo, we believe that our proposed peptide-vaccine would be potential immunogen against Lassa fever. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1007/s10989-019-10003-8) contains supplementary material, which is available to authorized users. Lassa virus (LASV) is an emerging viral pathogen that belongs to the Arenaviridae family and can cause a severe viral haemorrhagic fever known as Lassa fever, with a 20% fatality rate (Charrel and De Lamballerie 2003). Lassa fever is an acute viral zoonotic disease that affects a large number of people (~ 500,000) and causes 5000 deaths annually in Western Africa (Ogbu et al. 2007). In addition, people in Ghana (~ 10%), Côte d'Ivoire (~ 30%), Nigeria (~ 40%), Guinea (~ 50%), Sierra Leone and Liberia (~ 80%) and a few areas of Mali are assumed to be affected by Lassa fever (Fichet-Calvet and Rogers 2009; Safronetz et al. 2010). About 200 million people in West African regions (i.e., Nigeria and Senegal) are at high risk of a LASV outbreak (Charrel and De Lamballerie 2003). Moreover, it has also affected parts of Europe such as the United Kingdom (Kitching et al. 2009), the Netherlands (WHO 2000) and Germany (Haas et al. 2003). Although the virus has been shown to primarily target antigen-presenting cells (mainly dendritic cells and macrophages) and endothelial cells and to interfere with their complete maturation and activation, the pathogenesis of Lassa fever is not yet clearly understood (Hallam et al. 2018; Oti 2018). Given the high annual incidence and mortality rate, the development of an effective LASV vaccine is an urgent necessity. LASV is endemic to West Africa. It is an enveloped virus with a bisegmented, single-stranded, negative-sense RNA genome that follows an ambisense coding strategy and consists of a large (L) segment and a small (S) segment (Oti 2018).
The large or L segment of the RNA encodes the 200 kDa RNA polymerase (L) protein and the small ring-finger protein (matrix protein or Z-protein, 11 kDa) that regulate replication and transcription (Cornu and de la Torre 2001; Djavani et al. 1997). The small segment encodes the nucleoprotein (NP, 63 kDa) and the surface glycoprotein precursor (GP, 75 kDa); the latter is proteolytically cleaved into GP1 and GP2 (envelope glycoproteins), which bind to the alpha-dystroglycan receptor and mediate entry into the host cell (Cao et al. 1998; Oti 2018). LASV is transmitted to humans through the rodent reservoir Mastomys natalensis, a typical African rat lurking in village houses (Bonner et al. 2007). Recent evidence, however, indicates that other rodent species may also host LASV, such as the African wood mouse Hylomyscus pamfi (Nigeria) and the Guinea mouse M. erythroleucus (Nigeria and Guinea) (Hallam et al. 2018). Transmission of LASV occurs when a healthy individual comes into contact with the blood, secretions, tissue or excreta of an infected person, or with food contaminated with the host excreta. However, skin-to-skin contact without exchange of blood or fluids cannot transmit the virus (Keenlyside et al. 1983). Children under ten years old are considered the most vulnerable to LASV. For instance, a study showed 15% seropositivity in the under-aged population in West Africa (Kernéis et al. 2009). In addition, Lassa fever in pregnant patients results in spontaneous abortion (Price et al. 1988). Ribavirin, an antiviral drug, has been found to be effective in the initial phase of Lassa fever and can reduce the fatality rate (Jahrling et al. 1980; McCormick et al. 1986). However, its potential toxicity and teratogenicity when used in the later stage of the disease lead us to conclude that ribavirin is not a potent therapeutic against Lassa fever (Fisher-Hoch et al. 1992; Kochhar 1990).
Peptide vaccines are immune stimulants where fragments of virus-derived proteins mimic natural pathogens; hence, more influential in terms of safety, efficacy and specificity (Skwarczynski and Toth 2016) . In 1985, the first epitope-based vaccine was developed using cholera toxin against E. coli (Jacob et al. 1985) . Furthermore, peptide vaccines against many pathogenic agents (i.e., HIV, malaria, swine fever, influenza, anthrax, etc.) are promptly under development (Li et al. 2014) . Present study demonstrates the screening of whole LASV proteome followed by the grouping of viral proteins. Each protein group was evaluated separately for the identification of T-cell and B-cell epitopes along with their respective MHC alleles using vaccinomics. Subsequently, a vaccine was designed using the most persuasive epitopes from each protein with suitable adjuvant and linkers. The primary sequence was used for physicochemical analysis and immunogenic profiling, followed by the secondary and tertiary structure predictions. The predicted three-dimensional (3D) structure was applied for refinement and validation. Besides, disulphide bridging was done to improve structural stability. The binding affinity and interactions between the vaccine protein and the receptor were calculated by molecular docking and dynamics simulation, respectively. Codon optimization and in silico cloning were taken care of for the evaluation of the expression of chimeric protein within the appropriate host. Finally, an immune simulation was performed to estimate the immunogenic potency in real-life. The whole proteome of the Lassa mammarenavirus was retrieved from the ViPR database (Virus Pathogen Database and Analysis Resource), an integrated robust database for several virus families and their respective species (Pickett et al. 2012) . 
Initially, the proteins from LASV proteome were isolated and classified as glycoprotein, L-protein, matrix protein, nucleocapsid protein, nucleoprotein, polymerase, ring-finger protein, Z-protein. As a measure of an immune response, the structural proteins were then applied for antigenicity prediction using the Vaxijen v2.0 server with a 0.5 threshold value (Doytchinova and Flower 2007) . This server uses auto and cross-covariance (ACC) transformation method to maintain 70-89% prediction accuracy. A protein with the best antigenic score was chosen from each class of structural proteins. The selected protein was submitted to the NetCTL v1.2 server for the prediction of CTL epitopes (9-mer) for 12 supertypes (i.e., A1, A2, A3, A24, A26, B7, B8, B27, B39, B44, B58, and B62). This server predicts CTL epitopes based on three criteria, namely, MHC-I binding peptides, C-terminal cleavage, and TAP transport efficiency. MHC-I binding and C-terminal cleavage are obtained by using artificial neural networks whereas TAP transporter efficiency is evaluated by the weight matrix (Larsen et al. 2007) . In this study, the threshold was set to 0.5 which has a sensitivity and specificity of 0.89 and 0.94, respectively. Furthermore, MHC-I binding alleles for each CTL epitope were predicted using the consensus method in the IEDB analysis tool (Moutaftsi et al. 2006) . The human was selected as the source species, and percentile rank ≤ 5 was considered since lower score indicates higher affinity. Firstly, selected CTL epitopes were rechecked for antigenicity to ensure their ability to induce immune response with VaxiJen v2.0 server. Further, they were also evaluated for immunogenicity with MHC-I immunogenicity tool of IEDB server (Calis et al. 2013) . Vaccine components should be free from an allergic reaction. So, AllergenFP v1.0 server was used for allergenicity prediction. This server can recognize both allergens and non-allergens with 88% accuracy (Dimitrov et al. 2014b) . 
Furthermore, toxic epitopes should be eliminated as they could compromise the functionality of the vaccine construct. Therefore, we used the ToxinPred server to sort out the toxic CTL epitopes (Gupta et al. 2013). Helper T-lymphocyte (HTL) responses play an essential role in the induction of both humoral and cellular immune responses. Therefore, HTL epitopes are likely to be a significant element of preventive and immunotherapeutic vaccines. The IEDB MHC-II binding tool was applied to predict 15-amino-acid-long HTL epitopes using the NN-align method (Nielsen and Lund 2009). A percentile rank was generated by comparing each peptide's binding affinity with a comprehensive set of randomly selected peptides from the Swiss-Prot database. A percentile rank ≤ 5 was also used as the cut-off for this analysis (Paul et al. 2016). Helper T-cells help to activate the innate immune system, B-lymphocytes, cytotoxic T-cells and other immune cells, and release different types of cytokines, i.e., interferon-gamma (IFN-γ), interleukin-4 (IL-4) and interleukin-10 (IL-10). HTL epitopes thus have the ability to overcome the proinflammatory response and diminish tissue damage (Luckheeram et al. 2012). Therefore, cytokine-inducing HTL epitopes are essential in vaccine development. We used the IFNepitope server for the prediction of IFN-γ-inducing HTL epitopes using a hybrid method (Motif and SVM) along with the IFN-gamma versus non-IFN-gamma model (Dhanda et al. 2013a, b). In addition to IFN-gamma, IL-4- and IL-10-inducing properties were also evaluated with the IL4pred and IL10pred servers, respectively (Dhanda et al. 2013a, b; Nagpal et al. 2017). Identification of B-cell epitopes (BCEs) is a fundamental step in epitope-based vaccine development, as B-cells produce the antibodies that provide humoral immunity. Therefore, LBL epitopes were predicted using the iBCE-EL server. It uses a method that combines gradient boosting algorithms with an extremely randomized tree method (Manavalan et al. 2018).
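The CTL screening steps described above (antigenicity at the VaxiJen threshold of 0.5, immunogenicity, allergenicity and toxicity) amount to a chain of filters. The following is a minimal Python sketch of that logic only. The `Epitope` record, its field names, the rule that an immunogenicity score above zero counts as immunogenic, and all peptides and scores are hypothetical placeholders; no prediction server is called and none of the values come from this study.

```python
from dataclasses import dataclass

ANTIGENICITY_THRESHOLD = 0.5  # VaxiJen v2.0 threshold stated in the text


@dataclass(frozen=True)
class Epitope:
    sequence: str
    antigenicity: float    # VaxiJen-style score (placeholder values below)
    immunogenicity: float  # IEDB-style score; > 0 treated as immunogenic (assumption)
    allergen: bool         # AllergenFP-style call
    toxic: bool            # ToxinPred-style call


def passes_screen(e: Epitope) -> bool:
    """Keep epitopes that are antigenic, immunogenic, non-allergenic and non-toxic."""
    return (
        e.antigenicity >= ANTIGENICITY_THRESHOLD
        and e.immunogenicity > 0
        and not e.allergen
        and not e.toxic
    )


def screen(candidates):
    """Apply the filter chain and preserve the input order."""
    return [e for e in candidates if passes_screen(e)]


# Hypothetical 9-mer candidates, for illustration only.
candidates = [
    Epitope("ACDEFGHIK", 0.91, 0.12, False, False),   # passes every filter
    Epitope("LMNPQRSTV", 0.42, 0.30, False, False),   # below antigenicity threshold
    Epitope("WYACDEFGH", 0.77, -0.05, False, False),  # not immunogenic
    Epitope("IKLMNPQRS", 0.88, 0.20, True, False),    # predicted allergen
]
selected = screen(candidates)
print([e.sequence for e in selected])
```

The same pattern extends to HTL epitopes by adding a percentile-rank field and a `<= 5` condition.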
The predicted LBL epitopes were then further assessed to check their antigenicity, allergenicity and toxicity with VaxiJen v2.0, AllergenFP v1.0 and ToxinPred server, respectively. Different HLA alleles, as well as their expression, are sensationally distributed at various frequencies in diverse ethnicities (Adhikari and Rahman 2017) . Hence, the HLA-alleles distribution among the world population is crucial for successful multi-epitope vaccine development. In this study, the IEDB population coverage analysis tool was used for the analysis of the population coverage of the potential CTL and HTL epitopes and their MHC binding alleles ). To construct multi-epitope vaccine, finally selected CTL, HTL, and LBL epitopes were linked together with the help of AAY, GPGPG and KK linkers, respectively (Gu et al. 2017; Nain et al. 2019) . Peptides used for the vaccine construction are generally poorly immunogenic when used alone, therefore, requires adjuvants to boost up the immune response (Li et al. 2014) . Consequently, the amino acid sequence of OmpA protein (GenBank: AFS89615.1) was chosen as an adjuvant and linked ahead of the first CTL epitope through EAAAK linker (Arai et al. 2001 ). Using linker for joining of two epitopes is required for the effective functioning of each epitope (Nezafat et al. 2014 ). As vaccine protein should be highly antigenic, antigenicity prediction is necessary. The designed vaccine construct was analysed using Vaxijen v2.0 (Doytchinova and Flower 2007) and cross-checked with ANTIGENpro server (Magnan et al. 2010) . Vaxijen is free from any alignment and work based on various physicochemical properties of the protein, whereas ANTIGENpro server has been developed based on microarray analysis data. Screening for allergenicity is important as it indicates the potentiality of a vaccine construct to cause sensitization and allergic reaction. We used AllergenFP v1.0 (Dimitrov et al. 2014b ) and AllerTOP v2.0 (Dimitrov et al. 
2014a ) server to check the allergenicity of the vaccine protein. Inducing an immune response after injecting the vaccine into the body is the sole purpose of vaccination. Therefore, the assessment of various physicochemical properties of the chimera protein is essential. The primary protein sequence of the vaccine was used to predict the various physiochemical features through ProtParam (Wilkins et al. 1999 ) web-server. Furthermore, the solubility of the vaccine protein upon overexpression in E. coli was predicted by SOLpro (Magnan et al. 2009 ) tool in the SCRATCH suite. PSIPRED v4.0 server was used for the prediction of vaccine's secondary structure. Two feed-forward neural networks are the basis of PSIPRED, which process the PSI-BLAST (Position-Specific Iterated-Blast) to predict a secondary structure from an amino acid sequence as an input (Buchan et al. 2013) .The tertiary structure of the final subunit vaccine was predicted using the RaptorX server, which is based on three steps; specifically single-template threading, multiple-template threading and alignment quality prediction. The server predicts 3D protein model and provides some confidence scores to evaluate the quality of predicted models. P-value, GDT, uGDT, modelling error at each residue etc. are different confidence scores that provide a clear indication to the relative good model (Källberg et al. 2017 ). The predicted model of the vaccine protein was refined through GalaxyRefine web-server. This server initially rebuilds sidechains, then sidechain repacking and consequently uses the molecular dynamics simulation for overall structural relaxation (Heo et al. 2013) . Structural validation is a process to identify potential errors in the predicted tertiary structure . Therefore, ProSA-web was used for structural validation, which provides an overall quality score for the input structure. 
If the calculated score falls outside the range characteristic of native proteins, then the structure probably contains errors (Wiederstein and Sippl 2007). The ERRAT server was also used to analyse non-bonded atom-atom interactions (Colovos and Yeates 1993). Finally, a Ramachandran plot was obtained using the PROCHECK server. The Ramachandran plot is a way to visualize energetically allowed and disallowed dihedral angles psi (ψ) and phi (ϕ) of amino acids and is calculated based on the van der Waals radius of the side chain. The result from PROCHECK includes the percentage and number of residues in the most favoured, additional allowed, generously allowed, and disallowed regions, which defines the quality of the modelled structure (Laskowski et al. 1993). A conformational B-cell epitope is a collection of amino acid residues on the three-dimensional geometry of the vaccine protein that interacts directly with the immune receptor. Therefore, the ElliPro tool of the IEDB server was used to determine the presence of conformational B-cell epitopes in the validated tertiary structure. ElliPro implements three algorithms that approximate the protein structure as an ellipsoid, calculate the residue protrusion index (PI), and cluster neighbouring residues based on their PI values. ElliPro provides an average PI value over the epitope residues for each generated epitope. For each epitope residue, the PI value is calculated based on the residue mass centre lying outside the largest possible ellipsoid (Ponomarenko et al. 2008). Disulphide bonds are covalent interactions that stabilize molecular interactions and provide considerable stability by maintaining precise geometric conformations. Disulphide engineering is a novel approach for creating disulphide bonds in the target protein structure. Therefore, disulphide engineering was executed with the Disulphide by Design v2.12 web tool.
Initially, the refined protein model was uploaded and searched for residue pairs that could be used for disulphide engineering. Potential residue pairs were selected for mutation, and cysteine was used as the target residue for disulphide engineering (Craig and Dombkowski 2013). Codon optimization is essential, as unadapted codons may lead to a low expression rate in the host. Codon optimization was performed using the Java Codon Adaptation Tool (JCat) server with a view to improving the translational efficiency in E. coli K12 strain (Grote et al. 2005). Three additional options were selected to avoid rho-independent transcription terminators, prokaryotic ribosome binding sites, and restriction enzyme cleavage sites. The codon adaptation index (CAI) value and GC content of the adapted sequence were obtained and compared with the ideal range (Sharp and Li 1987). Subsequently, the resulting nucleotide sequence was cloned into the E. coli pET28a(+) vector using the SnapGene v4.2 tool. Molecular docking is a computational method that models the interaction between a ligand molecule and a receptor molecule to predict a stable complex, and provides a calculated score as a measure of the degree of binding interaction (Lengauer and Rarey 1996). Toll-like receptor 2 (TLR2) can mediate high proinflammatory responses against LASV infection (Hayes and Salvato 2012). Therefore, the TLR2 structure (PDB ID: 3A7B) was used as the receptor and the refined vaccine protein as the ligand (Berman et al. 2002). Finally, the binding affinity between the multi-epitope vaccine and the TLR2 receptor was calculated through the ClusPro v2.0 server (Kozakov et al. 2017). This server completes the task in three consecutive steps: rigid body docking, clustering of the lowest energy structures, and structural refinement by energy minimization. The best-docked complex was selected based on the lowest energy score and docking efficiency.
A molecular dynamics study is essential for checking the stability of the protein-protein complex in any in silico analysis. Protein stability can be determined by comparing essential protein dynamics to their normal modes (Van Aalten et al. 1997; Wüthrich et al. 1980). The iMODS server was used to explain the collective protein motion in internal coordinates through normal mode analysis (NMA) (López-Blanco et al. 2014). The server estimated the direction and extent of the intrinsic motions of the complex in terms of deformability, eigenvalues, B-factors, and covariance. The deformability of the main chain depends on whether a specified molecule can deform at each of its residues. The eigenvalue of each normal mode describes the stiffness of the motion. This value is directly related to the energy required for the structural deformation: the lower the eigenvalue, the easier the deformation. To estimate the immunogenic potential of the final vaccine, in silico immune simulations were conducted using the C-ImmSim server. This immune simulator uses a position-specific scoring matrix (PSSM) and machine learning techniques for the prediction of epitopes and immune interactions, respectively (Rapin et al. 2010). The minimum suggested interval between dose 1 and dose 2 is 4 weeks for most vaccines in current use (Castiglione et al. 2012). All parameters were set at default, with the injection time steps set at 1, 84, and 168, where each time step is equal to 8 h. Therefore, three injections were given four weeks apart. Moreover, six doses of the designed vaccine were given in the same manner to simulate the repeated exposure to the antigen seen in a typical endemic area and to probe for clonal selection.
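The link between the simulation time steps and the four-week dosing interval is simple arithmetic. As a small illustrative sketch, the Python snippet below converts the time steps quoted above (1, 84 and 168, at 8 h per step) into days elapsed since the first injection. The function name `step_to_days` is ours and is not part of C-ImmSim.

```python
HOURS_PER_STEP = 8  # each simulation time step equals 8 h (stated in the text)


def step_to_days(step: int, first_step: int = 1) -> float:
    """Days elapsed between the first injection step and the given step."""
    return (step - first_step) * HOURS_PER_STEP / 24


injection_steps = [1, 84, 168]  # time steps quoted in the text
days = [step_to_days(s) for s in injection_steps]

# Gap in days between consecutive injections.
intervals = [later - earlier for earlier, later in zip(days, days[1:])]

for step, day in zip(injection_steps, days):
    print(f"step {step:>3}: day {day:.1f}")
print("intervals (days):", [round(i, 1) for i in intervals])
```

Both gaps come out at roughly 28 days, which agrees with the stated schedule of three injections four weeks apart.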
For the construction of candidate vaccine, a total of 1665 protein sequences of different structural (glycoprotein, matrix protein, nucleocapsid protein, nucleoprotein, ring-finger protein and Z-protein) and non-structural (L-protein, Polymerase, RNA directed RNA-polymerase) protein sequences of the LASV were retrieved from ViPR database. Structural protein enables viruses to invade and assemble viral particles in the host, while non-structural proteins secrete various enzymes that assist in viral replication and development of structural proteins. Vaxijen v2.0 server revealed the highest antigenic protein from each type and as we set a threshold of 0.5, only glycoprotein, matrix protein, ring-finger protein and Z-protein showed antigenicity (Table 1) . A total of 180 unique CTL epitopes (9-mer) were predicted from the four LASV highest antigenic proteins. Here 97, 22, 32 and 29 CTL epitopes were predicted from the glycoprotein, matrix protein, ring-finger protein and Z-protein, respectively, using the NetCTL v1.2 server. Among them, only 42 epitopes were found as antigenic, immunogenic, and non-toxic (Supplementary Table 1 ). Of the 42 epitopes, 22 were found to be non-allergenic. These non-allergenic epitopes were further used to predict their MHC-I binding alleles using MHC-I allele prediction tool of the IEDB server. Similarly, a total of 72 unique HTL epitopes (15-mer) and their MHC-II binding molecules were predicted using the IEDB MHC-II prediction tool. The cytokine (i.e., IFN-γ, IL-4 and IL-10) inducing ability of these HTL epitopes were also evaluated (Supplementary Table 2 ). B-cell epitopes are antigenic regions of a protein that can trigger antibody formation. The iBCE-EL tool was used to predict linear B lymphocyte (LBL) from the LASV proteins. 
We found a total of 101 LBL epitopes and after evaluation, only 37 unique epitopes (14, 5, 9 and 9 LBL epitopes were found from the glycoprotein, matrix protein, ring-finger protein and Z-protein, respectively) were found to be non-allergenic and non-toxic and considered for vaccine construction (Supplementary Table 3 ). For multi-epitope vaccine designing, we have considered highly antigenic CTL epitopes from every type of proteins that are immunogenic, non-allergenic and non-toxic (Table 2) . On the other hand, at HTL epitope selection, we screened cytokine-inducing properties and found that only two epitopes from the glycoprotein have the capacity to induce all three type of cytokine, while epitopes from the other proteins were positive for a maximum of two cytokines. Hence, we selected two epitopes from the glycoprotein and epitopes having inducing feature for at least two cytokines were selected in case of other proteins (Table 3) . As we get a numerous number of B-cell epitopes with higher antigenicity, non-toxicity and non-allergenicity, we took the epitope with the best probability score (obtained from iBCE-EL tool) from each type of protein (Table 4) . Therefore, 6 CTL, 8 HTL and 4 LBL epitopes are merged by AAY, GPGPG and KK linkers, respectively (Fig. 1) . OmpA agonist (GenBank ID: AFS89615.1), which is 352 amino acid residues long, was used as an adjuvant for TLR2 receptor using EAAAK linker. The final vaccine construct comprises 642 amino acid residues. The distribution of HLA allele varies between different geographical and ethnic regions around the globe. Therefore, population coverage during the development of an efficient vaccine must be taken into account. In this study, selected CTL and HTL epitopes, which were used to construct the vaccine and their corresponding HLA alleles (Supplementary Tables 4 and 5) were obtained for population coverage analysis both individually and in combination. 
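The linker scheme described above (adjuvant joined by EAAAK, CTL epitopes joined by AAY, HTL epitopes by GPGPG and LBL epitopes by KK) can be sketched as plain string concatenation. The snippet below is a minimal illustration, not the authors' procedure. The choice of linker at the junctions between epitope blocks (GPGPG before the HTL block, KK before the LBL block) is an assumption, since the text does not state it, and the sequences are toy placeholders rather than the OmpA adjuvant or the selected epitopes.

```python
EAAAK, AAY, GPGPG, KK = "EAAAK", "AAY", "GPGPG", "KK"


def assemble_construct(adjuvant, ctl, htl, lbl):
    """Order: adjuvant, EAAAK, CTL block, HTL block, LBL block.

    Within each block, epitopes are joined by that block's linker.
    Between blocks, the linker of the following block is used (assumption).
    """
    ctl_block = AAY.join(ctl)
    htl_block = GPGPG.join(htl)
    lbl_block = KK.join(lbl)
    return adjuvant + EAAAK + ctl_block + GPGPG + htl_block + KK + lbl_block


# Toy placeholder sequences, for illustration only.
toy = assemble_construct(
    adjuvant="M",
    ctl=["CCC", "DDD"],
    htl=["HHH"],
    lbl=["LL", "NN"],
)
print(toy, len(toy))
```

With the real adjuvant and epitope sequences, the length of the returned string can be compared with the reported construct length as a consistency check.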
Our selected CTL and HTL epitopes were found to cover 91.74% and 68.15% of the world population, respectively, with a combined coverage of 97.37% (Fig. 2). Multiple physicochemical properties were calculated with the ProtParam server by submitting the whole vaccine construct as an amino acid sequence. The molecular weight of the construct was calculated as ~ 68 kDa, and the antigenicity prediction showed that the construct has good antigenic properties. The analysis gave a pI (isoelectric point) value of 9.31, which indicates that the vaccine construct is basic in nature. The instability index (II) was computed to be 29.05, which implies that the construct will remain stable after expression. The aliphatic index was calculated as 84.98, which indicates the construct's thermostability. The grand average of hydropathicity (GRAVY) was calculated to be negative (− 0.118). This negative value indicates the hydrophilic nature of the protein; this protein therefore tends to interact better with other proteins. The estimated half-life in mammalian reticulocytes (in vitro) was found to be 30 h, while in yeast and Escherichia coli the estimated half-lives (in vivo) are > 20 h and > 10 h, respectively. The immunogenic appraisal also revealed that our vaccine construct is highly antigenic and non-allergenic, and it showed a high solubility rate as calculated through the SOLpro server. These evaluations suggested that our vaccine construct might be an ideal vaccine against LASV (Table 5). The PSIPRED v4.0 workbench was used to predict the secondary structure of the final vaccine construct. In the final vaccine construct (642 amino acids long), 352 amino acids are involved in random coil formation, 149 amino acids are involved in α-helix formation, and only 141 amino acids form β-strands. Overall, the secondary structure prediction indicates that 54.83% of the residues are random coils, 23.21% form α-helices and 21.96% are β-strands (Fig. 3).
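Two of the figures reported above can be checked with short arithmetic. The Python sketch below first combines the CTL and HTL coverage values under an independence assumption, which is our simplification: the IEDB tool works from HLA genotypic frequencies, so agreement with the 97.37% quoted in the abstract is only a plausibility check. It then recomputes the secondary-structure percentages from the residue counts.

```python
def combined_coverage(*coverages):
    """Fraction covered by at least one set, assuming independence (simplification)."""
    missed = 1.0
    for c in coverages:
        missed *= 1.0 - c
    return 1.0 - missed


# Individual coverage values reported in the text.
ctl_coverage, htl_coverage = 0.9174, 0.6815
combined = combined_coverage(ctl_coverage, htl_coverage)
print(f"combined coverage: {100 * combined:.2f}%")

# Residue counts per secondary-structure class reported in the text.
counts = {"coil": 352, "helix": 149, "strand": 141}
total = sum(counts.values())
percent = {name: round(100 * n / total, 2) for name, n in counts.items()}
print(total, percent)
```

The counts sum to the 642 residues of the construct, and the recomputed percentages agree with the reported 54.83%, 23.21% and 21.96%.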
The tertiary structure of the multiepitope vaccine was modelled from the RaptorX server. In RaptorX server, 100% (642) amino acid residues were modelled as six domains. The protein structure (PDB ID: 1bxwA) was used as the best template for modelling. The relative quality of the modelled structure was evaluated by P-value. Calculated P-value for the modelled structure was 5.39 × 10 −06 which is very low. Here, lower the P-value higher the quality of the model. Furthermore, uGDT is also used as a parameter for 3D structure evaluation and a construct with > 100 residues, uGDT > 50 is a good indicator. As we got 354 as uGDT score, it indicates the tertiary model as acceptable protein model for further analysis (Supplementary Table 7 ). The predicted tertiary (3D) structure was further refined using GalaxyRefine server, leads to an increase in the number of residues in the favoured region, generated five refined models. The refined best model showed 93.6% residues in the most favoured region in the Ramachandran plot, GDT-HA score 0.9276, RMSD 0.490, MolProbity 2.052, Clash score 13.2 and Poor rotamers 0.8 (Supplementary Table 8) , which indicate the quality of the refined model among all the five models after comparison and finally helped us to select the model for further studies (Fig. 4) . Validation of the refined tertiary structure was checked by using PROCHECK, ProSA-Web and ERRAT server. Ramachandran plot analysis of the crude structure, by PROCHECK server, revealed that 85.9% of the structure was located in the most favoured region, 11.1% in additional allowed regions, 2.0% in generously allowed regions and 1.0% of the residues were the in disallowed regions. Whereas, after refinement PROCHECK generated a better result, 89.7% of residues were located in the most favoured regions, 8.6% in additionally allowed, 1.0% in generously allowed and 0.8% of residues were found in the disallowed region (Fig. 5a) . 
ProSA-web and ERRAT were used to verify the quality of, and potential errors in, the crude 3D model. The selected best model after refinement had an overall quality factor of 78.2% with ERRAT, while ProSA-web gave a Z-score of − 4.23 for the input vaccine protein model, indicating that the model falls within the range of native protein conformations (Fig. 5b). Conformational B-cell epitopes were predicted using ElliPro, an online web server that predicts epitopes based on the tertiary structure. A total of 305 residues with scores varying from 0.566 to 0.944 were predicted to be located in eight conformational B-cell epitopes (Fig. 6, Table 6). The epitopes ranged in size from 4 to 93 residues. Disulphide engineering was performed using Disulfide by Design v2.12 to stabilize the modelled structure of the final vaccine construct. In total, 54 pairs of residues that could be used in disulphide engineering were found (Supplementary Table 9). However, only two residue pairs were selected after the evaluation of other parameters such as energy score and χ3 angle, as their values fall within the allowed range, i.e. the energy should be less than 2.2 kcal/mol and the χ3 angle is expected to be between − 87° and + 97° (Craig and Dombkowski 2013). Therefore, a total of four mutations were generated on the residue pairs. For the Ala11-Ala19 residue pair, the energy score is 0.98 kcal/mol and the χ3 angle is 85.64°, whereas for Cys325-Cys337 the χ3 angle and the energy were − 74.33° and 1.93 kcal/mol, respectively (Fig. 7). Expression of the LASV-derived vaccine protein in the E. coli expression system was the primary purpose of in silico cloning. Therefore, the codons of the subunit vaccine construct had to be adapted to the codon usage of the E. coli expression system. To optimize the codon usage of the vaccine construct in E. coli (strain K12) for maximal protein expression, the Java Codon Adaptation Tool (JCat) was used.
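The selection rule for disulphide candidates stated above (energy below 2.2 kcal/mol and χ3 angle between − 87° and + 97°) can be written as a small filter. The Python sketch below applies it to the two residue pairs reported in the text. The third entry is a hypothetical pair added only to show a rejection, and the function and variable names are ours, not part of Disulfide by Design.

```python
MAX_ENERGY = 2.2             # kcal/mol, upper limit stated in the text
CHI3_RANGE = (-87.0, 97.0)   # degrees, allowed range stated in the text


def is_acceptable(energy: float, chi3: float) -> bool:
    """True if a candidate pair meets both the energy and chi3 criteria."""
    low, high = CHI3_RANGE
    return energy < MAX_ENERGY and low <= chi3 <= high


# (energy in kcal/mol, chi3 in degrees)
pairs = {
    "Ala11-Ala19": (0.98, 85.64),     # reported in the text
    "Cys325-Cys337": (1.93, -74.33),  # reported in the text
    "hypothetical pair": (3.50, 110.0),  # made-up values, shown only as a rejected case
}

accepted = [name for name, (energy, chi3) in pairs.items() if is_acceptable(energy, chi3)]
print(accepted)
```

Both reported pairs pass the two criteria, while the made-up pair fails on both.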
The length of the optimized codon sequence was 1926 nucleotides. Codon optimization also reports the GC content of the cDNA sequence and the codon adaptation index (CAI). The GC content was calculated as 53.63%, which lies in the optimum range of 30-70%. The CAI was calculated as 0.98, which also lies in the optimum range (0.8-1.0) and indicates the possibility of good expression of the vaccine candidate in the E. coli host. XhoI (158) and NdeI (238) restriction sites were then created, and the sequence was cloned into the pET28a(+) vector using SnapGene software (Fig. 8). Thus, the total length of the clone was 7.22 kbp. To assess the interaction between the refined model and the TLR2 (PDB ID: 3A7B) immune receptor, molecular docking was performed using the online server ClusPro v2.0, and a total of 30 models were generated (Supplementary Table 10). Among them, the model that occupied the receptor properly and had the lowest energy score was selected. Since model number 1 fulfils the desired criteria, it was chosen as the best-docked complex (Fig. 9). The energy score obtained for model 1 was − 1406, which is the lowest among all the predicted docked complexes, confirming the highest binding affinity. Normal mode analysis (NMA) was conducted to scrutinize protein stabilization and large-scale mobility. This assessment was conducted by the iMODS server based on the internal coordinates of the docked complex. The complex's deformability depends on the individual distortion of each residue, depicted by chain hinges (Fig. 10b). The eigenvalue found for the complex was 9.857553e−08 (Fig. 10a). The variance associated with each normal mode is inversely related to the eigenvalue (Kovacs et al. 2004). The B-factor values generated from normal mode analysis were proportional to RMS (Fig. 10c).
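The codon-optimization figures reported above lend themselves to a quick consistency check. The Python sketch below shows how GC content is computed for a nucleotide sequence and tests the reported values against the ranges quoted in the text. It does not compute the CAI, which requires a codon usage table. The helper names and the short example sequence are ours and purely illustrative.

```python
GC_RANGE = (30.0, 70.0)   # optimum GC content range quoted in the text (%)
CAI_RANGE = (0.8, 1.0)    # optimum CAI range quoted in the text


def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def in_range(value: float, bounds) -> bool:
    """True if value lies within the inclusive bounds."""
    low, high = bounds
    return low <= value <= high


# Reported values from the text.
protein_length = 642
reported_nt_length = 1926
reported_gc = 53.63
reported_cai = 0.98

length_consistent = protein_length * 3 == reported_nt_length  # three bases per codon
gc_ok = in_range(reported_gc, GC_RANGE)
cai_ok = in_range(reported_cai, CAI_RANGE)

toy_gc = gc_content("ATGGCC")  # toy sequence: 4 of 6 bases are G or C
print(length_consistent, gc_ok, cai_ok, round(toy_gc, 2))
```

The 1926 nucleotides correspond exactly to 642 codons, which suggests that the reported length does not include a stop codon.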
The covariance matrix indicated the coupling between pairs of residues, with correlated, anti-correlated and uncorrelated motions represented by red, blue and white colours, respectively (Fig. 10d). The result also provided an elastic network model that identified the pairs of atoms connected by springs (Fig. 10e). Each dot in the diagram represents one spring between the corresponding pair of atoms, coloured by its degree of stiffness: the darker the grey, the more rigid the spring.

The simulated immune response was consistent with actual immune responses (Fig. 11). For instance, the secondary and tertiary responses were higher than the primary response. The primary response was characterized by high concentrations of IgM. In both the secondary and tertiary responses, typical high levels of immunoglobulin activity (i.e., IgG1 + IgG2, IgM, and IgG + IgM antibodies) were evident, with a concomitant reduction in antigen (Fig. 11a). This indicates the emergence of immune memory and thus increased antigen clearance upon subsequent exposures (Fig. 11e). Furthermore, several long-lasting B-cell isotypes were observed, suggesting the potential for isotype switching and memory formation (Fig. 11b, c). A similarly elevated response was noticed in the TH (helper) and TC (cytotoxic) cell populations, with the respective memory development (Fig. 11d-f). During exposure, increased macrophage activity was observed, along with continuously proliferating dendritic cells (Fig. 11g, h). High levels of IFN-γ and IL-2 were also evident. In addition, a lower Simpson index (D) indicates greater diversity (Fig. 11i). Moreover, multiple exposure (n = 6) simulation as an encounter to endemic regions led to Fig. 1 ). This profile suggests the development of immune memory and, therefore, natural immune protection against the virus.
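To make the interpretation of the Simpson index concrete, the Python sketch below assumes the common definition D = Σ p_i², where p_i is the fraction of cells belonging to clone i; the simulation server may use a variant of this formula. The clone counts are hypothetical.

```python
def simpson_index(counts):
    """Simpson index D = sum(p_i ** 2) over the clone frequencies p_i.

    D approaches 1 when a single clone dominates and falls towards
    1 / (number of clones) when all clones are equally represented.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return sum((count / total) ** 2 for count in counts)


# Hypothetical clone sizes, for illustration only.
even_population = [25, 25, 25, 25]   # four equally sized clones
skewed_population = [97, 1, 1, 1]    # one dominant clone

d_even = simpson_index(even_population)
d_skewed = simpson_index(skewed_population)
print(f"even clones:   D = {d_even:.4f}")
print(f"skewed clones: D = {d_skewed:.4f}")
```

The evenly distributed population yields the lower D, in line with the statement that a lower Simpson index indicates greater diversity.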
LASV has a high mortality rate and the potential to cause catastrophe in endemic regions such as West Africa. Developing a preventive measure against LASV is therefore imperative. Nowadays, vaccination is one of the most effective approaches to strengthening the immune system against infectious diseases. However, the development and manufacture of live or attenuated vaccines is expensive and can take years to complete. Moreover, the excessive antigenic load in attenuated vaccines appears not only to contribute little to the protective immune response but also to complicate matters by causing allergic reactions (Li et al. 2014). Compared with traditional vaccines, multi-epitope vaccines omit unwanted components that can cause either pathological immune responses or adverse effects (Zhang 2018). The potential benefits of epitope-based vaccines also include increased safety, cost-effectiveness, the opportunity to rationally engineer epitopes for increased potency and breadth, and the ability to focus immune responses on conserved epitopes (Shey et al. 2019). For many years, researchers have sought to minimize the cost, time and side effects of vaccine development. Various immunoinformatic strategies are now readily available for designing and developing efficient new-generation epitope-based vaccines (María et al. 2017; Seib et al. 2012). Researchers have also used immunoinformatic methods to propose multi-epitope vaccines against Ebola virus, Hepatitis C virus, Oropouche virus, Dengue virus, etc. (Adhikari et al. 2018; Ali et al. 2017; Dash et al. 2017; Ikram et al. 2018). Given these advantages of multi-epitope vaccines, our foremost concern was to construct a vaccine able to elicit a robust immune response after vaccination.
Although there have been a few attempts to suggest peptide vaccine candidates against LASV, this is the first study to propose a complete multi-epitope vaccine that has been evaluated by in silico approaches (Faisal et al. 2017; Hossain et al. 2018; Verma et al. 2015). The ViPR database was used to retrieve the complete LASV sequence, and after screening the different proteins, four protein sequences were selected because of their higher antigenicity. CTL, HTL and LBL epitopes were then chosen as vaccine candidates using different servers and databases. An effective multi-epitope vaccine should include CTL, HTL and B-cell epitopes and should induce efficient responses against a specific tumour or virus (Zhang 2018). We were interested in incorporating B-cell epitopes because of the role of B cells in antibody production and in mediating effector functions (Cooper and Nemerow 1984). Over time, however, the humoral response from memory B cells can easily be overcome by the emergence of antigens, whereas cell-mediated immunity (T-cell immunity) often leads to lifelong immunity (Bacchetta et al. 2005). CTLs limit pathogen spread by identifying and destroying infected cells and by secreting specific antiviral cytokines (Garcia et al. 1999). Therefore, both B-cell and T-cell epitopes were predicted for the vaccine construct. The vaccine candidates were selected from the HTL, CTL and B-cell epitopes based on their antigenicity, allergenicity, immunogenicity and toxicity. Helper T cells that release cytokines such as interferon-gamma (IFN-γ), interleukin-4 (IL-4) and interleukin-10 (IL-10) have the potential to overcome the pro-inflammatory response and therefore reduce tissue damage. Helper T cells also help to activate the innate immune system, B lymphocytes, cytotoxic T cells and other immune cells.
Thus, the cytokine-inducing ability (i.e., IFN-γ, IL-4 and IL-10) of specific HTL epitopes was also evaluated during candidate selection. The vaccine construct was completed by joining the CTL, HTL and B-cell epitopes with AAY, GPGPG and KK linkers, respectively. Linkers are an indispensable element in the development of a vaccine protein because they enhance expression, folding and stabilization (Shamriz et al. 2016). Furthermore, the OmpA agonist (GenBank ID: AFS89615.1) was used as a TLR2 adjuvant and joined to the first CTL epitope using an EAAAK linker (Arai et al. 2001). When used alone, multi-epitope vaccines are poorly immunogenic and require coupling to adjuvants (Meza et al. 2017). Adjuvants are ingredients added to vaccine formulations that influence the development, stability and longevity of specific immune responses to antigens and help to protect against infection (Lee and Nguyen 2015). They have also attracted great attention because they allow humoral and cell-mediated immune responses to be selectively modulated (Bonam et al. 2017). Although the construct is seemingly long for a peptide vaccine, several studies have reported even longer vaccine constructs (Chatterjee et al. 2018; Kalita et al. 2019; Rahmani et al. 2019). Therefore, we do not expect its length to be a problem in terms of stability and expression. When assessing the vaccine construct with the VaxiJen server, we observed that the construct without adjuvant showed lower antigenicity (0.624) than the construct with adjuvant (0.7223), which suggests that the adjuvant is important for the chimera. The molecular weight of our vaccine candidate is ~ 68 kDa, which is typical for a multi-epitope vaccine. Solubility of the overexpressed recombinant protein within the E. coli host is one of the fundamental requirements of many biochemical and functional analyses.
The constructed vaccine protein was predicted to be soluble, ensuring its easy access to the host. The theoretical pI value indicates that the vaccine is basic in nature. In addition, the predicted instability index suggests that the protein will remain stable after expression, further enhancing its usability. The GRAVY score and aliphatic index reflect its hydrophilicity and thermostability, respectively.

3D structure modelling provides information on the spatial arrangement of crucial protein components and supports the investigation of protein function, dynamics, and interactions with ligands and other proteins. The desirable properties of the vaccine construct improved significantly after refinement. The Ramachandran plot shows that most residues lie in the favoured and allowed regions (99.2%), with very few residues in the disallowed region, indicating that the overall model quality is satisfactory. In addition, the GDT-HA, RMSD, MolProbity, clash score and poor rotamer values indicate the good quality of our designed vaccine construct. Different structure validation tools were used to identify errors in the modelled vaccine construct. The Z-score (−4.23) and ERRAT quality factor (78.2%) showed that the overall structure of the refined vaccine is adequate.

HLA alleles govern the response to T-cell epitopes, and these alleles are highly polymorphic across ethnic groups. A T-cell epitope should bind to many HLA alleles to achieve broad population coverage. We therefore selected the CTL and HTL epitopes together with their respective HLA alleles to predict allele distribution worldwide. The findings showed that the chosen epitopes and their respective alleles provide good coverage in numerous geographic regions. The highest population coverage, 99.16%, was recorded for Chile Amerindian, and the epitopes and their respective HLA alleles together cover 97.37% of the world population.
LASV epidemics have occurred most extensively in Western Africa, particularly in Nigeria, Senegal and Mali. Vaccine candidates are therefore essential to protect people in these regions against LASV infection. The population coverage was 88.86% in West Africa, where the virus first appeared and has caused several outbreaks.

Data-driven protein-receptor docking and molecular dynamics simulation were carried out to evaluate the potential immune interaction between TLR2 and the vaccine protein and the stability of their complex, given that a TLR2 agonist was used as an adjuvant in the constructed chimera. Energy minimization was conducted to minimize the potential energy of the whole system and achieve conformational stabilization of the vaccine protein-TLR2 docked complex. Energy minimization corrects inappropriate structural geometry by displacing individual protein atoms, making the structure more stable with adequate stereochemistry. The derived eigenvalue indicates the stiffness of motion and the energy required to deform the complex.

Testing immunoreactivity through serological assessment is one of the first steps in validating a candidate vaccine (Gori et al. 2013), and this requires expression of the recombinant protein in a suitable host. E. coli expression systems are preferred for recombinant protein production (Chen 2012; Rosano and Ceccarelli 2014). Codon optimization was performed to achieve high-level expression of our recombinant vaccine protein in E. coli K12. Both the codon adaptation index (0.98) and the GC content (53.63%) were favourable for high-level protein expression in bacteria.

Enhancing the stability of proteins is an important objective in various biomedical and mechanical applications.
In this study, we introduced disulphide bridges into the multi-epitope vaccine construct to improve protein thermostability, modify its functional features and assist in the analysis of genetic components.

The immune simulation produced results consistent with typical immune responses. Immune responses generally increased following repeated exposure to the antigen. The development of memory B cells and T cells was evident, with memory B cells lasting several months. Another notable finding is that the concentrations of IFN-γ and IL-2 increased after the first injection and remained at peak levels after repeated antigen exposure. This finding indicates elevated TH cell levels and therefore effective Ig production, supporting a humoral response. The activity of both dendritic cells and macrophages was satisfactory in our study. In addition, components of the innate immune system such as epithelial cells were active. The Simpson index (D), used for the analysis of clonal specificity, suggests a possibly diverse immune response.

Lassa virus is an emerging viral pathogen that causes severe haemorrhagic fever with a high mortality rate and has therefore become an increasing concern. Although the antiviral drug ribavirin showed some promise earlier, its excessive toxicity and teratogenicity have made its effectiveness questionable. Given the merits of peptide vaccines, immunoinformatic strategies were used to design a multi-epitope vaccine. Both T-cell and B-cell epitopes derived from different LASV proteins were included in the vaccine to produce an effective immune response. We believe that our vaccine can generate cell-mediated and humoral immune responses. The vaccine protein showed strong binding potential and a stable interaction with the receptor. In addition, the immune simulation showed effective immune responses resembling those in real life.
However, further investigations both in vitro and in vivo are warranted to ensure its true potential to fight against Lassa fever.

Author Contributions ZN and UKA conceived and designed the analysis; SBS and ZN performed immunoinformatic analyses; SBS prepared illustrations and wrote the manuscript; FA contributed in the dynamic simulation; SBS, SAK, and RT performed the antigenicity and epitope predictions; UKA and ZN contributed to the critical revision of the manuscript; UKA supervised the whole work; and all authors approved the final manuscript.

Overlapping CD8 + and CD4 + T-cell epitopes identification for the progression of epitope-based peptide vaccine from nucleocapsid and glycoprotein of emerging Rift Valley fever virus using immunoinformatics approach Immunoinformatics approach for epitope-based peptide vaccine design and active site prediction against polyprotein of emerging oropouche virus Exploring dengue genome to construct a multi-epitope based subunit vaccine by utilizing immunoinformatics approach to battle against dengue infection Design of the linkers which effectively separate domains of a bifunctional fusion protein CD4+ regulatory T cells: mechanisms of induction and effector function The protein data bank An overview of novel adjuvants designed for improving vaccine efficacy Poor housing quality increases risk of rodent infestation and lassa fever in refugee camps of sierra leone Scalable web services for the PSIPRED protein analysis workbench Predicting population coverage of T-cell epitope-based diagnostics and vaccines Properties of MHC class I presented peptides that enhance immunogenicity Identification of alpha-dystroglycan as a receptor for lymphocytic choriomeningitis virus and Lassa fever virus How the interval between prime and boost injection affects the immune response in a computational model of the immune system Arenaviruses other than Lassa virus Scrutinizing Mycobacterium tuberculosis membrane and secretory proteins to
formulate multiepitope subunit vaccine against pulmonary tuberculosis by utilizing immunoinformatic approaches Bacterial expression systems for recombinant protein production: E. coli and beyond Verification of protein structures: patterns of nonbonded atomic interactions The role of antibody and complement in the control of viral infections RING finger Z protein of lymphocytic choriomeningitis virus (LCMV) inhibits transcription and RNA replication of an LCMV S-segment minigenome Disulfide by design 2.0: a webbased tool for disulfide engineering in proteins In silico-based vaccine design against Ebola virus glycoprotein Prediction of IL4 inducing peptides Designing of interferongamma inducing MHC class-II binders AllerTOP v.2-a server for in silico prediction of allergens AllergenFP: allergenicity prediction by descriptor fingerprints Completion of the Lassa fever virus sequence and identification of a RING finger open reading frame at the L RNA 5′ end VaxiJen: A server for prediction of protective antigens, tumour antigens and subunit vaccines Computer aided epitope design as a peptide vaccine component against Lassa virus Risk maps of lassa fever in West Africa Unexpected adverse reactions during a clinical trial in rural West Africa Structural basis of T cell recognition Peptides for immunological purposes: design, strategies and applications JCat: a novel tool to adapt codon usage of a target gene to its potential expression host Vaccination with a paramyosin-based multi-epitope vaccine elicits significant protective immunity against Trichinella spiralis infection in mice In silico approach for predicting toxicity of peptides and proteins Imported Lassa fever in Germany: surveillance and management of contact persons Baseline mapping of Lassa fever virology, epidemiology and vaccine research and development reviewarticle Arenavirus evasion of host anti-viral responses GalaxyRefine: Protein structure refinement driven by side-chain repacking Design of 
peptide-based epitope vaccine and further binding site scrutiny led to groundswell in drug discovery against Lassa virus Exploring NS3/4A, NS5A and NS5B proteins to design conserved subunit multi-epitope vaccine against HCV utilizing immunoinformatics approaches Priming immunization against cholera toxin and E. coli heat-labile toxin by a cholera toxin short peptide-beta-galactosidase hybrid synthesized in E. coli Lassa virus infection of rhesus monkeys: pathogenesis and treatment with ribavirin Development of multi-epitope driven subunit vaccine against Fasciola gigantica using immunoinformatics approach Template-based protein structure modeling using the RaptorX web server Case-control study of mastomys natalensis and humans in Lassa virus-infected households in Sierra Leone Prevalence and risk factors of lassa seropositivity in inhabitants of the Forest Region of Guinea: a cross-sectional study Exploring Leishmania secretory proteins to design B and T cell multi-epitope subunit vaccine using immunoinformatics approach A fatal case of Lassa fever in London Effects of exposure to high concentrations of ribavirin in devloping embryos Predictions of protein flexibility: first-order measures The ClusPro web server for protein-protein docking Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction PROCHECK: a program to check the stereochemical quality of protein structures Recent advances of vaccine adjuvants for infectious diseases Computational methods for biomolecular docking Peptide vaccine: progress and challenges IMODS: Internal coordinates normal mode analysis server CD4 +T cells: differentiation and functions SOLpro: Accurate sequence-based prediction of protein solubility High-throughput prediction of protein antigenicity using protein microarray data iBCE-EL: a new ensemble learning framework for improved linear B-cell epitope prediction The impact of bioinformatics on vaccine design and development A novel design of a 
multi-antigenic, multistage and multi-epitope vaccine against Helicobacter pylori: an in silico approach A consensus epitope prediction approach identifies the breadth of murine TCD8+-cell responses to vaccinia virus Computer-aided designing of immunosuppressive peptides based on IL-10 inducing potential Proteome-wide screening for designing a multiepitope vaccine against emerging pathogen Elizabethkingia anophelis using immunoinformatic approaches A novel multi-epitope peptide vaccine against cancer: an in silico approach NN-align. An artificial neural network-based alignment algorithm for MHC class II peptide binding prediction Lassa fever in West African sub-region: an overview A reemerging lassa virus: aspects of its structure, replication, pathogenicity and diagnosis TepiTool: a pipeline for computational prediction of T cell epitope candidates ViPR: an open bioinformatics database and analysis resource for virology research ElliPro: a new structure-based tool for the prediction of antibody epitopes A prospective study of maternal and fetal outcome in acute Lassa fever infection during pregnancy Development of a conserved chimeric vaccine based on helper T-cell and CTL epitopes for induction of strong immune response against Schistosoma mansoni using immunoinformatics approaches Computational immunology meets bioinformatics: The use of prediction tools for molecular binding in the simulation of the immune system Recombinant protein expression in Escherichia coli: advances and challenges Detection of lassa virus, mali Developing vaccines in the era of genomics: a decade of reverse vaccinology Effect of linker length and residues on the structure and stability of a fusion protein with malaria vaccine application Codon Adaptation Index-a measure of directional synonymous codon usage bias, and its potential applications In-silico design of a multi-epitope vaccine candidate against onchocerciasis and related filarial diseases Peptide-based synthetic vaccines Amadei 
A (1997) A comparison of techniques for calculating protein essential dynamics In silico prediction of B and T-cell epitope on Lassa virus proteins for peptide based subunit vaccine design Imported case of Lassa fever in the Netherlands ProSA-web: Interactive web service for the recognition of errors in three-dimensional structures of proteins Protein identification and analysis tools in the ExPASy server Correlations between internal mobility and stability of globular proteins Multi-epitope vaccines: a promising strategy against tumors and viral infections