key: cord-0779378-kv06me01
authors: Obaidullah, Ahmad J.; Alanazi, Mohammed M.; Alsaif, Nawaf A.; Albassam, Hussam; Almehizia, Abdulrahman A.; Alqahtani, Ali M.; Mahmud, Shafi; Sami, Saad Ahmed; Emran, Talha Bin
title: Immunoinformatics-guided design of a multi-epitope vaccine based on the structural proteins of severe acute respiratory syndrome coronavirus 2
date: 2021-05-19
journal: RSC Advances
DOI: 10.1039/d1ra02885e
sha: 5470d506b6e8c73760fc71b0685d7c7aba4848b5
doc_id: 779378
cord_uid: kv06me01

Coronavirus disease 2019 (COVID-19) is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), resulting in a contagious respiratory tract infection that has become a global burden since the end of 2019. Notably, fewer patients infected with SARS-CoV-2 progress from acute disease onset to death compared with the progression rate associated with two other coronaviruses, SARS-CoV and Middle East respiratory syndrome coronavirus (MERS-CoV). Several research organizations and pharmaceutical industries have attempted to develop successful vaccine candidates for the prevention of COVID-19. However, increasing evidence indicates that the SARS-CoV-2 genome undergoes frequent mutation; thus, an adequate analysis of the viral strain remains necessary to construct effective vaccines. The current study attempted to design a multi-epitope vaccine by utilizing an approach based on the SARS-CoV-2 structural proteins. We predicted the antigenic T- and B-lymphocyte responses to four structural proteins after screening all structural proteins according to specific characteristics. The predicted epitopes were combined using suitable adjuvants and linkers, and a secondary structure profile indicated that the vaccine shared similar properties with the native protein. Importantly, the molecular docking analysis and molecular dynamics simulations revealed that the constructed vaccine possessed a high affinity for toll-like receptor 4 (TLR4).
In addition, multiple descriptors were obtained from the simulation trajectories, including the root-mean-square deviation (RMSD), root-mean-square fluctuation (RMSF), solvent-accessible surface area (SASA), and radius of gyration (Rg), demonstrating the rigid nature and inflexibility of the vaccine and receptor molecules. In addition, codon optimization based on Escherichia coli K12 was used to determine the GC content and the codon adaptation index (CAI) value, followed by incorporation into the cloning vector pET28a(+). Collectively, these findings suggest that the constructed vaccine could be used to modulate the immune reaction against SARS-CoV-2.

event plays a crucial role in virus pathogenesis. Organ tropism occurs because the S protein mediates the entry of the virus through receptor recognition and membrane fusion. The S protein binds to host angiotensin-converting enzyme 2 (ACE2) receptors through the receptor-binding domain (RBD) in type 2 lung alveolar cells, which increases virus transmissibility. Viral particle entry into the host cell depends on the S1 subunit, whereas the fusion of the viral and host cell membranes depends on the S2 subunit, making the S protein highly antigenic. These functions of the S protein make it one of the most promising antigens for COVID-19 vaccine development. [13] [14] [15] [16] The E protein is the smallest of the structural proteins that are essential for the life cycle of the virus and is involved in assembly, budding, envelope formation, and interactions with other structural and host cell proteins. 17, 18 The N protein is a highly immunogenic RNA-binding protein containing two highly conserved domains that bind to viral genomic RNA and regulate viral RNA transcription, replication, and folding. This makes the N protein a potential target for vaccine development and diagnostic methods.
[19] [20] [21] The M protein is involved in determining the virion shape, facilitating membrane curvature, and binding to the N protein. 12 Increased CD4+ and CD8+ T cell counts in the peripheral blood and high concentrations of cytotoxic granules have been reported in infected patients. The disease severity can increase due to T cell overactivation, which can cause injury to the immune systems of COVID-19 patients. By contrast, less effective T-cell responses may allow the progression of viral pathology, resulting in an increased fatality rate. CD4+ and CD8+ T cell responses provide long-lasting protection against this deadly virus. Additionally, the antibody-mediated immune response, together with cellular immunity, plays a vital role in the protection against viral infections. 22 Vaccine-mediated protection is provided by virus-specific T cells, known as effector T cells, and the production of antiviral cytokines also increases. 23 The viral epitopes presented by the major histocompatibility complex (MHC) class I and MHC class II proteins are recognized by CD8+ and CD4+ T cells, respectively. The heterogeneity in T-cell responses to this novel coronavirus may be partly correlated with the capacity of the MHC proteins to recognize the viral antigens. 24, 25 Vaccine administration has been established as the most useful way to control and eliminate diseases associated with pathogenic organisms by triggering a specific immune response against a foreign particle within the host body. 26, 27 The modulation of the immune system prepares the body to fight against contagious viruses and can save millions of lives. An effective vaccine must provide a robust and diverse immune response that activates both cell-mediated immune responses and B-cell mediated humoral immunity to elicit immunogenic responses. 28 B-cell receptors can reactivate memory B cells, which protect against later re-infection.
29 Traditional vaccine development based on biochemical trials can be costly, allergenic, and time-consuming and can require the in vitro culture of pathogenic viruses to ensure safety. 17 Vaccine candidates that rely on high-molecular-weight antigenic proteins can be difficult to develop because high-molecular-weight proteins are challenging to express and can interfere with normal complement system function. However, a multi-epitope vaccine may lack these limitations and also possess significant advantages, including increased safety and enhanced immunogenicity, 30 and several previous studies have described the design of multi-epitope vaccines for a variety of pathogens. [31] [32] [33] Recently, various immunoinformatics approaches have been described to predict and assess the antigenicity, allergenicity, toxicity, and immunogenicity of various viral epitopes. The application of computational biology tools can greatly reduce the number of experiments required through the appropriate application of in silico predictions and additional methodologies, making computational methods both time- and cost-effective. 12, 34 Most of the recent vaccine candidates that have been developed using an immunoinformatics approach have been based on SARS-CoV-2 S glycoprotein variants. However, the immune responses that have been generated through the use of a single SARS-CoV-2 protein have thus far been insufficient to warrant use for effective prophylactic tool development. Multi-epitope vaccine candidates have previously been designed, and their efficacies have been reported against several viruses, including previous coronaviruses, such as SARS-CoV and MERS-CoV. Multi-epitope vaccines present reduced biohazard risks compared with other types of immunizations. 35 In our two previous studies, we utilized the SARS-CoV-2 N protein and S protein to predict several epitopes.
36, 37 In our present study, we utilized an immunoinformatics approach to predict the potential T-cell and B-cell epitopes within the two other SARS-CoV-2 structural proteins, the E and M proteins. Combined with the findings from previous studies, we identified potential MHC class I, MHC class II, and B-cell epitopes that could be combined through the addition of suitable adjuvants and linkers into a multi-epitope antigen (Fig. 1). We expect our current findings to facilitate COVID-19 vaccine development and the performance of experimental laboratory work, which remains necessary to validate our results further.

The SARS-CoV-2 structural (S, E, N, and M) protein sequences were used as targets for epitope screening. We previously published two articles in which we predicted cytotoxic T-lymphocyte (CTL) and linear B-lymphocyte (LBL) epitopes from the N protein and CTL, helper T-lymphocyte (HTL), and LBL epitopes from the S protein of SARS-CoV-2. 36, 37 In the present study, all available sequences for the E protein were retrieved from the UniProt database, those for the M protein were retrieved from the NCBI database, and the amino acid sequences for both proteins were downloaded in FASTA format. 38, 39 The antigenicity of the selected structural proteins was predicted using the VaxiJen v2.0 (http://www.ddgpharmfac.net/vaxijen/) server, 40 with the default threshold value of 0.4 for viruses. 41 For both the E protein and the M protein, we chose the sequence with the highest antigenic score for the next steps of the investigation.

CTLs are one of several cell types in the immune system that can directly interact with infected cells and are capable of killing them. 42 CTLs recognize infected cells and play an essential role in the host defense mechanism. To predict CTL epitopes, the selected SARS-CoV-2 E and M protein sequences were submitted to the NetCTL v1.2 server, which is available at http://www.cbs.dtu.dk/services/NetCTL/.
43 The predicted epitopes were further assessed through the following servers using the default parameters: VaxiJen v2.0 40 to predict antigenicity; the IEDB MHC class I immunogenicity tool (http://tools.iedb.org/immunogenicity/) 44 to predict immunogenicity; ToxinPred (http://crdd.osdd.net/raghava/toxinpred/) 45 to predict toxicity; and AllerTop v2.0 (https://ddgpharmfac.net/AllerTOP/) 46 to predict allergenicity.

HTLs are essential for inducing adaptive immunity because they recognize foreign antigens and activate B cells and cytotoxic T cells, resulting in the elimination of infectious pathogens. 42 We used the IEDB MHC class II binding prediction tool (http://tools.iedb.org/mhcii/) to determine the HTL epitopes and selected HTL epitopes using the CONSENSUS method, based on a percentile rank of 5%. 47 Previously, we did not perform HTL epitope prediction for the N protein. Therefore, in this study, in addition to the SARS-CoV-2 E and M proteins, we also performed HTL epitope prediction for the N protein. We further evaluated the predicted epitopes based on their antigenicity and their ability to induce the cytokines interleukin-4 (IL4) and interleukin-10 (IL10). Antigenicity was determined using the VaxiJen v2.0 server, whereas IL4 and IL10 induction was predicted using the default parameters in the IL4pred (http://crdd.osdd.net/raghava/il4pred/) 48 and IL10pred (http://crdd.osdd.net/raghava/IL-10pred/) 49 servers, respectively.

B-cell epitopes play vital roles in inducing humoral or antibody-mediated immunity. B cells destroy pathogenic organisms by secreting antibodies that interact with them and by activating the immune system. 50 We predicted the LBL epitopes in the E and M proteins using the ABCpred server. 51 We further evaluated the predicted LBL epitopes for the SARS-CoV-2 E and M proteins and the previously predicted LBL epitopes from the S and N proteins using the VaxiJen v2.0, 40 ToxinPred, 45 and AllerTop v2.0 46 servers.
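The screening cascade described above (antigenicity, immunogenicity, toxicity, allergenicity) amounts to a filter over per-epitope predictions. The following is a minimal Python sketch of that idea, assuming the server outputs have already been collected by hand into records. The `EpitopePrediction` fields, the `screen_epitopes` helper, the positive-immunogenicity cutoff, and the example values are hypothetical and are not outputs or APIs of VaxiJen, IEDB, ToxinPred, or AllerTop; only the 0.4 antigenicity threshold comes from the text.

```python
from dataclasses import dataclass
from typing import Iterable, List

# VaxiJen default threshold for viruses, as stated in the text.
VIRUS_ANTIGENICITY_THRESHOLD = 0.4


@dataclass(frozen=True)
class EpitopePrediction:
    """Hand-collected predictions for one peptide (hypothetical record layout)."""
    sequence: str
    antigenicity: float    # antigenicity score (VaxiJen-style)
    immunogenicity: float  # class I immunogenicity score (IEDB-style)
    is_toxic: bool         # toxicity call (ToxinPred-style)
    is_allergen: bool      # allergenicity call (AllerTop-style)


def screen_epitopes(
    candidates: Iterable[EpitopePrediction],
    min_antigenicity: float = VIRUS_ANTIGENICITY_THRESHOLD,
    min_immunogenicity: float = 0.0,  # assumption: keep positive scores only
) -> List[EpitopePrediction]:
    """Keep antigenic, immunogenic, non-toxic, non-allergenic peptides,
    sorted from the highest to the lowest antigenicity score."""
    kept = [
        c for c in candidates
        if c.antigenicity >= min_antigenicity
        and c.immunogenicity > min_immunogenicity
        and not c.is_toxic
        and not c.is_allergen
    ]
    return sorted(kept, key=lambda c: c.antigenicity, reverse=True)


if __name__ == "__main__":
    # Placeholder peptides and scores, for illustration only.
    demo = [
        EpitopePrediction("AAAAAAAAA", 0.9, 0.2, False, False),
        EpitopePrediction("CCCCCCCCC", 0.3, 0.5, False, False),  # below threshold
        EpitopePrediction("DDDDDDDDD", 0.8, 0.1, True, False),   # flagged toxic
    ]
    for epitope in screen_epitopes(demo):
        print(epitope.sequence, epitope.antigenicity)
```

For HTL or LBL candidates, the same pattern would apply with the criteria named in the text (for example, IL4 and IL10 induction calls in place of the immunogenicity score).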
The molecular docking analysis was performed using the methodologies described in our previous studies. 36, 37, 52 For modeling, the targeted CTL epitopes were submitted to the PEP-FOLD v3.0 server. 53 The energy of each structure was determined using Swiss-PdbViewer, and the structure with the lowest energy was chosen for further analysis. 54 56 AutoDock Vina uses a gradient-based optimization method, in which the gradient effectively gives the optimization algorithm a sense of direction from a single evaluation. 57 First, we used UCSF Chimera (version 1.11.2) to prepare the protein for docking analysis by deleting the attached ligand and adding hydrogens and Gasteiger-Marsili charges. The prepared file was then added to the AutoDock wizard of PyRx 0.8 and converted into the .pdbqt format. The energy of the ligand was minimized, and the ligand was converted to the .pdbqt format by OpenBabel. 58 The docking simulation was run with default parameters. The size of the grid box in AutoDock Vina was maintained at 78.701 Å × 65.296 Å × 90.806 Å for the X-, Y-, and Z-axes, respectively. AutoDock Vina was implemented via the shell script offered by the AutoDock Vina developers. Docking results are reported as negative scores in kcal mol⁻¹ because the binding affinities of ligands are expressed as negative energies. 57 In addition, to validate the docking approach, we selected the epitopes that were bound in the respective PDB structures to serve as positive controls and performed molecular docking analysis for these epitopes using the same parameters. The molecular docking analyses were visualized using Discovery Studio (DS) version 4.5, and figures were generated using Adobe Illustrator CC 18.

The physicochemical features describe the basic properties of a protein. We used the ProtParam server (https://web.expasy.org/protparam/) to predict the physicochemical features and understand the fundamental nature of the designed vaccine.
59 We further evaluated the immunological properties using the VaxiJen v2.0, 40 AllerTop, 46 and SOLpro 60 servers.

We used the improved self-optimized prediction method (SOPMA) server (https://npsa-prabi.ibcp.fr/NPSA/npsa_seccons.html) and the PSIPRED v4.0 server (http://bioinf.cs.ucl.ac.uk/psipred/), with default parameters, to identify the two-dimensional (2D) structural features of the vaccine construct, such as the formation of alpha-helices, beta-turns, and random coils. 61,62 SOPMA has been reported to return greater than 80% prediction accuracy. 61 We retrieved and evaluated the 2D structural features to understand the composition of the constructed vaccine.

The three-dimensional (3D) model of the constructed vaccine was generated using the homology modeling server Raptor-X. 63 The refinement of the model was performed using the GalaxyRefine server. 64 The validation of the constructed vaccine was performed using the ProSA-web and Procheck web servers. ProSA-web predicted the Z-score, which indicates the overall quality of the constructed vaccine, 65 and a Ramachandran plot was analyzed by the Procheck server to determine the overall quality. 66

2.9 Construction of the multi-epitope vaccine candidate

The vaccine construct was designed using the targeted CTL, HTL, and LBL epitopes from the structural proteins of SARS-CoV-2. In addition, a suitable adjuvant was added using appropriate linkers during vaccine construction. 67, 68 In the current experiment, we used a TLR4 agonist as an adjuvant because viral glycoproteins are recognized by TLR4. In addition, the inclusion of this adjuvant is necessary to maximize the production of the target vaccine candidate for optimal translation. 69 The 50S ribosomal protein L7/L12 (NCBI ID: P9WHE3) was considered as an adjuvant to increase the constructed vaccine's immunogenicity.
The adjuvant was linked to the front portion of the vaccine using an EAAAK linker, a rigid, helix-forming peptide linker that can separate two weakly interacting β-domains in a bifunctional fusion protein. In addition, the CTL epitopes were linked together using AAY linkers, which represent a proteasomal cleavage site that increases the stability of proteins. 24, 70 Further, the HTL epitopes were linked through GPGPG linkers to prevent junctional epitopes and enable immunological processing. Finally, the LBL epitopes were linked using KK linkers, 67,68 also known as bi-lysine linkers, which are primarily associated with preserving the independent immunoactivities of a vaccine.

The TLR4 structure (PDB ID: 4G8A) was retrieved from the Protein Data Bank (PDB) to assess the interaction pattern. The protein structure was optimized and prepared in DS version 4.5 and the PyMOL software package. Initially, the bound ligands were deleted from TLR4, and the structure was saved in .pdb format using DS version 4.5. Afterward, the .pdb file was loaded into the PyMOL software package. The AutoDock tool was run in the PyMOL software package, hydrogens and charges were assigned, and water molecules were removed. The docking study was performed using the PatchDock server, 71 and further refinement was performed using the FireDock web server. 72 The docking interaction was visualized in PyMOL 73 and DS. 74

To analyze the structural stability and variations in the vaccine-receptor complex, a molecular dynamics simulation was conducted in YASARA version 20.1.1. 75 The AMBER14 force field 76 was used for this study, and the protein complex was initially cleaned and optimized, followed by hydrogen bond orientation. A cubic simulation cell was created in which the system was neutralized by the addition of 0.9% NaCl at a temperature of 310 K and pH 7.4. The system temperature was maintained using a Berendsen thermostat. 77 The long-range electrostatic interactions were calculated by the particle mesh Ewald method.
78 The simulation time step was set to 1.25 fs, and the simulation trajectory was saved every 100 ps. Finally, the simulation was performed for 150 ns to analyze the root-mean-square deviation (RMSD), root-mean-square fluctuation (RMSF), the radius of gyration (Rg), the solvent-accessible surface area (SASA), and hydrogen bond formation. [79] [80] [81] [82] [83] [84] [85] [86] In addition, we used the peptide QYIKWPWYI as a control in this study, based on evidence in the literature and wet lab results. 87 Currently, these peptide molecules are being examined in clinical trials (EudraCT 2020-002502-75, EudraCT 2020-002519-23), and we compared their dynamic behavior with that of the vaccine complex.

The normal mode analysis (NMA) of the vaccine complex was performed to understand the stability and flexibility of the vaccine complex. 88 This tool may serve as an alternative to more costly molecular dynamics simulations. 89 The motion stiffness of the vaccine complex was evaluated using eigenvalues, in which the main-chain deformability was predicted by measuring the biological target's efficacy. The elastic network model and covariance matrix were also determined for the vaccine complex. 90

Codon adaptation and in silico cloning are two important steps during the process of vaccine design. An amino acid can be encoded by more than one codon, and codon preference differs among organisms. In this workflow, codon optimization was conducted to identify the specific codons that encode each amino acid most efficiently in a particular host organism. 91 Codon adaptation was performed using the JCat server for translation in suitable expression vectors. 92 The Escherichia coli K12 strain was chosen as the expression host system, and the generated cDNA sequences were analyzed based on the percent GC content and the codon adaptation index (CAI). Rho-independent transcription terminators and prokaryotic ribosome binding sites were avoided.
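The two sequence-level quantities named above can be illustrated with a short calculation. The following is a minimal Python sketch, not JCat's implementation: GC content is the percentage of G and C bases, and the CAI is computed here in its standard form as the geometric mean of per-codon relative adaptiveness weights. The helper names and the toy weight table are hypothetical; a real calculation would use a full reference table for E. coli K12.

```python
import math
from typing import Dict


def gc_content(dna: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = dna.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def codon_adaptation_index(dna: str, weights: Dict[str, float]) -> float:
    """Geometric mean of the relative adaptiveness weights of the codons.

    Codons missing from `weights` are skipped (a simplifying assumption).
    """
    seq = dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    scored = [weights[c] for c in codons if c in weights]
    if not scored:
        raise ValueError("no codon of the sequence is present in the weight table")
    return math.exp(sum(math.log(w) for w in scored) / len(scored))


if __name__ == "__main__":
    # Toy weights for two alanine codons; the values are illustrative only.
    toy_weights = {"GCT": 1.0, "GCC": 0.25}
    print(gc_content("GCTGCC"))
    print(codon_adaptation_index("GCTGCC", toy_weights))
```

Because each residue is encoded by one codon, the length of the coding sequence is three times the number of residues in the construct, which gives a quick sanity check on any optimized sequence.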
Finally, the optimized vaccine sequence was incorporated into the pET28a(+) vector through the addition of XhoI and NdeI restriction sites at the N- and C-terminus, respectively, using the SnapGene tool. 93

3 Results

We downloaded all relevant E protein sequences from the UniProt database and all relevant M protein sequences from the NCBI database in the present study. We analyzed the antigenicity of each sequence by submitting the sequences to the VaxiJen server. We selected the protein sequences with the highest scores, which were UniProt entry no. A0A6M8FIC5 for the E protein and NCBI accession no. QIC53216.1 for the M protein, with antigenicity scores of 0.6908 for the E protein and 0.5102 for the M protein.

In our previously published articles, we determined potential CTL epitopes for the S and N proteins. In the current experiment, we additionally screened potential CTL epitopes from the E and M proteins, selecting those with the highest antigenicity and favorable results in the toxicity and allergenicity analyses. In addition, the NetCTL 1.2 server provides a combinatorial score based on proteasomal cleavage efficiency and MHC class I binding. After the appropriate screening, we predicted one CTL epitope from the E protein and three CTL epitopes from the M protein. The CTL epitopes used for the multi-epitope vaccine construct are indicated in Table 1.

In our previous studies, we predicted HTL epitopes for the SARS-CoV-2 S protein. However, we did not perform HTL epitope prediction for the N protein. Thus, the present study also included the prediction of HTL epitopes for the E, M, and N proteins. The selection of HTL epitopes was primarily based on the analyses of antigenicity, IL4- and IL10-inducing capability, toxicity, and allergenicity. According to the selection criteria, the HTL epitopes with the best profiles were selected for further analysis (Table 2).
In addition to the best LBL epitopes from the previous two studies, we incorporated LBL epitopes from the E and M proteins of SARS-CoV-2, selecting those with the most antigenic, non-toxic profiles. The selected epitopes were also non-allergenic. The LBL epitopes selected for incorporation into the multi-epitope vaccine are shown in Table 3.

We used molecular docking simulations to delineate the interactions between the targeted CTL epitopes and their respective HLA alleles. We selected the HLA-B*15:01 allele for the E protein, which was paired with the CTL epitopes LVKPSFYVY and VSLVKPSFY. Additionally, the HLA-B*35:01 allele was selected for the analysis of the M protein interactions, and the two selected CTL epitopes from the M protein were AGDSGFAAY and LVGLMWLSY. We also obtained the epitopes that were bound to each HLA allele in the respective PDB entries for the validation of the docking studies. The results from the molecular docking studies revealed that our selected epitopes and the positive controls all interacted with their respective HLA alleles. Although the positive controls bound with more negative energies than the targeted epitopes, the interactions with the targeted epitopes were similar to those of the positive controls, except for the epitope AGDSGFAAY, which only formed a single hydrogen bond with HLA-B*35:01. Although VSLVKPSFY possessed a lower docking score than the other epitope and the positive control, VSLVKPSFY formed nine hydrogen bonds with HLA-B*15:01, which was more than were formed by either LVKPSFYVY or the positive control. Moreover, all of the selected epitopes and the positive controls interacted with the receptors through hydrophobic interactions (Table 4, Fig. 2 and 3).

The formulation of the vaccine construct was performed by compiling the CTL, HTL, and LBL epitopes together, separated by the described linkers.
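The assembly logic (adjuvant, then CTL, HTL, and LBL epitope blocks, each joined by its own linker) can be sketched as simple string concatenation. The following is a minimal Python illustration, assuming a hypothetical `assemble_construct` helper; the linker sequences are those named in the text, while the adjuvant and the HTL and LBL peptides in the demo are short placeholders. How the blocks are joined at their boundaries is a simplifying assumption here, and the junctions in the published construct may differ.

```python
from typing import Sequence

# Linker sequences as named in the text.
ADJUVANT_LINKER = "EAAAK"
CTL_LINKER = "AAY"
HTL_LINKER = "GPGPG"
LBL_LINKER = "KK"


def assemble_construct(
    adjuvant: str,
    ctl_epitopes: Sequence[str],
    htl_epitopes: Sequence[str],
    lbl_epitopes: Sequence[str],
) -> str:
    """Concatenate adjuvant and epitope blocks in the order adjuvant, CTL, HTL, LBL.

    Assumption: each block is introduced by its own linker and its members
    are joined by that same linker.
    """
    parts = [adjuvant]
    for linker, epitopes in (
        (ADJUVANT_LINKER, None),
        (CTL_LINKER, ctl_epitopes),
        (HTL_LINKER, htl_epitopes),
        (LBL_LINKER, lbl_epitopes),
    ):
        if epitopes is None:
            parts.append(linker)  # EAAAK directly follows the adjuvant
        elif epitopes:
            if linker != CTL_LINKER:
                parts.append(linker)  # first CTL epitope follows EAAAK directly
            parts.append(linker.join(epitopes))
    return "".join(parts)


if __name__ == "__main__":
    construct = assemble_construct(
        adjuvant="MAK",                        # placeholder, not the real adjuvant
        ctl_epitopes=["VSLVKPSFY", "LVGLMWLSY"],
        htl_epitopes=["HHHHH", "WWWWW"],       # placeholder peptides
        lbl_epitopes=["NNNNN"],                # placeholder peptide
    )
    print(construct, len(construct))
```

Keeping the linkers as named constants makes it easy to compare alternative layouts before submitting a candidate sequence to the downstream prediction servers.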
In the current experiment, three CTL epitopes were selected from the M protein, two were selected from the S protein, and a single CTL epitope was selected from each of the E and N proteins. Among HTL epitopes, four were selected from the E protein, two were selected from the N protein, and a single epitope was selected from each of the S and M proteins. Moreover, 12 LBL epitopes were chosen from among the four SARS-CoV-2 structural proteins for the multi-epitope vaccine construct. The constructed vaccine consisted of 575 amino acid residues. The sequence of the final vaccine construct was as follows (Fig. 4):

MAKLSTDELLDAFKEMTLLELSDFVKKFEETFEVTAAAPVAVAAAGAAPAGAAVEAAEEQSEFDVILEAAGDKKIGVIKVVREIVSGLGLKEAKDLVDGAPKPLLEKVAKEAADEAKAKLEAAGATVTVKEAAAKNTASWFTALAAYVSLVKPSFYAAYWTAGAAAYYAAYGAAAYYVGYAAYLVGLMWLSYAAYAGDSGFAAYAAYATSRTLSYYAAYGPGPGRWYFYYLGTGPEAGLGPGPGAQFAPSASAFFGMSRGPGPGVKPSFYVYSRVKNLNGPGPGKPSFYVYSRVKNLNSGPGPGPSFYVYSRVKNLNSSGPGPGVSLVKPSFYVYSRVKGPGPGAALQIPFAMQMAYRFGPGPGLAAVYRINWITGGIAGPGPGKKRTQLPPAYTNSKKQRQKKQQKKNVSLVKPSFYVYSRVKKKYVYSRVKNLNSSRVPDKKLEGKQGNKKKNHTSPDVDLGKKKFLPFQQKKDQLTPTWRVYKKLTWICLLQFAYANRNRKKACFVLAAVYRINWITGKKSFRLFARTRSMWSFNPKKGIAIAMACLVGLMWLSKKLWPVTLACFVLAAVYR

The physicochemical properties of the vaccine construct were evaluated, and we further assessed the antigenicity and allergenicity of the vaccine construct. The construct was antigenic, with a score of 0.6153, and non-allergenic. Moreover, the vaccine construct was soluble, exhibiting a score of 0.701681 on a scale of 1 (Table 5).

Fig. 4 Graphical map of the formulated multi-epitope vaccine construct. The vaccine construct included (left to right) an adjuvant and CTL, HTL, and LBL epitopes, which are shown in the dark blue, red, green, and blue rectangular boxes, respectively.
The adjuvant and the first CTL epitope were linked using an EAAAK linker (blue), the CTL epitopes were joined using AAY linkers (off-white), the HTL epitopes were linked using GPGPG linkers (dark yellow), and the LBL epitopes were joined using KK linkers (black).

The evaluation of secondary structural features, including α-helices, β-strands, and random coils, was performed using two servers. The SOPMA server predicted 40.00% α-helices, 19.30% β-strands, and 40.70% random coils in the secondary structure (Table 6 and ESI Fig. S1†). By contrast, the PSIPRED server predicted 45.74% α-helices, 15.30% β-strands, and 38.96% random coils in the construct (Table 6 and ESI Fig. S1†).

The 3D structure of the constructed vaccine was generated using the Raptor-X online server. A total of five models were generated by the Raptor-X server, which were further evaluated by the ProSA and Procheck web servers. In this study, model 1 returned the most favorable Z-score of −7.68, and the percentage of residues in Ramachandran favored regions for model 1 was 90.6%. Although model 3 showed a similar percentage of residues in Ramachandran favored regions to that of model 1, the Z-score for model 3 was −6.60. Thus, we selected model 1 for further analysis in the present study (Fig. 5).

A stable immune response requires a molecular interaction between the vaccine molecule and the immune cell receptor, which depends on the geometry of the protein surface and the electrostatic interactions between the two molecules. The PatchDock tool was used to rank the top ten interaction models between the vaccine and the receptor. The best complexes were sent for refinement in FireDock. The vaccine-TLR4 complex had the best binding interactions in solution eight, in which the global energy was −33.27 kJ mol⁻¹, the van der Waals (vdW) energy was −47.06, the repulsive vdW energy was 25.91, the atomic contact energy (ACE) was 11.41, and the hydrogen bond energy was −3.07 (Fig. 6).
The RMSD of the Cα atoms in the protein complex was evaluated to understand the structural deviations across the simulation trajectory. As shown in Fig. 7, the complex exhibited a sharp increase in the RMSD value at the beginning of the simulation. The protein complex then stabilized until the last phase of the simulation. The vaccine complex showed a relatively lower RMSD than the control, which indicated the stable nature of the vaccine candidate compared with the control. The RMSD value of the complex ranged from 2.5 to 2.7 Å, which indicated structural stability and less flexibility. The protein complex's SASA was also evaluated to determine the change in surface area. The vaccine construct possessed a higher SASA trend, which demonstrated an increase in the surface area. However, this complex did not demonstrate a high level of SASA deviations, which indicated that no significant changes were occurring to the protein's surface area. The Rg descriptor for a protein system defines the compactness of the complex. The protein complex displayed a steady Rg trend from the very beginning of the simulation. After 30 ns of simulation, the complex had a higher Rg value, which is responsible for the protein's labile nature. This increase in Rg might be due to the folding or unfolding of the protein complex. The hydrogen bonds in a biological system define the stability of the system. Fig. 7 indicates that the vaccine complex's hydrogen bonds did not fluctuate compared with those in the control, which indicated structural integrity. The RMSF of the vaccine complex and the control were evaluated to determine the amino acid residue flexibility. Fig. 7 demonstrates that most residues exhibited RMSF values below 2.5 Å. This RMSF profile defines a less flexible and more rigid vaccine complex compared with the complex formed by the control.
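For readers unfamiliar with these descriptors, the following minimal Python sketch shows how RMSD and Rg are defined for a single frame of Cα coordinates. It is not the YASARA analysis used in the study; the helper names are hypothetical, the frames are assumed to be already superposed on the reference (no fitting is performed), and equal masses are assumed unless masses are supplied.

```python
import math
from typing import Optional, Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(reference: Sequence[Coord], frame: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two pre-aligned coordinate sets."""
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(reference, frame)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(reference))


def radius_of_gyration(
    coords: Sequence[Coord],
    masses: Optional[Sequence[float]] = None,
) -> float:
    """Mass-weighted root-mean-square distance of atoms from their center of mass."""
    if not coords:
        raise ValueError("no coordinates given")
    weights = list(masses) if masses is not None else [1.0] * len(coords)
    if len(weights) != len(coords):
        raise ValueError("one mass per atom is required")
    total = sum(weights)
    center = tuple(
        sum(w * c[i] for w, c in zip(weights, coords)) / total for i in range(3)
    )
    squared = sum(
        w * sum((c[i] - center[i]) ** 2 for i in range(3))
        for w, c in zip(weights, coords)
    )
    return math.sqrt(squared / total)


if __name__ == "__main__":
    # Two-atom toy system (coordinates in Å), for illustration only.
    ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    moved = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(ref, moved))
    print(radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]))
```

Applied frame by frame to a trajectory, these two functions would yield time series of the kind summarized in Fig. 7; a low, flat RMSD trace and a steady Rg trace are what the text interprets as structural stability and compactness.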
The stability of the vaccine complex was evaluated by deformability, eigenvalue, elastic network model, covariance map, and B factor analyses. Fig. 8 indicates that the hinge regions had high deformability; the averaged RMS is also shown for the B factor. The eigenvalue was higher, at 4.131509 × 10⁻⁶. The elastic network model and correlation matrix are depicted in Fig. 8. These results correlate with a lower chance of deformation among the vaccine complex molecules.

To determine whether the vaccine sequence can be expressed in E. coli host cells, in silico cloning was performed. Codon optimization is essential for protein expression. The JCat tool was used to generate a cDNA sequence, producing the 1725-nucleotide sequence used for vaccine expression. The CAI score and GC content of our optimized vaccine sequence were 0.9485 and 50.49%, respectively, denoting efficient and potentially stable expression in the E. coli K-12 strain (Fig. 9). Generally, a GC content of 30%-70% and a CAI value greater than 0.8 are considered to denote good protein expression conditions in a host. The designed nucleotide sequence was inserted into the pET28a(+) plasmid vector by inserting the appropriate restriction sites using the SnapGene tool.

Currently, the whole world is experiencing a pandemic caused by SARS-CoV-2. Initially, some significant events, including a cruise to Japan, attendance at ski resorts in Austria and Italy, a mass gathering of people in South Korea, and travel to a well-known pilgrimage city in Iran, contributed to the rapid and global spread of this deadly virus. Since then, the global dissemination of SARS-CoV-2 has increased, and this pathogenic organism has resulted in many deaths across numerous countries. Genetically, SARS-CoV-2 is closely related to SARS-CoV, which was highly lethal and caused several deaths in late 2002; however, after intensive health measures, SARS-CoV faded from public view.
In addition, another deadly coronavirus, MERS-CoV, showed an even higher fatality rate than SARS-CoV. Although the novel coronavirus (SARS-CoV-2) has lower mortality than the two previous coronaviruses, the transmission rate of SARS-CoV-2 is much higher than that of either SARS-CoV or MERS-CoV. 94 Additionally, SARS-CoV-2 possesses a longer incubation period than other viruses, such as the influenza virus. Both SARS-CoV and MERS-CoV were found at low levels in the upper respiratory tract (URT), whereas the opposite is true for SARS-CoV-2. For SARS-CoV-2, a high viral load can be detected in the URT, which declines after 5-6 days; the presence of the virus in the URT requires the isolation and quarantine of patients to prevent disease spread. 95, 96 Conversely, SARS-CoV loads peaked within 6-11 days after the onset of symptoms. These differences can explain the scenario that has resulted in the pandemic situation. 97, 98 Recently, a new variant of SARS-CoV-2 has been reported in England, which is estimated to be approximately 70% more transmissible than the previous strain. Importantly, evidence has suggested that the new strain might be associated with a higher mortality rate compared with other variants. 99 The newly identified strain contains eight mutations in the S protein of SARS-CoV-2, one of which increases the chance of interaction with ACE2. 100 Additionally, in South Africa, another SARS-CoV-2 variant has emerged, which is associated with an increased rate of infection and a higher mortality rate. Further, the South Africa strain is less effectively neutralized by convalescent patients' plasma. Another novel variant that has emerged in the USA features a mutated leucine residue at position 452, which affects the binding of the S protein to certain monoclonal antibodies. 100 The development of vaccines for coronaviruses has generally been considered a low priority because most coronaviruses cause mild diseases.
Although several vaccine candidates were tested pre-clinically to address SARS-CoV infections, vaccine development was halted after the virus was eliminated from the human population, and no cases of SARS-CoV in humans have been reported since 2004. 101, 102 In addition, vaccines against MERS-CoV are currently under development. Previous studies have clearly identified the antigenic target for both SARS-CoV and MERS-CoV. 103, 104 Both SARS-CoV and MERS-CoV, in addition to other coronaviruses, encode an S protein, which is responsible not only for binding with the host receptor but also for the fusion of the virus with the cell membrane. 105 The S protein of SARS-CoV-2 was identified as being antigenic and was targeted for the early development of vaccines against SARS-CoV-2. However, a study by Zheng et al. reported that 24.5% of the amino acid residues in the SARS-CoV-2 S protein are not conserved relative to the S protein of SARS-CoV, which might be responsible for the antigenic differences between the two viruses. 106 The N protein of SARS-CoV-2 was previously documented as being responsible for viral replication and the pathogenesis of SARS-CoV-2. 107 Another study predicted that the E protein of SARS-CoV-2 was more antigenic than other structural proteins. 108 Furthermore, the N protein interacts with the M protein in the infected cell lipid membrane, forming a vesicular bilayer structure, which further interacts with the host proteins. 109, 110 These findings have led to the hypothesis that the SARS-CoV-2 structural proteins might be responsible for various immunogenic responses. Thus, in the current study, we sought to predict a peptide-based vaccine candidate against SARS-CoV-2 by utilizing the organism's structural proteins. In our previous two studies, we successfully predicted epitopes from two structural proteins, the S and N proteins.
36, 37 In this study, we also considered the other two structural proteins, the E and M proteins, to further predict antigenic peptides. The present study focused on designing a multi-epitope vaccine, which has greater immunogenic potential compared with classical and single-epitope-based vaccines. Furthermore, multi-epitope vaccines possess several unique characteristics, including the ability to present multiple HTL, CTL, and B-cell epitopes, allowing for the induction of both cellular and humoral immune responses. Likewise, such a vaccine comprises multiple HLA-restricted epitopes, which can be easily recognized by several T-cell receptors, and epitopes from different immunogenic proteins, broadening its coverage of the target organism.

First, we downloaded the SARS-CoV-2 E protein sequences from the UniProt database and the M protein sequences from the NCBI protein database. The sequences were then submitted to the VaxiJen server, and the sequences of both the E and M proteins with the highest antigenicity scores were used for further analysis. Based on the highest combinatorial scores and MHC class I binding scores, four epitopes were chosen from the E protein and ten epitopes were selected from the M protein, and the antigenicity, allergenicity, and toxicity of these selected epitopes were determined. After screening, one CTL epitope, VSLVKPSFY, and three additional epitopes, LVGLMWLSY, AGDSGFAAY, and ATSRTLSYY, were selected for further multi-epitope study. The results from the MHC I interaction analysis showed that two epitopes from the E protein sequence interacted more strongly with two MHC I alleles, HLA-B*15:25 and HLA-B*15:01. In the M protein, two of the selected CTL epitopes interacted with HLA-B*35:01 with greater affinities. Although both M protein epitopes interacted with HLA-A*29:02, we only considered those HLA molecules whose structures are available in the PDB.
Therefore, for the molecular docking studies, HLA-B*15:25 was selected for docking with the epitopes from the E protein, and HLA-B*35:01 was selected for the epitopes from the M protein. Although the molecular docking simulation results showed that the chosen epitopes were able to significantly interact with the targeted MHC I alleles, the binding affinities were lower than those of the positive controls selected for the study. However, the CTL epitopes selected from the E protein formed significantly more hydrogen bonds with HLA-B*15:01 than the control epitope. Only AGDSGFAAY from the M protein interacted weakly with HLA-B*35:01, forming a single hydrogen bond with Tyr7, which was fewer than were formed by another CTL epitope from the M protein or by the positive control.

We also predicted HTL epitopes from both the E and M proteins of SARS-CoV-2. Importantly, our previous study of the SARS-CoV-2 N protein did not analyze HTL epitopes; we, therefore, also predicted HTL epitopes from the N protein in the current study. We also analyzed the HTL epitopes obtained from our previous study of the SARS-CoV-2 S protein. We considered the antigenicity, allergenicity, toxicity, and IL-4- and IL-10-inducing capabilities during the selection of HTL epitopes from these four structural SARS-CoV-2 proteins. The selected epitopes were screened for further analysis.

We predicted B-cell epitopes from the SARS-CoV-2 E and M proteins and considered selected B-cell epitopes from the previous two studies. In the current study, we analyzed the antigenicity, allergenicity, and toxicity of the selected B-cell epitopes. The selected epitopes were further considered for the multi-epitope vaccine.

The selection of pertinent antigenic epitopes from the targeted proteins for inclusion in a multi-epitope vaccine is crucial for the design of such vaccines.
In the current study, the antigenicity, allergenicity, toxicity, and degree of conservation of the epitopes were determined. Afterward, all targeted CTL, HTL, and B-cell epitopes were attached using the desired linkers, which were integrated to increase the stability, expression patterns, and folding capacity of the vaccine candidate. 111 The EAAAK linker was used to attach the adjuvant to the CTL epitopes, an arrangement associated with inducing a higher degree of cellular and humoral immune responses. 112 The attachment of the EAAAK linker to the adjuvant increases the stability and longevity of the vaccine construct. 113 After connecting the CTL, HTL, and B-cell epitopes with their respective linkers, we developed a final vaccine construct that contained 575 amino acid residues. The antigenicity and allergenicity of the final vaccine construct were determined, which showed that the construct was antigenic and non-allergenic, indicating that it could serve as a potential vaccine. The molecular weight was determined to be approximately 62 kDa, which is an average molecular weight. The theoretical pI was calculated as 9.91, which demonstrated that the vaccine construct was basic in nature. Solubility is considered to be an important characteristic of a recombinant vaccine construct. 114 We predicted the solubility of the constructed vaccine when expressed by a host E. coli strain and found that the vaccine construct was soluble inside the host E. coli. After predicting the 3D structure of the constructed vaccine, we identified a significant Z-score, and most amino acid residues in the Ramachandran plot were in the favored region.

Molecular docking studies were used to predict the interaction between the vaccine construct and the viral glycoprotein-binding TLR4. The analysis resulted in a global binding score of −33.27 kJ mol⁻¹, indicating that the vaccine construct interacted well with TLR4 on the cell surface.
TLRs belong to a family of conserved pattern recognition receptors (PRRs) that function to recognize the specific patterns of pathogens, including viruses, bacteria, or fungi, and are capable of distinguishing between self and non-self materials. 115 As many as 10 TLRs have been found in the human gene database, primarily characterized as transmembrane proteins with leucine-rich repeats in the extracellular domain. The activation of TLRs by specific ligands eventually results in the production of cytokines and the upregulation of MHC molecules, which ultimately links the innate immune response with adaptive immune cells. 116 The cytoplasmic region of TLRs is similar to that of the IL-1 receptor family; however, the extracellular portions of TLRs are structurally different. A critical function of TLR4 is the recognition of microbial lipopolysaccharide (LPS), which is a potent immune activator. 117 In addition, TLR2 recognizes LPS from non-enterobacterial origins, which are structurally different from the LPS that is recognized by TLR4. 118

Additionally, we performed a molecular dynamics simulation, which is a powerful tool for assessing a protein's physical structure and for the functional analysis of large macromolecules. To perform dynamics studies of the vaccine construct, 150 ns molecular dynamics simulations were performed. The results were analyzed based on the RMSD and RMSF scores, the Rg score, the SASA profile, and the hydrogen bond analysis. The RMSD value primarily denotes different conformations of the targeted molecular system. In the present study, the RMSD value indicated the structural stability of the vaccine construct. Despite a sharp increase at the start of the simulation, the RMSD value stabilized and remained stable for the rest of the simulation. Additionally, the RMSF value indicated the rigidity of the vaccine construct. The vaccine candidate's robustness and stability were also evaluated using the Rg score, hydrogen bond analysis, and SASA profile.
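The trajectory descriptors named above can be illustrated with a minimal Python sketch. This is not the analysis pipeline used in the study; the function names and the plain list-of-coordinates input format are assumptions made for illustration, and the frames are assumed to be already superimposed on the reference structure (no fitting is performed here).

```python
import math

def rmsd(coords, reference):
    """RMSD between two conformations given as lists of (x, y, z) tuples.

    Assumes both structures have the same atoms in the same order and
    that they are already superimposed.
    """
    if len(coords) != len(reference):
        raise ValueError("structures must contain the same number of atoms")
    total = sum((a - b) ** 2
                for p, q in zip(coords, reference)
                for a, b in zip(p, q))
    return math.sqrt(total / len(coords))

def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration (Rg); unit masses if none are given."""
    if masses is None:
        masses = [1.0] * len(coords)
    m_tot = sum(masses)
    center = [sum(m * p[k] for m, p in zip(masses, coords)) / m_tot
              for k in range(3)]
    total = sum(m * sum((p[k] - center[k]) ** 2 for k in range(3))
                for m, p in zip(masses, coords))
    return math.sqrt(total / m_tot)

def rmsf(trajectory):
    """Per-atom RMSF over a trajectory given as a list of frames."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames
                for k in range(3)]
        # Mean squared displacement of atom i from its average position.
        msd = sum(sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
                  for frame in trajectory) / n_frames
        values.append(math.sqrt(msd))
    return values
```

In this reading, an RMSD curve that plateaus after an initial rise, low per-residue RMSF values, and a steady Rg are the signatures of a compact structure that holds its conformation over the simulation.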
We also evaluated a control in this study, which is currently under clinical investigation. The molecular dynamics simulation performed for the control indicated that the control had less uniformity than our predicted vaccine construct.

When designing a multi-epitope candidate, successful cloning and expression in a suitable vector is a crucial step. In silico cloning is considered to be a highly useful protocol in the field of biotechnology that can be applied prior to performing in vitro experiments using a vaccine construct. In silico protocols reduce human-introduced errors and are less time-consuming and more cost-effective than other methods. 119 Several recent studies have already reported that the in silico cloning approach has significant applications in the fields of microbiology, molecular biology, and biotechnology and can yield a full or partial complementary DNA (cDNA) sequence. 120, 121 In addition, the secondary structure of RNA, particularly in the untranslated regions of the genome, has been demonstrated to be involved in various processes throughout the life cycle of the pathogen. Understanding the specific information that viruses use for replication will increase the chances of identifying the basic biological features that are necessary for pathogenic proliferation, which are crucial for designing vaccines. 122 Therefore, in the current study, in silico cloning was performed to validate the vaccine construct's expression and translation in the expression vector pET-28a(+). Because the vaccine construct consisted of CTL, HTL, and B-cell epitopes, it could play a crucial role in inducing host immune responses, which may lead to the activation of several immune cells through complex signaling. The development of multiple immunoinformatics tools has paved the way for designing and developing an epitope-based vaccine in a cost-effective manner within a short period of time.
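The codon-optimization criteria reported in this study (a GC content of 30%-70% and a CAI value greater than 0.8) can be expressed as a short Python sketch. It does not reproduce JCat's algorithm: the function names are hypothetical, and the `cai` helper assumes that the caller supplies a table of relative adaptiveness values, which in practice would be derived from highly expressed genes of the host.

```python
import math

def gc_content(seq):
    """Return the GC content of a nucleotide sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)

def cai(seq, weights):
    """Codon adaptation index: geometric mean of per-codon weights.

    `weights` maps codon -> relative adaptiveness in (0, 1]. This table is
    an input assumed by the sketch; codons absent from it are skipped.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons")
    return math.exp(sum(logs) / len(logs))

def meets_expression_criteria(gc_percent, cai_value,
                              gc_range=(30.0, 70.0), cai_min=0.8):
    """Apply the thresholds quoted in the text (defaults are those values)."""
    return gc_range[0] <= gc_percent <= gc_range[1] and cai_value > cai_min

# Values reported for the optimized vaccine sequence.
print(meets_expression_criteria(50.49, 0.9485))  # True
# A 575-residue construct needs 575 codons (stop codon not counted).
print(575 * 3)  # 1725
```

The second print is a simple consistency check: 575 residues correspond to 1725 nucleotides, which matches the reported length of the optimized sequence.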
Because viruses can activate both cellular and humoral immunity, we combined a potential set of T-cell and B-cell epitopes from the major structural proteins of SARS-CoV-2 to construct a multi-epitope vaccine in this study. After evaluating the antigenicity, immunogenicity, and allergenicity of these epitopes, different linkers were applied to join the selected epitopes and present them to T-cell and B-cell receptors. Importantly, our vaccine construct strongly bound to TLR4, suggesting that a robust immune response can be activated upon novel coronavirus infection. The vaccine construct demonstrated structural stability, less flexibility, and more rigidity in the molecular dynamics simulation, with a lower chance of deformation in the immune simulation study than the positive control. The vaccine construct was also predicted to be highly expressed in the E. coli host and could be successfully inserted into the pET-28a(+) plasmid vector in silico, which indicates that our construct may serve as a potential vaccine candidate. Despite showing significant results in the in silico analyses, our preliminary design requires further experimental verification to assess the engineered vaccine's effectiveness.

The datasets supporting the conclusions of this study are included within the article (and its additional files).

The authors extend their appreciation to the Deanship of Scientific Research at King Saud University for funding this work through research group no. (RG-1441-398).