key: cord-1009813-ua0ikrt1 authors: Ismail, Saba; Ahmad, Sajjad; Azam, Syed Sikander title: Immunoinformatics characterization of SARS-CoV-2 spike glycoprotein for prioritization of epitope based multivalent peptide vaccine date: 2020-06-17 journal: J Mol Liq DOI: 10.1016/j.molliq.2020.113612 sha: 10e5c71b0258dfe9d87052205a0dada327fb1c6e doc_id: 1009813 cord_uid: ua0ikrt1 The COVID-19 pandemic caused by SARS-CoV-2 is a public health emergency of international concern and thus calling for the development of effective and safe therapeutics and prophylactics particularly a vaccine to protect against the infection. SARS-CoV-2 spike glycoprotein is an attractive candidate for a vaccine, antibodies, and inhibitors development because of the many roles it plays in attachment, fusion and entry into the host cell. In the present investigation, we characterized the SARS-CoV-2 spike glycoprotein by immunoinformatics techniques to put forward potential B and T cell epitopes, followed by the use of epitopes in construction of a multi-epitope peptide vaccine construct (MEPVC). The MEPVC revealed robust host immune system simulation with high production of immunoglobulins, cytokines and interleukins. Stable conformation of the MEPVC with a representative innate immune TLR3 receptor was observed involving strong hydrophobic and hydrophilic chemical interactions, along with enhanced contribution from salt-bridges towards inter-molecular stability. Molecular dynamics simulation in aqueous milieu aided further in interpreting strong affinity of the MEPVC for TLR3. This stability is the attribute of several vital residues from both TLR3 and MEPVC as shown by radial distribution function (RDF) and a novel axial frequency distribution (AFD) analytical tool. Comprehensive binding free energies estimation was provided at the end that concluded major domination by electrostatic and minor from van der Waals. Summing all, the designed MEPVC has tremendous potential of providing protective immunity against COVID-19 and thus could be considered in experimental studies. In December 2019, a new strain of coronavirus emerged in Wuhan city of Hubei province in China and has since disseminated globally. The virus belongs to clade B of family Coronaviridae in the order Nidovirales, and genera Betacoronavirus and caused pulmonary disease outbreak [1, 2] . It is positive-sense RNA, enveloped and non-segmented virus and named as SARS-CoV-2 as it shares 82% genome homology with SARS coronavirus (SARS-CoV) [3, 4] . SARS-CoV-2 causes coronavirus disease-19 and evidence suggest a zoonotic origin of this disease [5] . Though the zoonotic transmission is not completely understood but facts provide the ground that it proliferates from the seafood market Huanan in Wuhan and human-to-human transmission resultant into the exponential increase in number of cases [6, 7] . As of May 12, 4, 320, 202 cases are reported worldwide with 291,545 deaths and 1,570,090 recovered. Among the active cases, 2,458,567 are currently infected, 2,412,235 (98%) are in mild conditions and 46,332 (2%) are seriously ill. Among the 1,861,635 closed cases, 1,570,090 (84%) are recovered whereas 291,545 (16%) die. On March 11, the World Health Organization (WHO) affirmed COVID-19 as a pandemic (https://www.worldometers.info/coronavirus/). SARS-CoV-2 utilizes a highly glycosylated, homotrimeric class I viral fusion spike protein to enter into host cells [8] . This protein is found in a metastable pre-fusion state which goes through structural readjustments facilitating fusion of the viral membrane to the host cell [9] [10] [11] . The binding of S1 subunit to the host angiotensin converting enzyme (ACE) initiates the fusion process and during this event the pre-fusion trimeric structure disrupts resulting in S1 subunit dispersion and stabilization of the S2 subunit to achieve a post-fusion conformation [12] . The receptor-binding domain (RBD) of S1 unit adopts a hinge-like conformation to temporarily hide or expose receptor binding residues for interaction with the host cell receptor [11] . Down and up conformation states are recognized where former is related to the receptor inaccessible state and the later one explains receptor accessible state and considered as less stable [13] [14] [15] [16] . This critical role of the spike protein makes it an important candidate for neutralization by antibodies, and detailed atomic level studies of the pre-fusion spike structure are important in the design and development of a vaccine [17] [18] [19] [20] [21] . Current data indicates that both SARS-CoV-2 and SARS-CoV spike share the same ACE2 as a binding receptor [22, 23] . Interestingly, ACE2 binds to SARS-CoV-2 spike with ∼15 nM affinity, about 10-20 folds higher than ACE2-SARS-CoV spike binding [24] . One possible reason for SARS-CoV-2 human-to-human transmission is SARS-CoV-2 spike's high affinity for human ACE2 [25] . Series of cellular immune and humoral responses can be triggered by SARS-CoV-2 infection [26] . Immunoglobulin G (IgG) and IgM can be noticed after 2 weeks of infection onset which are specific antibodies to SARS-CoV-2. High titers of neutralizing antibodies and SARS-CoV-2 specific cytotoxic T lymphocyte responses have been identified in the patients who cleared the SARS-CoV-2. This phenomenon clearly suggests that both cellular and humoral immune reactions are vital in clearing the SARS-CoV-2 infection [26] [27] [28] [29] [30] . The study presented, herein, is an attempt to get insights about antigenic determinants of SARS-CoV-2 spike glycoprotein and highlight all antigenic epitopes [31] of the spike that can be used specifically for the design of a multi-epitope peptide vaccine construct (MEPVC) [32] to counter COVID-19 infections. The epitopes predicted by immunoinformatics techniques were fused together as well as to β-defensin adjuvant [33, 34] to boost the antibody production and longlasting immunological responses. Blind docking protocol was implemented to gain stable J o u r n a l P r e -p r o o f 4 MEPVC binding conformation with respect to a representative TLR3, an important innate immune receptor in viral immunity [35] . The docked solution was further utilized in dynamics simulations to study the structural dynamics and stability of the complex [36] . Finally, to confirm intermolecular interactions and point all noteworthy residues crucial for intermolecular stability, binding free energies [37] for the system were estimated. In a nutshell, this study reflects excellent outcomes for experimental vaccinologists to develop a vaccine for prevention from pandemic COVID-19 infections. Complete flow of the in silico designed methodology for MEPVC targeting SARS-CoV-2 spike is demonstrated in Fig.1. The spike glycoprotein amino acid sequence was retrieved from NCBI SARS-CoV-2 data hub and considered first in the epitope mapping phase, where T cell epitopes derived from B cell were predicted using immune epitopes database (IEDB) [38] . Linear B cell epitopes were mapped using Bepipred Linear Epitope Prediction 2.0 [39] and those with score greater than 0.5 were subjected to T cell epitopes identification step. The epitopes were projected for association with reference set of major histocompatibility complex (MHC): MHC class I [40] and MHC class II [41] alleles sorted on percentile score basis. Epitopes with lowest percentile score are strong binders and were considered only. The selected epitopes were then used in MHCPred 2.0 [42] to decipher their binding affinity potential for predominant HLA II DRB*0101 and only those with IC 50 value < 100 nM were categorized as excellent DRB*0101 binders [43] . VirulentPred [44] was employed next to reveal virulent nature of the epitopes setting the cut-off to 0.5. Antigenic epitopes were highlighted by VaxiJen 2.0 [45] . Allergenic epitopes were discarded through AllerTop 2.0 [46] and toxic potential of non-allergic epitopes was evaluated via ToxinPred [47] . The non-toxic epitopes were lastly investigated for their ability to induce IFN-γ using an IFN epitope server [48] . Conservation across the world population of the final set of epitopes was done through IEDB epitope conservation analysis tool [49] . All filtered epitopes were linked together through AAY linkers [50] to design a multi-epitope peptide (MEP). The resultant peptide was further linked to an immunological β-defensin (an adjuvant) to construct a MEPVC and in this way, immunogenicity can be enhanced. The physicochemical properties of designed MEPVC were predicted by ProtParam tool [51] of EXPASSY server. The three dimensional (3D) structure of the MEPVC was modeled ab initio by 3Dpro of SCRATCH protein server [52] . Following, loop modeling was done in the 3D structure of MEPVC via GlaxyLoop [53] from GlaxyWeb and subsequently refined through GalaxyRefine [54] . Disulfide engineering was applied to the MEPVC refined model via Design 2.0 [55] as disulfide bonds strengthen structure stability. The MEPVC sequence was translated reversibly for optimization of codon usage according to Escherichia coli K12 expression system in order to get high expression rate [56] . For this, Java Codon Adaptation Tool (JCat) [57] was employed and expression rate of the cloned MEPVC was measured by codon adaptation index J o u r n a l P r e -p r o o f 5 (CAI) value. SnapGene (https://www.snapgene.com/) was used to clone the optimized MEPVC cDNA into pET-28a (+) expression vector. Immunogenic potential of the MEPVC was conducted in silico using the C-ImmSim server [58, 59] . The server used machine learning techniques along with position-specific scoring matrix (PSSM) for estimate of the human host immune system response to the antigen. The immune system responds from three sites: bone marrow, lymph nodes and thymus. The input parameters for the immune simulations are as follows: number of steps (100), volume (10), random seed (12345), HLA (A0101, A0101, B0702, B0702, DRB1_0101, DRB1_0101), number of injection set to 1. All remaining parameters were treated as default. The MEPVC affinity for an appropriate immune receptor as an agonist was checked in the step of molecular docking [60] . TLR3 available under PDB id of 1ZIW was retrieved and used as a receptor molecule. TLR3 also named CD283 is a pattern recognition receptors (PPRs) protein and is a transmembrane [61] . It detects viral infection associated dsRNA and evokes the activation of interferon regulatory transcription factor (IRF3) and Nuclear Factor kappa lightchain enhancer (NF-kB) [62] . Unlike other TLRs, TLR3 uses TIR-domain containing adapter inducing interferon-β (TRIF) as a primary adapter [63] . IRF3 eventually induces the development of type I interferons leading to the activation of both innate and adaptive immune systems [64] . The TLR3 receptor and MEPVC were used in a blind docking approach through an online PATCHDOCK server interface [65] . The interacting molecules were docked according to the shape complementarity principle. The clustering RMSD is allowed to default 4.0 Å. The output docked solutions were refined for interactions with Fast Interaction Refinement in Molecular Docking (FireDock) server [66] . The refined complexes were examined and one with lowest global energy was considered as top ranked. The opted complex was subjected to in-depth MEPVC conformation visualization with respect to the TLR3 using UCSF Chimera 1.13.1 [67] . MD simulation was applied to the selected top complex for 100-ns to understand complex dynamics and stability for practical applications. This assay was categorized into three phases: (i) parameters file preparation (ii) pre-processing, and (iii) simulation production [68] . In first phase using an antechamber module of AMBER16 [69] , complex libraries and set of parameters for TLR3 and MEPVC were generated. The complex system was solvated into 12 Å TIP3P solvation box achieved through Leap module of AMBER. The intermolecular and intramolecular interactions of the system were determined by ff14SB force field [70] . Counter ions in the form of Na + were added to the system for charge neutralization. In the system pre-processing stage, complex energy was optimized through several rounds: minimization of complete set of hydrogen atoms for 500 steps, solvation box energy minimization for 1000 steps with 200 kcal/mol -Å 2 restraint on the remaining system, minimization of complete set of system atoms again for 1000 steps with 5 kcal/mol -Å 2 applied restraint on carbon alpha atoms, and 300 minimization steps on non-heavy atoms with restraint of 100 kcal/mol -Å 2 on other system J o u r n a l P r e -p r o o f 6 components. The complex system then underwent a heating step where the complex was heated gradually to 300K through NVT ensemble, maintained through Langevin dynamics [71] and SHAKE algorithm [72] for restraining hydrogen bonds. Complex equilibration was achieved for 100-ps. System pressure was maintained using NPT ensemble allowing 5 kcal/mol -Å 2 energy restraint on Cα atoms. In the simulation production, trajectories of 100-ns were produced on time scale of 2-fs. Non-bounded interactions were differentiated by describing cut-off distance of 8.0 Å.CPPTRAJ module [73] was lastly used for statistical computation of different structure parameters to probe complex stability. The MD simulation frames were analyzed and visualized in Visual Molecular Dynamics (VMD) 1.9.3 [74] . Axial frequency distribution or simply AFD [75] is a novel analytical technique run on simulation trajectories to access ligand 3D conformation with respect to a reference receptor atom. Such local structural movements are not captured by any other available technique. AFD is mathematically presented by Eq.I, ……………………………………………………………………… Eq.I where, i and j are ligand atom coordinates on the X and Y axis with cut-off value k and l, respectively. The mi,j sums interactions frequency that falls in the coordinate (i,j). The interaction energy and solvation free energy for TLR3 receptor, MEPVC, TLR3-MEPVC complex were calculated utilizing the MMPBSA.py module [76] of AMBER16. Also, an average of the above was estimated as system net binding free energy. The binding free energy was computed through the MM-PBSA method and its counterpart MM-GBSA with objective to derive the energy difference between bound and unbound states of the same system [77] . Mathematically, the free energy can be calculated through Eq.II, ΔG bind,solv = ΔG bind, vaccum + ΔG solv, TLR3-MEPVC -(ΔG solv, MEPVC + ΔG solv, TLR3 )……………Eq.II For all three states of the system, the solvation energy was estimated by solving either Generalized Born (GB) or Poisson Boltzman (PB) equation and thus it will give electrostatic energy. It also allows the addition of empirical terms for hydrophobic contributions as shown in Eq.III. The SARS-CoV-2 spike protein was targeted for MEPVC designing because of the many filters it fulfilled and required for a potential vaccine candidate. First, it does not share any significant homology to the human host, and as such chances of autoimmune responses are negligible [78] . Second, the protein is also not found to have any sequence identity to the mouse proteome and thus accurate immunological findings can be deciphered from in vivo mice experimentations [79] . This spike protein only harbored one transmembrane helix ensuring the wet lab protein cloning and expression for antigen analysis easy [80] . Antigenicity is another factor that makes this candidate highly suitable for vaccine designing as this allows efficient binding to the products of host immune system [81] . Further, this protein is strongly adhesive which makes it an excellent target for creation of adhesion based vaccine [82] . Lastly, all the sequences of SARS-CoV-2 spike protein are highly conserved thus a vaccine based on its sequence will be highly likely to have broad spectrum immunological implications [83] . Prioritization of potential epitopes for the SARS-CoV-2 spike protein commenced with the mapping of B cell epitopes that predicted total of 34 epitopes of variable length ranging from 1 to 62 (S- Table 1 ). The average score predicted for these B cell epitopes is 0.470, maximum (max) of 0.696 and minimum (min) of 0.188 (S- Fig.1 ). Each B cell epitope was then analyzed in MHC-1 alleles binding regions prediction [84] . A set of reference alleles to which these epitopes interact with are: HLA The MHC-II predicted epitopes were also filtered on basis of percentile score and then cross checked with the selected MHC-I allele and those common in both classes were considered only which were 50 in numbers. The shortlisted common MHC-I and MHC-II epitopes then subjected to antigenicity check. In this check, ability of the filtered B cell derived T cell epitope's to evoke and bind to products of adaptive immunity was evaluated. This yielded 38 epitopes all of which have strong ability to bind to the most prevalent DRB*0101 with average IC 50 score of 35.6552, max of 98 and min of 0.89.The antigenic epitopes then underwent allergenicity check to discard allergic peptides that may cause allergic reactions [86] . This resulted into 31 epitopes. Non-toxic epitopes were 7 whereas 6 were IFN-gamma producer (Fig.2) . The set of J o u r n a l P r e -p r o o f 10 epitopes obtained at different stages of epitope mapping phase is tabulated in S- Table 2 .These epitopes appear to provide coverage to 98% of the world population (Fig.3) Prioritized T cell epitopes derived from B cells were fused together tandemly by AAY linkers to make a MEP. AAY linker avoids formation of junctional epitopes and as such enhance epitope presentation [87] . To the N-terminal of MEP, EAAAK linker was added to attach βdefensin as an adjuvant leading to the design of a MEPVC. The MEPVC is schematically shown in Fig.4A . MEPVC offers many advantages compared to a separate antigenic peptide. Such vaccines induce both CD4+ and CD8+ responses and the antigens optimization is optimal. EAAAK is a rigid spacer and allows separation of the attached domain and promoting efficient immune processing of the epitopes [88] . β-defensins are potent immune adjuvants as they are capable of significantly enhancing production of lymphokines resulting in antigen-specific Ig production and T cell-dependent cellular immunity. The sequence of MEPVC is GIINTLCKYYCRVRGGRCCVCSCCPKEEQIGKCSTRGRKCCRRKKECAAKYAWNRK CISACYIAPGQTGKICCYPRRARSVCSACYTVYDPCQPCAAYVYDPLCPELCAYCKN HTSCDV. Physicochemical properties of the MEPVC were evaluated in order to assist experimentalists in the field to set up in vitro and in vivo experiments accordingly. The length of MEPVC is spanned across 110 amino acids and has molecular weight of 13.30 kDa. Vaccine construct with weight less than 110 kDa is generally believed to effective vaccine target because of its easy purification. Theoretical pI of MEPVC is 9.8 and aids in locating MEPVC on 2D gel. MEPVC aliphatic index is 69.08 projecting the vaccine thermostable at different temperatures. The total number of negatively charged and positively charges residues are 8 and 22, respectively. The grand average of hydropathicity (GRAVY) score computed for the MEPVC is -0.545, illustrating hydrophilic nature of the protein and is likely to interact with water molecules. The estimated half-life of MEPVC in mammalian reticulocytes, in vitro is J o u r n a l P r e -p r o o f 30 hours, yeast, in vivo is greater than 20 hours, and E. coli, in vivo higher than 10 hours. The antigenicity of the MEPVC was cross-checked and predicted highly antigenic with value of 0.69. Total entropy of the protein is 17.0170 which is considered ideal and also the vaccine has no transmembrane helices (alpha helical transmembrane protein, 0.0474783 and beta barrel transmembrane protein, 0.0060384) hence no difficulties can be anticipated in cloning and expression analysis. The predicted solubility upon overexpression of MEPVC is 0.965751 reflecting higher solubility of MEPVC. The 3D model of the MEPVC was constructed using ab initio 3Dpro predictor as no appropriate template was available for homology modeling and threading methods. The 3D structure of MEPVC is shown in Fig.4B . The structure secured 85.4% of residues in the Ramachandran favored, 12.6% in additionally allowed, 1.9% in generously allowed, and 0% in disallowed region. As the predicted MEPVC unit has number of loop regions that need to be modeled properly before moving forward. In total, five sets of residues: Alas7-Lys32, Ile63-Gly69, Cys73-Arg77, Thr87-Pro102, and Asn113-Val119 were loop modeled. The loop modeled structure increased the Ramachandran favored residues percentage to 92.3%, residues in allowed region reduced to 6.7%, residues of generously allowed region to 1.0% and disallowed remained to 0%. The structure was subjected to structure perturbations and relaxations to obtain a refined model. Among the generated structures (S- Table 3 ), the first model was selected as it has improved Rama favored score, lowest stable galaxy energy of 0.96, improved clash score of 23.1 and good MolProbity value. Similarly, the structure lacks poor rotamers in contrast to the original structure. The Ramachandran statistics for the refined 3D structure are in following order: Ramachandran favored residues (93.2%), additionally allowed region (5.8%), generously allowed region (1.0%) and disallowed region (0 %). The Z-score of the refined MEPVC is -4.3 and within the score range of the same size protein in structure databases (S- Fig.2) . In the Ramachandran plot, the torsion angles are shown by black squares dispersed across the core secondary structures (colored as red). The allowed regions can be understood by color yellow, generously allowed by pale yellow and disallowed by white region. The top right, top left, bottom right and bottom left represent quadrants for left handed alpha helices, beta sheets, right handed alpha helix, and no elements, respectively. Further, disulfide engineering of the MEPVC was performed in order to optimize molecular interactions and confer considerable stability by attaining precise geometric conformation [89, 90] . Eight pairs of residues were selected to be replaced with cysteine amino acids. These pairs are: Gln7-Ala19 (χ3 angle,+118, energy value, 4.20 kcal/mol), Cys18-Leu21 (χ3 angle,+84.35, energy value, 3.69 kcal/mol), Lys44-Ala47 (χ3 angle,+74.17, energy value,5.59 kcal/mol), Arg57-Ala61 (χ3 angle,+122.71 , energy value,6.14 kcal/mol), Ala72-Ala85 (χ3 angle,-62.92, energy value,4.40 kcal/mol), Ala73-Ala82 (χ3 angle, , energy value, kcal/mol), Leu92-Glu95 (χ3 angle, -102.37 , energy value, 3.86 kcal/mol), and Phe111-Pro117 (χ3 angle,-96.20 , energy value,4.14 kcal/mol). These residues have either higher energy level i.e. > 2 kcal/mol or χ3 angle out of range (< −87 and > + 97) were selected on purpose to make them stable. The original and disulfide mutant MEPVC structures are shown in Fig.5 . The primary purpose of in silico cloning of the MEPVC was to guide molecular biologists and genetic engineers about the possible cloning sites and predicted levels of expression in a specific expression system for instance here in this study we used E. coli K12 system. Prior to cloning, reverse translation of the MEPVC sequence was conducted to have an optimized codon usage as per E. coli K12 to yield its max expression. The CAI value of the improved MEPVC sequence is 1 indicating ideal expression of the vaccine [91] . The GC content whereas is 53.2 % nearly to the E. coli K12 and range within the optimum ranged between 30 % and 70%. The cloned MEPVC is shown in Fig.6. . Both primary and secondary immune responses seem to play a significant contribution against the pathogen and may be compatible with the actual immune response. The in silico host immune system response to the antigen is shown in Fig.7 . High concentration of IgG +IgG and IgM was characterized at the primary response, followed by IgM, IgG1+ IgG2 and IgG1 at both primary and secondary stages with concomitant of antigen reduction. Additionally, robust responses of interleukins and cytokines were observed. All this suggests the efficient immune response and clearance of the pathogen upon subsequent encounters. Bioinformatic modeling driven molecular docking of the designed MEPVC to one representative innate immune response receptor (TLR3) was carried out in order to decode MEPVC potential of binding to the innate immune receptors. This was fundamental to understand as TLR3 is significant in recognition of virus associated molecular patterns and activation of type I interferons and NF-kappa B. The docking assessment predicted top 20 complexes sorted mainly on scoring function along with area size of interacting molecules, actual rigid transformation of complexes and desolvation energy. Following, the complexes were subjected to FireDock web server for refinement assay. This ease in discarding flexibility errors of the docking procedure and provides a deep refinement of the predictions thus limiting the chances of false positive docking calculations. According to the global energy, solution 8 was ranked as top with net global energy of -20.78 kJ/mol (Table 1 ). This energy is the output of -16.88 kJ/mol attractive van der Waals (vdW), 3.81 kJ/mol repulsive vdW, 8.14 kJ/mol atomic contact energy (ACE), and -0.93 kJ/mol hydrogen bond energy. The docked conformation and chemical interacting residues of the MEPVC with TLR3 is shown in Fig.8 . Visual inspection of the complex leads to observation of deep binding of the MEPVC at the center of TLR3 and favor rigorously hydrogen and weak van der Waals interactions with various residues of TLR3. Within 3 Å, the MEPVC was noticed to formed interactions with His39,Val55,Asn57,Asp81,Phe84,Val103,Asn105,Gln107,His108,Thr126,Glu127,Ser132,Hi s129,Thr151,His156,Gln174,GLu175,Lys200,Lys201,Glu203,Asn229,Ser256,Asp280,Ser28 2,Tyr283,Tyr302,Phe304,Tyr307,Lys330,Tyr383,Tyr326,Asn328,His359,Asn361, and Glu363. J o u r n a l P r e -p r o o f J o u r n a l P r e -p r o o f The stability of MEPVC with TLR3 was further investigated through MD simulations. The trajectories of MD simulations were used in vital statistical analysis to decode backbone stability and residual flexibility. Root mean square deviation (RMSD) [92, 93] was performed first that compute the average distance of backbone carbon alpha atoms of superimposed frames (Fig.9A) . The average RMSD for the system is 3.23 Å with max of 4.4 Å at 24-ns. An initial sudden change in RMSD can be seen up to 2.7-ns that may be due to adjustments adopted by the complex when exposed to dynamics forces and milieu. The second minor RMSD shift can be noticed between 22-ns to 26-ns. Afterward, the system is quite stable with no global and local conformational changes specified. To affirm this observation, the simulation period was extended to 100-ns (S- Fig.3) . As can be seen, the complex is attaining stability as simulation time proceeds and the average RMSD reduced to 3.0 Å. Next, root mean square fluctuations (RMSF) [94] was applied to the system trajectories (Fig.9B) . RMSF is the average residual mobility of complex residues from its mean position. Mean RMSF for the MEPVC-TLR3 complex calculated is 1.60 Å with max of 8.6 Å pointed at the N-terminal of the MEPVC. Most of the interacting residues of the MEPVC with TLR3 are subject to minor fluctuations, a fact in analogy to complex high stability. The thermal residual deviation was assessed afterward by beta-factor (β-factor) [95] , the outcomes of which are strongly correlated to the RMSF and hence further affirming system stability (Fig.9C) . The average βfactor of the system analyzed is 88.64 Ų with max of 1956.23 Ų. Lastly, we evaluated the compactness of the system by means of radius of gyration (Rg) [96] analysis (Fig.9D) . High Rg and low Rg illustrate the magnitude of system compactness and system less tight packing. It further tells us whether the system of interest in order or not. Highly compact system is an indication of system stability and vice versa. The mean Rg for our system is 55.8411 Å with max score of 74.884 reflecting higher ordered and compact nature of the system. J o u r n a l P r e -p r o o f Hydrogen bonds are dipole-dipole attractive forces and formed when a hydrogen atom bonded to a highly electronegative atom such as F, N, and O is attracted by another electronegative atom [97] . The strength of a hydrogen bond varies from 4 kJ to 50 kJ per mole. Hydrogen bonds are deemed vital in molecular recognition and provide rigidity in achieving stable conformation [98] . The frequency of hydrogen bonds in each frame of the MD simulation trajectories can be visualized in Fig.10A . These hydrogen bonds were extracted by means of VMD HBonds plugin and are 104 in number ( Table 2 ). The cut-off distance set is 3.0 Å and cut-off angle 20 degrees. Each residue pair may for one, two or more each of which is counted separately. The min, mean and max number of hydrogen bonds between MEPVC and TLR3 are 1, 5, and 12, respectively. The distribution and bonding pattern of intermolecular interactions of the MEPVC residues atom(s) with respect to the TLR3 were studied through radial distribution function (RDF) [94, 99, 100] . RDF mainly describes distance 'r' between two entities and is symbolized by g(r). The factor 'r' is extracted from simulation frames and range from o to ∞ [75] . The hydrogen bonds predicted by VMD were utilized in RDF that shown only 8 interactions between MEPVC and TLR3 with good affinity for each other. In these interactions, TLR3 residues (atoms) are Asp52:OD1, Gln78:HE21, Glu98:OE1, Asp124:OD2, Lys171:HZ3, Asp251:OD2, Glu323:OE2, and Glu328:OE1 are found to have strong radii distribution to their counterpart MEPVC residues (atoms): Arg705:HH11, Arg705: O, Lys679:HZ3, Arg706:HH12, Glu675:OE1, Asn633:HD22, Arg643:HH22, and Tyr638: HH, respectively. The RDF plots for the above said interactions are illustrated in Fig.10B . The interaction between Asp124-OD2 and Arg706-HH12 has a refined distribution pattern and highest density distribution among all. The max g(r) value for this interaction is 3.51 observed at distance range of 1 Å. This is followed by Glu328-OE1-Tyr638-HH with max g(r) value of 3.26 mostly interacting at distance range of 0.6 Å. The Glu323-OE2-Arg643-HH22 is also much refined having g(r) value of 1.98 and mostly interacts within distance range of 0.6 Å. The remaining interactions density distribution is not confined and vary considerably but important from MEPVC and TLR3 interaction point of view. Salt bridges are non-covalent in nature and the outcome of interactions between two ionized states [101] . These interactions comprised two parts: an electrostatic interaction and a hydrogen bond. In salt bridges, lysine or arginine typically behave as base where glutamine or aspartate as acid and the bridge is created when carboxylic acid group allows a proton migration to guanidine and amine group in arginine. Salt bridges are the strongest among all non-covalent interactions and contribute to a major extent in biomolecular stability [102] [103] [104] . In total, 17 salt bridges were identified between TLR3 and MEPVC within the cut-off distance of 3.2 Å as can be depicted from Fig.10C . The higher numbers of salt bridges were recorded for TLR3-Glu628 and MEPVC-Lys685. The mean number of salt bridges for this interaction is 18, max, 35, and min, 3. The count for other salt bridges from TLR3 to MEPVC is in following order: Asp124-Arg706 (mean, 3, max,7 and min,3), Glu98-Lys679 (mean, 5, max,10 and min,4), Asp402-Arg641 (mean, 12, max, 18 and min, 4), Glu146-Arg706 (mean,7 max,11 and min,3), Glu146-Lys679 (mean,5 max,13 and min,2), Glu272-Lys655 (mean,9 max,22 and min,2), Glu323-Arg643 (mean,4 max,8 and min,3), Glu323-Arg646 The vital hydrogen bond interactions involved between the TLR3 receptor and MEPVC shortlisted by VMD were subjected to a novel AFD analysis to elucidate 3D movements of MEPVC atoms with respect to a reference TLR3 residues atom in simulation time. To this objective, interactions mentioned in the RDF were used in AFD. Preliminary investigation suggested that only three interactions: TLR3-Asp52-MEPVC-Arg705, TLR3-Glu328-MEPVC-Tyr638, and TLR3-Glu323-MEPVC-Arg643 are mainly represented frequently and found in most of the simulation frames. The TLR3-Asp52-MEPVC-Arg705 is uncovered in 4997 frames, TLR3-Glu328-MEPVC-Tyr638 in 4988, and TLR3-Glu323-MEPVC-Arg643 in 4985 making these interactions ideal for interpreting density distribution of the interactions on XYZ planes and also appropriate for gaining ideas about conformational changes of the interacting atoms with respect to each other. As the local structure movements and rotations are responsible for functional shifts, their understanding in our system is important to be unveiled. For TLR3-Asp52-MEPVC-Arg705 (Fig.11A) , the density distribution is not uniform, dispersed and behave flexibility in affinity on all three axis for the receptor atom. Parallel, the strength of interaction is also observed affected due to these minor structural movements of the MEPVC residue atom. Though, the mentioned interaction depicts MEPVC is still within the vicinity of the TLR3 reference residue and enjoys this interaction flexibility with the said MEPVC residue during simulation. TLR3-Glu328-MEPVC-Tyr638 interaction (Fig.12B ) has less distribution area and has much higher intensity illustrating strong affinity of the interacting atoms for each other. It also gives an idea of the lesser movements of the atoms with respect to each other, an indication of a correct system conformation. The distribution area TLR3-Glu323-MEPVC-Arg643 is much dispersed though high intensity of the interaction can be seen in close vicinity (Fig.11C) . J o u r n a l P r e -p r o o f The ΔTOTAL free energy in both GB and PB models are revealed favorable MEPVC-TLR3 complex in solvation. The net GB and PB contribution for the MEPVC-TLR3 system is -53.81 kcal/mol and -89.02 kcal/mol, respectively. To this energy, a high contribution was noticed from gas phase energy (ΔG gas) compared to highly insignificant contributions from solvation energy (ΔG solv). In GB model, the ΔG gas energy for the system is -1889. 76 The net free energy of the simulated system was subjected to per residues and pairwise residues decomposition to point residues that contribute majorly in system stabilization and The binding free energy of the TLR3-MEPVC complex, TLR3 receptor, MEPVC, and the net system energy is further decomposed into 100 frames extracted from simulation trajectories (S- Fig.5 ). This information deemed vital in predicting the simulation time where higher intermolecular affinity was observed and can guide about the most suitable docked conformation. In general, the complex, receptor and construct energies are higher in PB compared to GB but for the total energy, the PB energies are quite lower for frames in contrast to GB. For the complex, the min, max and average binding energy reported are -67264.7 kcal/mol, -66901.5 kcal/mol, and -67069.5 kcal/mol, respectively in GB. Pair-wise energy contribution to the system total energy was accomplished in order to understand pair residues roles from both TLR3 and MEPVC in complex stability. We found that the Thr122 and Asp124 (-4.56 kcal/mol in GB and -5.45 kcal/mol in PB), Glu174 and Ser176 (-3.45 kcal/mol in GB and -3.77 kcal/mol in PB), Glu323 and Hie324 (-2.86 kcal/mol in GB and -3.99 kcal/mol in PB) of TLR3 receptor have high combined contribution to the total energy. In case of MEPVC, Asn633 and Thr634 (-3.21 kcal/mol in both GB and PB), Val642 and Arg643 (-5,87 kcal/mol in GB and -3.27 kcal/mol in PB) and Arg705 and Arg708 (-2.74 kcal/mol and 2.04 kcal/mol). Detail findings of entropy calculation are tabulated in Table 4 . The net value of -52.52 kcal/mol symbolizes the fact that microstate level of the system decreases when the TLR3 receptor and MEPVC bind to make complex. The decrease in the available microstate level also points towards MEPVC efficient trapping at the docked site of TLR3. Taken together, we characterized SARS-CoV-2 spike glycoprotein for antigenic peptides and proposed a MEPVC by means of several computational immunological methods and biophysical calculations. The outcomes of this study could save time and associated costs that go into experimental epitope targets study. The MEPVC is capable of activating all components of the host immune system, have suitable structural and physicochemical properties. Also, it seems to have very stable dynamics with TLR3 innate immune receptor and thus has higher chances of presentation to the host immune system. However, additional in vivo and in vitro experimentations are needed to disclose its real potential in the fight against COVID-19. No conflict of interest was reported by the authors. Authors are highly grateful to the Pakistan-United States Science and Technology Cooperation Program (Grant No. Pak-US/2017/360) for granting the financial assistance. Fig.1 . B cell epitopes prediction by IEDB Bepipred linear epitope prediction method. Yellow and green peaks are those predicted as epitopes and non-epitopes, respectively. Fig.2 . PROSA-Z energy plot for the MEPVC. Fig.3 . Extended simulation RMSD for TLR3-MEPVC complex. Fig.4 . Residue wise decomposition of net binding energy into TLR3 receptor and MEPVC interacting residues. Top (GB) and bottom (PB). Fig.5 . Binding energy decomposition per frame for TLR3-MEPVC complex (A), TLR3 receptor (B), MEPVC (C) and net energy (D). S- Table 1 . B cell epitopes predicted for the SARS-CoV-2 spike glycoprotein. S- Table 2 . Number of epitopes obtained at each step of epitope mapping phase. S- Table 2 . Top 5 refined model of the MEPVC. The input MEPVC is also provided. J o u r n a l P r e -p r o o f Highlights  SARS-CoV-2 spike glycoprotein is characterized in silico for the design of a multivalent vaccine.  The designed vaccine is producing high level of immunoglobulins, cytokines and interleukins.  The vaccine has a stable conformation with TLR3 innate immune receptor.  Radial distribution function and axial frequency distribution analysis highest several vital interacting residues.  Major electrostatic energy and minor van der Waals were observed in the complex system. COVID-19, an emerging coronavirus infection: advances and prospects in designing and developing vaccines, immunotherapeutics, and therapeutics A review of the 2019 Novel Coronavirus (COVID-19) based on current evidence Emerging coronaviruses: genome structure, replication, and pathogenesis The deadly coronaviruses: The 2003 SARS pandemic and the 2020 novel coronavirus epidemic in China Zoonotic origins of human coronaviruses Estimating clinical severity of COVID-19 from the transmission dynamics in Wuhan, China Novel, others, The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19) in China, Zhonghua Liu Xing Bing Xue Za Zhi= Zhonghua Liuxingbingxue Zazhi Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein Structure, function, and evolution of coronavirus spike proteins The coronavirus spike protein is a class I virus fusion protein: structural and functional characterization of the fusion core complex Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation Tectonic conformational changes of a coronavirus spike glycoprotein promote membrane fusion Cryo-electron microscopy structures of the SARS-CoV spike glycoprotein reveal a prerequisite conformational state for receptor binding Shi, others, Immunogenicity and structures of a rationally designed prefusion MERS-CoV spike antigen Lanzavecchia, others, Unexpected receptor functional mimicry elucidates activation of coronavirus fusion Cryo-EM structures of MERS-CoV and SARS-CoV spike glycoproteins reveal the dynamic receptor binding domains others, Characterization of novel monoclonal antibodies against MERScoronavirus spike protein Wu, others, Identification of the immunodominant neutralizing regions in the spike glycoprotein of porcine deltacoronavirus Development of epitope-based peptide vaccine against novel coronavirus 2019 (SARS-COV-2): Immunoinformatics approach Therapeutic options for the 2019 novel coronavirus (2019-nCoV) Receptor recognition by the novel coronavirus from Wuhan: an analysis based on decade-long structural studies of SARS coronavirus Greenough, others, Angiotensin-converting enzyme 2 is a functional receptor for the SARS coronavirus The novel coronavirus 2019 (2019-nCoV) uses the SARS-coronavirus receptor ACE2 and the cellular protease TMPRSS2 for entry into target cells Stabilized coronavirus spikes are resistant to conformational changes induced by receptor recognition or proteolysis others, A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster Perspectives on therapeutic neutralizing antibodies against the Novel Coronavirus SARS-CoV-2 Longitudinal profile of immunoglobulin G (IgG), IgM, and IgA antibodies against the severe acute respiratory syndrome (SARS) coronavirus nucleocapsid protein in patients with pneumonia due to the SARS coronavirus Disappearance of antibodies to SARS-associated coronavirus after recovery An efficient method to make human monoclonal antibodies from memory B cells: potent neutralization of SARS coronavirus Anti--spike IgG causes severe acute lung injury by skewing macrophage responses during acute SARS-CoV infection Peptide vaccine: progress and challenges Multi-epitope vaccines: a promising strategy against tumors and viral infections Avian antimicrobial peptides: the defense role of $β$-defensins Exploring the Zika Genome to Design a Potential Multiepitope Vaccine Using an Immunoinformatics Approach Exploring NS3/4A, NS5A and NS5B proteins to design conserved subunit multi-epitope vaccine against HCV utilizing immunoinformatics approaches Molecular dynamics simulations of biomolecules The MM/PBSA and MM/GBSA methods to estimate ligandbinding affinities The immune epitope database (IEDB): 2018 update BepiPred-2.0: improving sequence-based B-cell epitope prediction using conformational epitopes Gapped sequence alignment using artificial neural networks: application to the MHC class I system Peptide binding predictions for HLA DR, DP and DQ molecules MHCPred: a server for quantitative prediction of peptide--MHC binding Pangenome and immuno-proteomics analysis of Acinetobacter baumannii strains revealed the core peptide vaccine targets VirulentPred: a SVM based prediction method for virulent proteins in bacterial pathogens VaxiJen: a server for prediction of protective antigens, tumour antigens and subunit vaccines AllerTOP-a server for in silico prediction of allergens Peptide toxicity prediction Designing of interferon-gamma inducing MHC class-II binders Development of an epitope conservancy analysis tool to facilitate the design of epitope-based diagnostics and vaccines Exploring dengue genome to construct a multi-epitope based subunit vaccine by utilizing immunoinformatics approach to battle against dengue infection SCRATCH: a protein structure and structural feature prediction server others, Galaxy: a platform for interactive large-scale genome analysis GalaxyRefine: protein structure refinement driven by sidechain repacking Disulfide by Design 2.0: a web-based tool for disulfide engineering in proteins Codon usage: nature's roadmap to expression and folding of proteins JCat: a novel tool to adapt codon usage of a target gene to its potential expression host Innate immune pattern recognition: a cell biological perspective Molecular docking, in: Mol. Model. Proteins A family of human receptors structurally related to Drosophila Toll Signaling to NF-$κ$B by Toll-like receptors Medical Microbiology E-Book: A Guide to Microbial Infections: Pathogenesis, Immunity, Laboratory Diagnosis and Control. With STUDENT CONSULT Online Access Recognition of doublestranded RNA and activation of NF-$κ$B by Toll-like receptor 3 PatchDock and SymmDock: servers for rigid and symmetric docking FireDock: fast interaction refinement in molecular docking UCSF Chimera-a visualization system for exploratory research and analysis Combating tigecycline resistant Acinetobacter baumannii: A leap forward towards multi-epitope based vaccine discovery The FF14SB force field Langevin stabilization of molecular dynamics A fast SHAKE algorithm to solve distance constraint equations for small molecules in molecular dynamics simulations CPPTRAJ: software for processing and analysis of molecular dynamics trajectory data VMD: visual molecular dynamics AFD: an application for bi-molecular interaction using axial frequency distribution MMPBSA.py: An efficient program for end-state free energy calculations Assessing the Performance of the MM_PBSA and MM_GBSA Methods. 1. The Accuracy.pdf Identification of putative vaccine candidates against Helicobacter pylori exploiting exoproteome and secretome: a reverse vaccinology based approach Vaxign: the first web-based vaccine design program for reverse vaccinology and applications for vaccine development PanRV: Pangenome-reverse vaccinology approach for identifications of potential vaccine candidates in microbial pangenome Prioritization of potential vaccine targets using comparative proteomics and designing of the chimeric multi-epitope vaccine against Pseudomonas aeruginosa Adhesins as targets for vaccine development Exoproteome and secretome derived broad spectrum novel drug and vaccine candidates in Vibrio cholerae targeted by Piper betel derived compounds The MHC class I antigen presentation pathway: strategies for viral immune evasion MHC class II proteins and disease: a structural perspective Vaccine allergies Predefined spacers between epitopes on a recombinant epitope-peptide impacted epitope-specific antibody response In silico design of multimeric HN-F antigen as a highly immunogenic peptide vaccine against Newcastle disease virus Disulphide bonds and protein stability Protein disulfide engineering Novel immunoinformatics approaches to design multi-epitope subunit vaccine for malaria by investigating anopheles salivary protein Molecular dynamics simulation studies of novel $β$-lactamase inhibitor Significance of root-mean-square deviation in comparing three-dimensional structures of globular proteins Interaction mechanisms of a melatonergic inhibitor in the melatonin synthesis pathway Binding mode analysis, dynamic simulation and binding free energy calculations of the MurF ligase from Acinetobacter baumannii Radius of gyration as an indicator of protein structure compactness The hydrogen bond in molecular recognition Hydrogen bonds in proteins: role and strength Ultrafast scalable parallel algorithm for the radial distribution function histogramming using MPI maps Radial Distribution Functions of Some Structures of the Polypeptide Chain Defining the role of salt bridges in protein stability Do salt bridges stabilize proteins? A continuum electrostatic analysis Protein stabilization by salt bridges: concepts, experimental approaches and clarification of some misunderstandings Contribution of surface salt bridges to protein stability: guidelines for protein engineering