key: cord-0712554-71q4wfqv authors: Fatoba, Abiodun J.; Maharaj, Leah; Adeleke, Victoria T.; Okpeku, Moses; Adeniyi, Adebayo A; Adeleke, Matthew A. title: Immunoinformatics prediction of overlapping CD8(+) T-cell, IFN-γ and IL-4 inducer CD4(+) T-cell and linear B-cell epitopes based vaccines against COVID-19 (SARS-CoV-2) date: 2021-01-18 journal: Vaccine DOI: 10.1016/j.vaccine.2021.01.003 sha: f3079c6312f3d5a560ab9e0c4731a2b51838882d doc_id: 712554 cord_uid: 71q4wfqv At the beginning of 2020, the world was struck by a global pandemic caused by the virus SARS-CoV-2 (COVID-19), which has left hundreds of thousands of people dead. To control this virus, vaccine design is imperative. In this study, potential epitope-based vaccine candidates were explored. Six hundred (600) genomes of SARS-CoV-2 were retrieved from the ViPR database to generate CD8(+) T-cell, CD4(+) T-cell and linear B-cell epitopes, which were screened for antigenicity, immunogenicity and non-allergenicity. The results of this study provide 19 promising candidate CD8(+) T-cell epitopes that strongly overlap with 8 promising B-cell epitopes. Another 19 CD4(+) T-cell epitopes that can induce IFN-γ and IL-4 cytokines were also identified. The most conserved MHC-I and MHC-II alleles for the CD8(+) and CD4(+) T-cell epitopes are HLA-A*02:06 and HLA-DRB1*01:01, respectively. These epitopes also bound to Toll-like receptor 3 (TLR3). The population coverage of the conserved Major Histocompatibility Complex Human Leukocyte Antigen (HLA) alleles for both CD8(+) and CD4(+) T-cell epitopes ranged from 65.6% to 100%. The detailed analysis of the potential epitope-based vaccines and their mapping to the complete SARS-CoV-2 genome reveals that they are predominantly found in the surface (S) and membrane (M) glycoproteins, suggesting the potential involvement of these structural proteins in the immunogenic response and antigenicity of the virus.
Since the majority of the potential epitopes are located on the M protein, the design of a multi-epitope vaccine with this structural protein is highly promising, though the whole M protein could also serve as a viable epitope for the development of an attenuated vaccine. Our findings provide a baseline for the experimental design of a suitable vaccine against SARS-CoV-2. The recent emergence of the new coronavirus disease, referred to as 'COVID-19', is posing a global challenge to public health and has placed the global economy under financial burden. The virus, which was first reported in Wuhan, a city in the Hubei province of China [1], in late December 2019, has since spread to different continents of the world, causing illness that ranges from mild sickness to death. Globally, 16,523,815 confirmed cases and 655,112 deaths from COVID-19 have been reported across 216 countries as of 29 July 2020 (https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200729-covid-19sitrep-191.pdf?sfvrsn=2c327e9e_2). On January 30, 2020, the World Health Organization declared the COVID-19 outbreak a Public Health Emergency of International Concern. The human-to-human transmission of this virus has been attributed to sneezing and coughing, which release droplets that transmit the virus [2]. Some of the symptoms associated with this viral infection include fever, sore throat and pneumonia [2]. To date, there is no certified cure [3]. Presently, treatment relies on symptomatic relief and self-isolation to prevent infection of other people. The genetic component of this virus is a positive-sense single-stranded RNA with a genome similar to those of SARS-CoV and bat coronavirus, hence the name SARS-CoV-2 [4]. It belongs to the Coronaviridae family and is classified as a β-coronavirus of group 2B [5].
In contrast to other human coronaviruses such as HCoV-229E, HCoV-NL63, HCoV-OC43 and HCoV-HKU1, which are mostly associated with the common cold [6], the emergence of human coronaviruses such as SARS (Severe Acute Respiratory Syndrome), MERS (Middle East Respiratory Syndrome) and, recently, COVID-19 has led to fatal epidemics and a pandemic [7]. Similar to other coronaviruses, SARS-CoV-2 consists of both structural and non-structural proteins. The structural proteins include the N-protein (Nucleocapsid), S-protein (Spike), E-protein (Envelope) and M-protein (Membrane) [8]. Based on the similarity of SARS-CoV-2 to SARS-CoV, it has been suggested that the previous understanding of the protective immune response in SARS-CoV could be helpful in the design of an effective vaccine for this novel SARS-CoV-2 [8]. Different reports have indicated the crucial role of the humoral and cell-mediated immune responses in the control of SARS-CoV [9] [10] [11]. Although antibodies have been reported to be produced against the N-protein of SARS-CoV, the responses were only short-lived [12]. On the other hand, the T-cell response is highly immunogenic against all structural proteins of SARS-CoV and has been reported to be dominant and long-lasting, specifically against the N and S proteins [11]. Currently, immunoinformatics studies on different potential vaccine epitopes have been carried out by various researchers and several are under clinical trials [13] [14] [15] [16]. Although most of these studies explored potential T-cell and B-cell epitopes from different proteins of the viral genome, the protein sequences retrieved were based on the limited viral genome data that were available in different databases at the early onset of the pandemic.
Due to the drastic impact of this virus and the emergence of updated genomic data, further study of the potential immunogenic T-cell and B-cell epitopes is crucial, as this could guide the experimental design of vaccines against SARS-CoV-2. Therefore, this study provides details of the immunoinformatics approach towards the development of multi-epitope vaccine candidates against COVID-19 (SARS-CoV-2) with long-lasting immune response and effectiveness when validated. The flow chart of all the major steps in the design of the potential CD4+ and CD8+ T-cell and B-cell epitopes and multi-epitope vaccines is shown in Figure S1. The details of the various steps are described in this section. The genomic sequences of SARS-CoV-2 isolates were retrieved from the ViPR database (Virus Pathogen Database and Analysis Resource) (http://www.viprbrc.org/). All available sequences were used, including partial and complete. Human Leukocyte Antigen (HLA) structures for CD8+ and CD4+ were also obtained from the RCSB Protein Data Bank (www.rcsb.org) [17] with identifiers 3QZW and 3S4S, respectively. All retrieved sequences were aligned by multiple sequence alignment using the CLUSTALW server [18] with the default parameters. The CLUSTALW server uses the k-tuple method to generate pairwise alignments; thereafter the server constructs a tree by the Neighbour-Joining method, which informs the final multiple sequence alignment [18]. From the alignment, conserved regions of 15 amino acids or more with the CD8+ and CD4+ epitopes were selected. The identified sequences were then subjected to antigenicity prediction (VaxiJen v2.0) [19] and transmembrane helix prediction based on a hidden Markov model (TMHMM v2.0) [20]. For antigenicity prediction, virus was the target organism and the threshold value was 0.4, as used elsewhere in similar research [21].
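As an illustration of the conserved-region selection described above, the following is a minimal sketch that scans an already aligned set of sequences for gap-free runs that are identical in every sequence and at least 15 residues long. The function name `conserved_regions`, the `min_len` parameter and the toy alignment are hypothetical and are not taken from the study, which does not state how the conserved regions were extracted from the CLUSTALW output.

```python
def conserved_regions(aligned, min_len=15):
    """Return (start, end, segment) for runs of columns that are gap-free
    and identical across every aligned sequence (end is exclusive)."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all aligned sequences must have the same length")

    regions, run_start = [], None
    # Iterate one position past the end so a run touching the end is closed.
    for i in range(length + 1):
        conserved = (
            i < length
            and aligned[0][i] != "-"
            and all(s[i] == aligned[0][i] for s in aligned)
        )
        if conserved and run_start is None:
            run_start = i
        elif not conserved and run_start is not None:
            if i - run_start >= min_len:
                regions.append((run_start, i, aligned[0][run_start:i]))
            run_start = None
    return regions


# Toy alignment (hypothetical); min_len is lowered so the example is short.
toy = ["MKTLLACFVLA-GG", "MKTLLACFVLAAGG", "MRTLLACFVLA-GG"]
print(conserved_regions(toy, min_len=4))
```

With the study's criterion, `min_len` would stay at its default of 15.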
The VaxiJen server uses auto cross-covariance transformation of protein sequences into uniform vectors and is alignment-independent (alignment to known antigens is not a reliable way to predict antigenicity) [19]. From these antigenic sequences, the transmembrane sequences as determined by TMHMM [20] were extracted for further analysis. For this study, the definitions of 'immunogenicity' and 'antigenicity' were used as outlined by [22]. A substance is considered to be immunogenic if it can induce cellular and humoral responses. Conversely, a substance is antigenic if it can be recognized by antibodies that arise due to the immune response to the substance. NetCTL v1.2 was employed to predict nonamers that can bind major histocompatibility complex (MHC) class I (HLA allele) molecules [23]. NetCTL incorporates proteasomal cleavage, TAP transport efficiency and MHC class I affinity to predict epitopes [23]. Within the NetCTL parameters, the A1 supertype was selected, while the weight on C-terminal cleavage, the weight on TAP transport efficiency and the threshold for epitope identification were 0.15, 0.05 and 0.75 respectively, which were the default parameters. Nonamers above the threshold of 0.5 were input into the IEDB analysis tool (http://tools.iedb.org/mhci/) [24] to identify frequently and non-frequently occurring MHC class I binding alleles. The Stabilized Matrix Method (SMM) was used with an amino acid length of 9 and an IC50 value of < 250 to determine the CD8+ T-cell epitopes. To predict CD4+ T-helper lymphocyte epitopes and the MHC class II alleles, the IEDB MHC II binding tool was used (http://tools.iedb.org/mhcii/) [25]. The SMM method was utilized and peptides with an IC50 < 250 were selected as the CD4+ T-cell epitopes. Immunogenicity of the CD8+ T-cell epitopes was deduced from the MHC I immunogenicity tool of IEDB (http://tools.iedb.org/immunogenicity/) [26]. This server describes immunogenicity in terms of T-cell recognized peptide-MHC complexes [26].
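The nonamer enumeration and the IC50 cut-off described above can be sketched as follows. This is only an illustration of the filtering logic: the helper names `nonamers` and `select_binders` are hypothetical, the IC50 values are invented for the example, and in the study the predictions themselves come from the NetCTL and IEDB servers rather than from local code.

```python
def nonamers(sequence, k=9):
    """Enumerate every overlapping peptide of length k in a sequence."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def select_binders(predictions, ic50_cutoff=250.0):
    """Keep (peptide, allele, ic50) records whose IC50 is below the cut-off."""
    return [rec for rec in predictions if rec[2] < ic50_cutoff]


# A 15-residue sequence quoted in the text yields 15 - 9 + 1 = 7 nonamers.
peptides = nonamers("RNRFLYIIKLIFLWL")

# Hypothetical prediction records; the IC50 values are made up for illustration.
predictions = [
    ("RNRFLYIIK", "HLA-A*02:06", 120.0),
    ("NRFLYIIKL", "HLA-A*02:06", 900.0),
    ("YIIKLIFLW", "HLA-A*02:06", 249.9),
]
binders = select_binders(predictions)
print(peptides)
print(binders)
```

The strict inequality mirrors the "IC50 < 250" criterion, so a peptide predicted at exactly 250 would be excluded.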
VaxiJen (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html) [15] was used to determine protective antigenicity scores for all epitopes; only those with scores above 0.5 were used, as done in other studies [27, 28]. The threshold was increased from the initial value of 0.4 to ensure higher accuracy at this stage of the protocol. To assess conserved epitopes that correspond to a portion of protein sequence that will restrain the epitope [29], both the CD8+ and CD4+ T-cell epitopes were input into the IEDB conservation across antigens tool (https://tools.iedb.org/conservancy/) [30]. Following this, AllerTOP (http://www.ddg-pharmfac.net/AllerTOP/) was used to determine whether the epitopes would be expected to cause an allergic reaction in patients. Thus, the resulting epitopes were predicted to be immunogenic, conserved across antigens and non-allergenic. To identify IFN-γ inducer properties in CD4+ T-cells, the online tool IFNepitope (https://crdd.osdd.net/raghava/ifnepitope) was utilized (Dhanda et al., 2013). The CD4+ T-cell epitopes that were IFN-γ inducing then underwent IL-4 inducer prediction by the IL4pred tool (http://crdd.osdd.net/raghava/il4pred/) [31]. For IL-4 prediction, the support vector machine (SVM) model was used with a threshold value of 0.1. As different HLA alleles are known to be differentially expressed in different ethnicities, it is important to determine the global HLA-allele distribution to inform successful peptide-based vaccine development. To do this, the CD8+ and CD4+ T-cell epitopes were input into the IEDB population coverage analysis tool (http://tools.iedb.org/population/) [32]. For linear B-cell epitope prediction, the initial sequences with an antigenicity score ≥ 0.4 from the VaxiJen server that were identified as outer membrane protein by TMHMM were input into the ABCpred server (http://osddlinux.osdd.net/raghava/abcpred) [33]. ABCpred predicts linear B-cell epitopes through the use of a recurrent neural network [33].
The window length and threshold values were hexadecamer and 0.51 respectively. The B-cell epitopes generated were later tested for their antigenicity and allergenicity. B-cell epitopes overlapping with CD8+ T-cell epitopes were also filtered from the list of epitopes. All the selected promising CD4+ T-cell and B-cell epitopes were mapped onto the complete SARS-CoV-2 genome (GenBank: MN908947.2) obtained from NCBI using the SnapGene software. The region of the epitopes that mapped well with the in-house protein region, as shown in Figs. 2 and S3, was selected as multi-epitopes. The multi-epitope peptides located on the surface (S) glycoprotein, also called the spike protein, were cut from the PDB protein structure (6vxx); the ones located on the orf1ab polyprotein and nucleocapsid (N) phosphoprotein were generated using the trRosetta tool (https://yanglab.nankai.edu.cn/trRosetta/help/) [34], while the long-chain multi-epitope peptides (Table 5) from the membrane (M) protein and ORF3a were obtained through homology modelling. To determine the interaction between CD8+ T-cell epitopes and the chosen MHC-I HLA allele HLA-A*02:06, and between CD4+ T-cell epitopes and the MHC-II allele HLA-DRB1*01:01, the respective protein geometries (PDB: 3OXR and 1AQD respectively) were retrieved from the RCSB Protein Data Bank in the PDB format [35]. Similarly, the interaction of the multi-epitopes with Toll-like receptor 3 (TLR3) was also evaluated. The PDB structure '2A0Z' was retrieved for TLR3 and docked with the multi-epitopes. Docking was conducted on pepATTRACT (http://www.attract.ph.tum.de/services/ATTRACT/peptide.html) [36], wherein 50 structures were added to the resulting pdb file and run on 12 processing cores. pepATTRACT is fully blind but performs similarly to the local docking methods FlexPepDock and HADDOCK [36]. All other criteria were left as default.
The docking file was retrieved from this and run on the Lengau cluster of the Centre for High Performance Computing (CHPC) to complete the docking calculations. This was conducted for each CD8+ T-cell epitope and the HLA allele. From the 50 structures generated for each docking run, the structure with the lowest energy state was selected and viewed in Visual Molecular Dynamics software (VMD) [37]. The CD8+ T-cell epitope was identified from the HLA structure and indicated accordingly. The image was then rendered in VMD and subsequently Chimera [38] for publication. Further docking of the multi-epitopes to TLR3 was carried out using HADDOCK 2.2 [39] [40] [41]. A solvent accessibility (SA) of > 50 was used to define the active residues of the entire multi-epitopes that will interact with TLR3. The active-site residues of TLR3 were selected in the same way, with active residues defined as those with SA > 50. Three docking results were generated from the HADDOCK dockings: the results from the constrained docking, the flexible docking and the refinement in water solvent medium. Out of the 600 genomes of SARS-CoV-2 that were subjected to multiple sequence alignment, 495 conserved protein sequences were generated. Antigenicity analysis with a threshold ≥ 0.4 showed only 43 of the 495 conserved sequences to be antigenic and, after transmembrane topology analysis, only 38 of these antigenic sequences met the exomembrane criteria (Table S1). Among these 38 protein sequences that met both the antigenic and transmembrane criteria, the VaxiJen scores ranged from 0.4031 to 0.9547. These sequences were selected for further analysis. The 38 protein sequences obtained were used to generate the CD8+ T-cell epitopes using the NetCTL v1.2 server with a threshold of greater than or equal to 0.5 to ensure high confidence in the prediction of the epitopes.
A total of 975 nonamers were generated, but 330 were selected after being subjected to the IEDB MHC-I prediction tool under the criterion of an IC50 value < 250. Antigenicity and immunogenicity analyses were conducted on the 330 CD8+ T-cell epitopes and only 18 CD8+ T-cell epitopes were both antigenic and immunogenic. Each of the 18 CD8+ T-cell epitopes interacts with at least one MHC-I HLA allele. The epitopes IFLWLLWPV and YIIKLIFLW had the highest and lowest antigenic scores of 1.4835 and 0.5220 respectively (Table 1). The highest and lowest immunogenic scores of 0.3785 and 0.0076 were observed for the IFLWLLWPV and AFLPFAMGI epitopes respectively. [...] were observed for the 18 CD8+ T-cell epitopes. The epitope YIIKLIFLW, with 5 HLA alleles, had the maximum number of alleles. All the 18 epitopes were conserved with the CD4+ T-cell epitopes with a 100% conservancy score. HLA-A*02:06 was the most conserved MHC-I allele (Table 1). Allergenicity screening of the 18 CD8+ T-cell epitopes revealed that they were non-allergens. The 495 conserved protein sequences were subjected to the IEDB MHC-II prediction tool under the following criteria: SMM-align based and IC50 value < 250. A total of 200 pentadecamer peptides (epitopes) were generated and then subjected to the antigenicity test. Only 122 pentadecamer epitopes were antigenic and regarded as the potential CD4+ T-cell epitopes. The 122 pentadecamers were also subjected to an overlapping test with the immunogenic CD8+ T-cell epitopes. The epitopes TLACFVLAAVYRINW and RNRFLYIIKLIFLWL had the highest and lowest antigenic scores of 1.3949 and 0.5114 respectively, as shown in Table 2. Conservancy analysis was carried out on the 19 CD4+ T-cell epitopes with the 18 immunogenic and antigenic CD8+ T-cell epitopes (Table 1) and all the epitopes showed 100% conservancy. The overlapped CD4+ T-cell epitopes together with their antigenicity scores and interacting MHC-II alleles are shown in Table 2.
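The overlapping test between pentadecamer CD4+ candidates and nonamer CD8+ epitopes can be sketched as below. The study does not spell out its overlap criterion, so this example assumes the simplest one, full containment of the nonamer within the pentadecamer; the function name `overlapping_epitopes` and the poly-alanine control peptide are hypothetical, while the other sequences are quoted from the text.

```python
def overlapping_epitopes(cd4_epitopes, cd8_epitopes):
    """Map each 15-mer to the 9-mers it fully contains (assumed criterion);
    15-mers that contain no 9-mer are dropped."""
    result = {}
    for p15 in cd4_epitopes:
        contained = [p9 for p9 in cd8_epitopes if p9 in p15]
        if contained:
            result[p15] = contained
    return result


cd4 = [
    "RNRFLYIIKLIFLWL",
    "IGYYRRATRRIRGGD",
    "VHFVCNLLLLFVTVY",
    "AAAAAAAAAAAAAAA",  # hypothetical negative control, not from the study
]
cd8 = ["RNRFLYIIK", "YIIKLIFLW", "ATRRIRGGD", "NLLLLFVTV"]

for p15, hits in overlapping_epitopes(cd4, cd8).items():
    print(p15, "contains", hits)
```

A looser criterion, such as a minimum shared stretch shorter than nine residues, would retain more pentadecamers than strict containment does.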
The overlapped CD4+ T-cell epitopes were also tested for their ability to induce interferon-gamma (IFN-γ) and interleukin 4 (IL-4) using the IFNepitope and IL4pred prediction tools. The final 19 IFN-γ and IL-4 inducing CD4+ T-cell epitopes that overlapped with the 18 CD8+ T-cell epitopes are indicated in Table 2. All these epitopes were also non-allergenic. Table 1. The 18 final CD8+ T-cell epitopes of SARS-CoV-2 that are found to be antigenic, immunogenic, overlapping with CD4+ T-cell epitopes and interacting with MHC class I HLA alleles. Evaluating the frequency distribution of HLA alleles in different ethnicities is an important step that determines the possible effectiveness of a potential vaccine. Thus, the different HLA alleles that bound to the predicted CD8+ and CD4+ T-cell epitopes were subjected to population coverage analysis. A combined MHC-I and MHC-II population coverage that is close to 100% indicates good coverage of a potential vaccine. The predicted T-cell epitopes indicated in Tables 1 and 2 were subjected to population coverage analysis across 15 different geographical areas of the world, as shown in Fig. 1. Among the 18 CD8+ T-cell epitopes, the highest coverage was observed in West Africa (75.16%) followed by North America (74.63%), North Africa (71.16%) and West Indies (70.37%), while the lowest coverage was found in Oceania (26.78%). Better population coverage was obtained for CD4+ T-cell epitopes compared to CD8+ T-cell epitopes (Fig. 1). The highest coverage for CD4+ T-cell epitopes was found in North America (99.99%), Europe (99.94%), East Africa (99.84%), West Africa (99.84%), Central Africa (99.73%) and South Asia (99.52%), while the lowest coverage was in South Africa (27.07%). The combination of the two sets of T-cell epitopes (CD8+ and CD4+ T-cell epitopes) resulted in improved population coverage.
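Why combining the two epitope classes raises coverage can be illustrated with a simple calculation. The sketch below assumes that MHC-I and MHC-II coverage are statistically independent, so that the uncovered fractions multiply; this is an assumption made for illustration, and the IEDB tool works from allele genotype frequencies, so its exact procedure may differ. The function name `combined_coverage` is hypothetical; the input percentages are the class-specific values quoted above for West Africa and North America.

```python
def combined_coverage(class1, class2):
    """Combined coverage of two independent classes (fractions in [0, 1]):
    a person is uncovered only if uncovered by both classes."""
    for value in (class1, class2):
        if not 0.0 <= value <= 1.0:
            raise ValueError("coverage must be a fraction between 0 and 1")
    return 1.0 - (1.0 - class1) * (1.0 - class2)


# Class-specific coverages quoted in the text (CD8+ then CD4+).
west_africa = combined_coverage(0.7516, 0.9984)
north_america = combined_coverage(0.7463, 0.9999)

print(f"West Africa:   {west_africa * 100:.2f}%")
print(f"North America: {north_america * 100:.2f}%")
```

Under this assumption the West Africa figure comes to 99.96% and the North America figure rounds to 100.00%, in line with the combined values reported for those two regions.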
The best combined coverage was found in North America (100%) followed by Europe (99.98%), West Africa (99.96%), East Africa (99.95%), Central Africa (99.89%), South Asia (99.76%) and South America (99.49%), while the lowest coverage was still in South Africa, although at an improved percentage (68.05%). The plots showing the percentage distribution of the epitope hits for South Africa, West Africa, Europe and North America are shown in Figure S2. A better distribution was observed in regions such as West Africa, Europe and North America, which had higher percentages than South Africa, where the percentage was small. B-cell epitopes were predicted from the 495 conserved sequences using the ABCpred server. A total of 187 B-cell epitopes were generated. These were thereafter subjected to antigenicity and allergenicity tests, after which only 21 B-cell epitopes were selected. Further overlapping was then carried out with the CD8+ T-cell epitopes indicated in Table 1, resulting in 8 B-cell epitopes that had 100% conservancy with CD8+ T-cell epitopes (Table 3). B-cell epitopes are important in the design of target-specific vaccines because B-cells neutralize pathogenic molecules with extreme specificity through the secretion of antibodies [42, 43]. All the final 19 CD4+ T-cell and 8 B-cell epitopes were mapped to the complete viral genome, as shown in Fig. 2, to determine the positions where they can be identified. The positions on the complete genome where the potential epitope-based vaccine candidates were identified are the orf1ab polyprotein, surface (S) glycoprotein, ORF3a protein, membrane (M) glycoprotein and nucleocapsid (N) phosphoprotein (Figure S3). All the 19 CD4+ T-cell epitopes and 8 B-cell epitopes were identified in the complete genome. As shown in Fig.
2, the largest protein in the SARS-CoV-2 genome is orf1ab, in which a total of 5 CD4+ T-cell epitopes (E2, E3, E4, E9 and E10, corresponding to their numbering in Table 2) and 1 B-cell epitope (E7, corresponding to its numbering in Table 3) were identified within a very close range (nucleotide positions 11,075 to 11,143), as shown in Figure S3. The next largest protein after orf1ab is the Spike glycoprotein, where a total of 3 CD4+ T-cell epitopes (E14, E18 and E19) and 1 B-cell epitope (E6) were identified, also within a very close range (nucleotide positions 23,084 to 23,137). All other proteins in SARS-CoV-2 are small. In ORF3a, 3 CD4+ T-cell epitopes (E11, E15 and E17) and 1 B-cell epitope (E8) were identified but were found to be widely separated from each other (nucleotide positions 25,465 to 25,872). Therefore, we proposed three types of multi-epitopes for ORF3a: the first is the whole selected region, called ORF3a; the second, called ORF3a-1a, comprises the region of E11, E15 and E17; and the third, called ORF3a-1b, encompasses all the residues after CD4+ T-cell epitope E17 up to B-cell epitope E8 in Figure S3. The highest number of epitopes is found in the membrane (M) protein, even though it is a very short protein compared to the other proteins where the potential vaccine epitopes are identified. A total of 7 CD4+ T-cell epitopes (E1, E5, E7, E8, E12, E13, E16) and 3 B-cell epitopes (E1, E4, E5) were identified. All the identified epitopes are within a short range (nucleotide positions 26,646 to 26,804). Since the whole M protein is also a small protein, we propose taking the whole protein as a potential vaccine (named M-whole in Table 5). This can be attenuated if necessary should any virulence be identified in it. The M-protein is known to be the most structured protein and helps to determine the shape of the virus envelope [44]. One unique feature of the M-protein is its ability to bind to other structural proteins.
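The assignment of the epitope clusters above to proteins can be checked with a simple coordinate lookup. In the sketch below, the coding-region coordinates are an assumption: they are taken from the commonly used SARS-CoV-2 reference annotation (NC_045512.2) and may be offset by a few nucleotides relative to the GenBank accession version used in the study. The names `PROTEIN_RANGES` and `protein_for_span` are hypothetical; the cluster ranges are the ones quoted in the text.

```python
# ASSUMED coding-region coordinates (1-based, inclusive) from the reference
# annotation NC_045512.2; not stated in the study itself.
PROTEIN_RANGES = {
    "orf1ab": (266, 21555),
    "S": (21563, 25384),
    "ORF3a": (25393, 26220),
    "M": (26523, 27191),
    "N": (28274, 29533),
}


def protein_for_span(start, end, ranges=PROTEIN_RANGES):
    """Return the protein whose coding region fully contains [start, end],
    or None if the span falls outside every listed region."""
    for name, (lo, hi) in ranges.items():
        if lo <= start and end <= hi:
            return name
    return None


# Nucleotide ranges of the epitope clusters as quoted in the text.
clusters = {
    "cluster 1": (11075, 11143),
    "cluster 2": (23084, 23137),
    "cluster 3": (25465, 25872),
    "cluster 4": (26646, 26804),
}
for label, (start, end) in clusters.items():
    print(label, "->", protein_for_span(start, end))
```

Under these assumed coordinates the four quoted ranges fall in orf1ab, S, ORF3a and M respectively, matching the assignments given in the text.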
The binding of the nucleocapsid (N) protein to the M-protein helps to stabilize the N protein-RNA complex inside the virion and therefore promotes completion of viral assembly [44]. The N-protein is the last protein where our proposed epitopes are identified, and only one CD4+ T-cell epitope (E) was identified in it. All the proposed multi-epitope potential vaccine candidates are shown in Table 5. The region was selected with some extra residues that corresponded to the in-house protein identified using SnapGene, as shown in Figure S3. Table 3. Putative linear B-cell epitopes of SARS-CoV-2 with their antigenicity and allergenicity. The 3D structures of all the promising CD4+ and CD8+ T-cell and B-cell epitopes were modelled using online [...] Table 4. The most promising CD8+ T-cell epitopes are RNRFLYIIK, LWLLWPVTL, ATRRIRGGD, ELLHAPATV, RVVVLSFEL and NLLLLFVTV, with binding energies of -152.292, -143.572, -137.499, -136.211, -131.722 and -130.618 respectively (Table 4). All 18 final CD8+ T-cell epitopes bind to HLA-A*02:06 in a similar orientation with a coil structure, except ATRRIRGGD (among the most promising) and VTLACFVLA (-104.150), which assumed an alpha-helix structure (Figure S4). From Table 4, all the HLA-epitope complexes contained an appreciable number of hydrogen bonds, which are important in promoting epitope binding affinity. The overall interacting residues and non-bonded contacts are also presented. The binding site interaction of RNRFLYIIK, which has the best binding energy among the CD8+ T-cell epitopes, shows that it fits into many of the binding site pockets of HLA-A*02:06 (Fig. 3a). Many of the residues of HLA-A*02:06 have hydrophobic interactions with RNRFLYIIK, while it specifically formed hydrogen bonds with Lys 66, Arg 97, Lys 146, Tyr 159, Glu 166 and Glu 173 and a salt bridge with Glu 173 of HLA-A*02:06 (Fig. 3b).
The binding site interaction of the epitope IGYYRRATRRIRGGD, which was ranked the best among the 19 epitopes that bind to HLA-DRB1*01:01, is shown in Fig. 4a. The epitope explored the binding pockets of the receptor and stretched across the binding site. Many of the receptor's residues participated in hydrophobic interactions with IGYYRRATRRIRGGD. The epitope also forms more specific hydrogen bonds with GLN 7, SER 51, TRP 237, ASP 242, THR 253 and HIS 257 and a salt bridge with ASP 64 of HLA-DRB1*01:01 (Fig. 4b). As Toll-like receptor 3 (TLR3) is known to greatly facilitate the induction of the immune response, the interaction of the constructed multi-epitopes with TLR3 was studied (Table 5). The minimum binding energies of the multi-epitopes were best with the water-flexible docking. The multi-epitope binding to TLR3 shown in Fig. 5 is that of the selected region (red colour) of the membrane (M) protein and of the whole M-protein. The whole M-protein did not use the selected epitope region (yellow) for the interaction but rather used the unselected region. This resulted in better TLR3 interaction of the whole M-protein compared to the selected epitope region, which is an indication that the other parts that are not found to be antigenic can help in improving the immunogenicity of the M-protein. Table 4. The interaction energies (binding energy in kcal/mol) of CD4+ T-cell epitopes with the MHC-II HLA allele HLA-DRB1*01:01 and of CD8+ T-cell epitopes with the MHC-I HLA allele HLA-A*02:06, with the number of interacting residues, number of hydrogen bonds and number of non-bonded contacts. The CD4+ T-cell epitopes were named from E1-CD4 to E19-CD4, following the alphabetical order of the epitopes in Table 2; the B-cell epitopes were numbered E1-Bcell to E8-Bcell as found in Table 3.
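The ranking of docked epitopes by binding energy, where the most negative value indicates the most favourable predicted interaction, can be sketched as follows. The energies are the CD8+ values quoted in the text for docking to HLA-A*02:06; the helper name `rank_by_energy` is hypothetical, and the same lowest-energy rule is the one described for choosing among the 50 structures of a docking run.

```python
# Binding energies (kcal/mol) quoted in the text for CD8+ T-cell epitopes
# docked to HLA-A*02:06; deliberately listed out of order.
cd8_energies = {
    "ELLHAPATV": -136.211,
    "VTLACFVLA": -104.150,
    "RNRFLYIIK": -152.292,
    "NLLLLFVTV": -130.618,
    "LWLLWPVTL": -143.572,
    "RVVVLSFEL": -131.722,
    "ATRRIRGGD": -137.499,
}


def rank_by_energy(energies):
    """Sort (peptide, energy) pairs from most to least negative energy."""
    return sorted(energies.items(), key=lambda item: item[1])


ranked = rank_by_energy(cd8_energies)
best_peptide, best_energy = ranked[0]
for peptide, energy in ranked:
    print(f"{peptide}  {energy:.3f}")
```

Sorting in ascending order places RNRFLYIIK first, which agrees with the text's statement that it has the best binding energy among the CD8+ T-cell epitopes.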
The ongoing fight against the COVID-19 pandemic has become a global phenomenon and has placed demands on scientists to search for a suitable vaccine against this virus. Although some vaccines are already in clinical and preclinical trials, there is still a need for extensive, continuous research on COVID-19 vaccines, as the solution is yet to be found. Due to the limited knowledge of the immunological response to this virus and the urgency of designing an effective vaccine, the computational design of T-cell and B-cell immunological epitopes becomes important. Furthermore, vaccines developed by this means do not contain live pathogen that could revert to pathogenicity. Presently, COVID-19 has been declared a global issue, with even first-world countries struggling to contain the disease outbreak. HLA type frequencies vary between ethnicities due to the high polymorphism of the MHC molecule, and too much polymorphism may limit the proportion of the human population that responds to a particular antigen [45]. As such, the alleles considered in this study had to be shown to provide sufficient population coverage on a large scale, as shown in Fig. 1. The combined population coverage of both CD8+ and CD4+ T-cell epitopes was highest in North America (100%) followed by Europe (99.98%), West Africa (99.96%), East Africa (99.95%), Central Africa (99.89%), South Asia (99.76%) and South America (99.49%), an indication of the good qualities of the potential vaccine epitopes. While the results showed that the selected alleles were sufficiently prevalent in most regions, South Africa showed the lowest coverage. This is similar to a separate study on the coverage of an epitope-based vaccine for Rift Valley fever, which also found the percentage coverage in South Africa to be small [29].
The reason for this may be traced to the low level of this type of data analysis in areas such as South Africa [46, 47], which may affect the population coverage results. The present investigation presented epitope-based vaccine candidates of 18 overlapping CD8+ T-cell epitopes and 19 IFN-γ and IL-4 inducer CD4+ T-cell epitopes that can bind to MHC-I and MHC-II respectively, as represented in Tables 1 and 2. These epitope vaccine candidates contained cytotoxic T-lymphocyte epitopes that are 100% conserved in both the T-cell and B-cell epitopes predicted. This gives uniqueness to these candidate vaccines in addition to their antigenic, non-toxic and immunogenic characteristics (Tables 1 and 2). A similar method has been used in the prediction of suitable vaccine epitopes for various infectious pathogens such as Zika virus glycoprotein, Dengue virus protein, Chikungunya virus protein and Ebola virus [48] [49] [50] [51]. All the predicted vaccine candidates in this study have strong potential to fight against COVID-19 when validated and tested. The tools used to design these epitopes were carefully selected for their accuracy. The IEDB T-cell epitope prediction servers are widely used and accepted in the literature [52, 53]. B-cell epitope prediction tools generally have low accuracy [54]. ABCpred has an accuracy of 65.93% [33], which can be considered average, but it is still considered to be one of the most effective epitope prediction servers available online [55]. This justifies the need for further analysis of results obtained from T- or B-cell prediction tools, including antigenicity testing and docking studies. The AllerTOP server has been identified as the best allergenicity prediction server and has an accuracy of 88.7% [56]. While this is high, it must be acknowledged that there is still room for error. Therefore, the results presented here are predictive and must be further validated in the laboratory.
Cytotoxic T-cells (CD8+ T-cells) kill infected cells or secrete antiviral cytokines and as such restrict infection in tissues [57]. A robust immune response against most infectious pathogens such as viruses is elicited by CD8+ T-cell epitopes [58, 59]. The immune response generated by T-cells is known to be long-lasting compared to the B-cell immune response, which underlines the importance of T-cells in vaccine design [60]. Similarly, CD8+ and CD4+ T-cell responses play a crucial role in antiviral immunity [15]. According to the report of [11], the T-cell response is highly immunogenic, dominant and long-lasting against the N and S proteins of SARS-CoV. A similar study by Grifoni et al. [61] showed that T- and B-cell epitopes from structural proteins of SARS-CoV-2 were conserved. One of the T-cell epitopes, 'TLACFVLAAV' from the M-protein, that was found in their study was also found in our study. The reason for the difference between the other epitopes predicted by Grifoni et al. [61] and those found in the present study could be that the predicted epitopes in this study were found on the M and ORF3a proteins, as opposed to the S and orf1ab proteins reported by those authors. The study by Enayatkhani et al. [7] found many similar T- and B-cell epitopes (NLLLLFVTV, LLWPVTLAC, WPVTLACFV, ITGGIAIAM, IAIAMACLV) from the N, M and ORF3a proteins as found in our present study. Besides these similar epitopes, our study further presented other epitopes that passed the screening tests for immunogenicity and cytokine-inducing ability. Another similar study by Bhattacharya et al. [16] predicted T- and B-cell epitopes from the Spike protein of the virus. However, the epitopes selected for multi-epitope vaccine construction differ from those of the present study. Although a few of our predicted epitopes were found on the Spike (S) protein, the differences between the predicted epitopes on the S-protein and those predicted by Bhattacharya et al. could be due to differences in screening criteria.
The epitopes selected in the present study were screened for allergenicity and cytokine-inducing ability, unlike those of Bhattacharya et al., which were screened only for antigenicity. A recent study on samples from 20 convalescing COVID-19 patients showed that helper T-cells (CD4+ T-cells) induced a robust immune response against the S, M and N proteins [62]. The study also reported that immunogenic epitopes of the S, M and N proteins induced CD8+ T-cell responses; although this was not specific to recovered patients, it highlights the promise of epitopes located on these structural proteins of SARS-CoV-2.

Table 5. The multi-epitope vaccines selected from the regions of the complete COVID-19 genomes that correspond with the proposed CD4+ T-cell and B-cell epitopes, and their interaction energy with TLR.

Antibody screening studies in COVID-19 patients have also detected IgA, IgG and IgM antibodies against the S and N proteins of SARS-CoV-2 [63, 64]. Occasional detection of antibodies against the M protein has also been reported in COVID-19 patients [65]. According to [66], antibodies detected in SARS-CoV-2 patients show cross-reactivity with SARS-CoV, with IgM and IgA detected 5 days after symptom onset and IgG detected 14 days after symptom onset. A similar study by [67] also showed that combined antigens from the S and N proteins gave optimal antibody detection, with high cross-reactivity of IgA and IgG between SARS-CoV-2 and SARS-CoV. Based on these studies, the epitopes predicted on the S, M and N proteins of SARS-CoV-2 in this study could serve as potential vaccine candidates and induce an appreciable level of antibodies, subject to experimental validation. It is important to note that this study is based on the data currently available for the virus, and new information is continually being discovered.
If any new mutations are observed, they are not expected to affect the proposed vaccine design, provided that they do not occur in the studied regions [8]. The predicted CD8+ and CD4+ T-cell epitopes showed strong binding affinity for MHC-I and MHC-II, respectively. Besides the immunogenicity and antigenicity of the 18 CD8+ and 19 CD4+ T-cell epitopes, six of the CD8+ epitopes (RNRFLYIIK, LWLLWPVTL, ATRRIRGGD, ELLHAPATV, RVVVLSFEL and NLLLLFVTV) and four of the CD4+ epitopes (IGYYRRATRRIRGGD, RNRFLYIIKLIFLWL, SDFVRATATIPIQAS and VHFVCNLLLLFVTVY) demonstrated strong interaction with MHC-I and MHC-II. TLRs are a group of transmembrane receptors that assist with the detection of invading pathogens [68]; TLR3, in particular, detects dsRNA produced during the replication of DNA and RNA viruses [69]. TLR activation results in a series of steps that regulate the expression of cytokines, chemokines and type I IFNs [70]; characterization of the epitope-TLR interaction is therefore essential. Studies have associated this interaction with the immune-activating roles of TLRs [71], which allow a direct increase in IFN-γ production by antigen-specific CD8+ T-cells [72]. CD4+ T-cells are required for the induction of cytotoxic T-lymphocyte activation against minor histocompatibility antigens and some viruses. The interaction of TLR3 with the multi-epitope constructs generated in this study is expected to induce cytokines through the TLR pathway, promoting and enhancing the immune response against COVID-19.

Conclusion

Overall, this study provides 18 CD8+ T-cell epitopes, 19 CD4+ T-cell epitopes, 8 B-cell epitopes and multi-epitope-based vaccine candidates that are predicted to be highly antigenic, immunogenic and non-allergenic, with good population coverage, and that can assist in the experimental design of a suitable vaccine for COVID-19 (SARS-CoV-2). They need to be synthesized, tested and validated to meet the urgent need for the prevention of COVID-19.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

A novel coronavirus outbreak of global health concern
Emerging coronaviruses: Genome structure, replication, and pathogenesis
Pharmacologist's view of the new corona virus
Genomic variance of the 2019-nCoV coronavirus
The continuing 2019-nCoV epidemic threat of novel coronaviruses to global health - The latest 2019 novel coronavirus outbreak in Wuhan, China
Human Coronaviruses: A Review of Virus-Host Interactions
Reverse vaccinology approach to design a novel multi-epitope vaccine candidate against COVID-19: an in silico study
Preliminary Identification of Potential Vaccine Targets for the COVID-19 Coronavirus (SARS-CoV-2) Based on SARS-CoV Immunological Studies
Identification of an epitope of SARS-coronavirus nucleocapsid protein
Assessment of immunoreactive synthetic peptides from the structural proteins of severe acute respiratory syndrome coronavirus
Virus-specific memory CD8 T cells provide substantial protection from lethal severe acute respiratory syndrome coronavirus infection
Lack of Peripheral Memory B Cell Responses in Recovered Patients with Severe Acute Respiratory Syndrome: A Six-Year Follow-Up Study
Epitope based vaccine prediction for SARS-COV-2 by deploying immuno-informatics approach
Comparative computational analysis of SARS-CoV-2 nucleocapsid protein epitopes in taxonomically related coronaviruses
Design of multi epitope-based peptide vaccine against E protein of human 2019-nCoV: An immunoinformatics approach
Development of epitope-based peptide vaccine against novel coronavirus 2019 (SARS-COV-2): Immunoinformatics approach
The protein data bank
Multiple Sequence Alignment Using ClustalW and ClustalX
VaxiJen: a server for prediction of protective antigens, tumour antigens and subunit vaccines
Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes
Immunogenicity Prediction by VaxiJen: A Ten Year Overview
Understanding the immunogenicity and antigenicity of nanomaterials: Past, present and future
Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction
Gapped sequence alignment using artificial neural networks: Application to the MHC class I system
Prediction of MHC class II binding affinity using SMM-align, a novel stabilization matrix alignment method
Properties of MHC Class I Presented Peptides That Enhance Immunogenicity
In silico identification and characterization of common epitope-based peptide vaccine for Nipah and Hendra viruses
Immunoinformatic Analysis to Identify Proteins to Be Used as Potential Targets to Control Bovine Anaplasmosis
Overlapping CD8+ and CD4+ T-cell epitopes identification for the progression of epitope-based peptide vaccine from nucleocapsid and glycoprotein of emerging Rift Valley fever virus using immunoinformatics approach
Development of an epitope conservancy analysis tool to facilitate the design of epitope-based diagnostics and vaccines
Prediction of IL4 Inducing Peptides
Predicting population coverage of T-cell epitope-based diagnostics and vaccines
Prediction of continuous B-cell epitopes in an antigen using recurrent neural network
Improved protein structure prediction using predicted interresidue orientations
Crystal structure of an NK cell immunoglobulin-like receptor in complex with its class I MHC ligand
The pepATTRACT web server for blind, large-scale peptide-protein docking
VMD: Visual molecular dynamics
UCSF Chimera - A visualization system for exploratory research and analysis
The HADDOCK2.2 web server: user-friendly integrative modeling of biomolecular complexes
HADDOCK: a protein-protein docking approach based on biochemical or biophysical information
Solvated docking: introducing water into the modelling of biomolecular complexes
Antibody specific B-cell epitope predictions: leveraging information from antibody-antigen protein complexes
A combined view of B-cell epitope features in antigens
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2): An overview of viral structure and host response
Population coverage analysis of T-Cell epitopes of Neisseria meningitidis serogroup B from Iron acquisition proteins for vaccine design
Human leukocyte antigen diversity: A Southern African perspective
Human leukocyte antigen (HLA) diversity and clinical applications in South Africa
A Computational Approach for Identification of Epitopes in Dengue Virus Envelope Protein: A Step Towards Designing a Universal Dengue Vaccine Targeting Endemic Regions
In silico prediction of epitopes for Chikungunya viral strains
From ZikV genome to vaccine: in silico approach for the epitope-based peptide vaccine against Zika virus envelope glycoprotein
In silico-based vaccine design against Ebola virus glycoprotein
Immunoinformatics Approach for Epitope-Based Peptide Vaccine Design and Active Site Prediction against Polyprotein of Emerging Oropouche Virus
Immunoinformatics Approach for Designing an Epitope-Based Peptide Vaccine against Treponema pallidum Outer Membrane Beta-Barrel Protein
The immune epitope database and analysis resource in epitope discovery and synthetic vaccine design
Computational approach for predicting the conserved B-cell epitopes of hemagglutinin H7 subtype influenza virus
AllergenFP: allergenicity prediction by descriptor fingerprints
Structural Basis of T Cell Recognition
A novel HIV T helper epitope-based vaccine elicits cytokine-secreting HIV-specific CD4+ T cells in a Phase I clinical trial in HIV-uninfected adults
Approaching rational epitope vaccine design for hepatitis C virus with meta-server and multivalent scaffolding
A Sequence Homology and Bioinformatic Approach Can Predict Candidate Targets for Immune Responses to SARS-CoV-2
Overview of Immune Response During SARS-CoV-2 Infection: Lessons From the Past
Kinetics of SARS-CoV-2 specific IgM and IgG responses in COVID-19 patients
Persistence and decay of human antibody responses to the receptor binding domain of SARS-CoV-2 spike protein in COVID-19 patients
Viral epitope profiling of COVID-19 patients reveals cross-reactivity and correlates of severity
Profiling Early Humoral Response to Diagnose Novel Coronavirus Disease (COVID-19)
Analysis of SARS-CoV-2 Antibodies in COVID-19 Convalescent Blood using a Coronavirus Antigen Microarray
Sensing of viral infection and activation of innate immunity by toll-like receptor 3
A novel TLR3 inhibitor encoded by African swine fever virus (ASFV)
Toll-like receptor signaling pathways
Insights into the relationship between toll like receptors and gamma delta T cell responses
Human Effector CD8+ T Lymphocytes Express TLR3 as a Functional Coreceptor

The College postdoctoral fellowship awarded to the first author by the College of Agriculture, Engineering and Science, University of KwaZulu-Natal is gratefully appreciated. The authors also wish to acknowledge the Centre for High Performance Computing, South Africa, for the computer programs and facilities that were used in this project.

Supplementary data to this article can be found online at https://doi.org/10.1016/j.vaccine.2021.01.003.