key: cord-0769804-99xju8ik authors: Kaur, Navpreet; Singh, Rimaljot; Dar, Zahid; Bijarnia, Rakesh Kumar; Dhingra, Neelima; Kaur, Tanzeer title: Genetic comparison among various coronavirus strains for the identification of potential vaccine targets of SARS-CoV2 date: 2020-08-01 journal: Infect Genet Evol DOI: 10.1016/j.meegid.2020.104490 sha: 9e26907cb07971a7fd3905822f5e84c186ba3811 doc_id: 769804 cord_uid: 99xju8ik

The ongoing COVID-19 pneumonia pandemic has raised an urgent public health issue worldwide, affecting millions of people with a continuous increase in both morbidity and mortality. The causative agent has been identified and named SARS-CoV2 because of its genetic relatedness to the SARS-CoV species responsible for the 2003 coronavirus outbreak. The rapid spread of the disease over a very short period demands urgent development of therapeutic and prophylactic interventions for SARS-CoV2-infected patients. A large body of research is being conducted globally on this novel coronavirus strain to understand its origin, evolutionary history, and phylogeny. This review compares genetic similarities and differences among coronavirus strains, which can point to susceptible antigen targets of SARS-CoV2 and thereby to potential therapeutic and prophylactic interventions against this public health threat.

Coronaviruses (CoVs) are single-stranded, positive-sense RNA viruses belonging to the order Nidovirales, family Coronaviridae, and subfamily Coronavirinae (Spaan et al., 2012). CoVs possess the largest genomes among all RNA viruses, ranging from 26 to 32 kilobases in length, with G + C contents varying from 32% to 43% (Woo et al., 2007, 2009). CoVs are predominantly associated with enteric and respiratory diseases in animals and humans (Cheng et al., 2004; Gélinas et al., 2001).
The subfamily Coronavirinae is further divided into three major genera, or groups: the alpha-CoVs (group 1), the beta-CoVs (group 2), and the gamma-CoVs (group 3), characterized by differing genetic makeup and antigenic cross-reactivity (Cleri et al., 2010; Gorbalenya et al., 2004; Khan et al., 2020; Woo et al., 2010). Delta-CoVs, representing a novel genus of coronaviruses, were later found in birds and pigs. Of these, only the alpha-CoV strains (HCoV-229E and HCoV-NL63) and the beta-CoVs (HCoV-HKU1, HCoV-OC43, severe acute respiratory syndrome coronavirus [SARS-CoV], and Middle East respiratory syndrome coronavirus [MERS-CoV]) have been identified as human pathogenic strains (Cortellis, 2020).

Swine acute diarrhea from RBCs and thus infect other cells (Izaguirre, 2019). This furin cleavage site might have been acquired by RNA recombination, and its presence in SARS-CoV2 might be responsible for its ability to infect human cells. This cleavage site might also have allowed a bat CoV to jump into humans and thus initiate the ongoing COVID-19 pneumonia outbreak (Andersen et al., 2020; WO, 2020; Wrapp et al., 2020).

utilizes angiotensin-converting enzyme-2 (ACE-2) receptors (Hofmann et al., 2005) to enter the host cell, similar to SARS-CoV, which is otherwise a beta-CoV.

(Shults et al., 2012; Zaki et al., 2012) are two bat viruses included in this genus. The novel virus causing COVID-19 is also classified as a beta-CoV and has been named SARS-CoV2. Angiotensin-converting enzyme 2 (ACE2) (Gorbalenya et al., 2004; Woo et al., 2010) acts as one of the main receptors used by SARS-CoV to enter the host cell, whereas MERS-CoV utilizes dipeptidyl peptidase 4 (DPP4, also known as CD26) (Song et al., 2014) as its primary receptor.
Other than these, HCoV-OC43 and HCoV-HKU1 are two non-SARS CoV species included in this genus that probably use sialic acid residues as receptors and have hemagglutinin-esterase activity (Vlasak et al., 1988).

Mostly avian coronaviruses are included in this genus; the infectious bronchitis virus of chickens is the most prominent of them. This virus causes respiratory and reproductive tract disease in chickens.

The S gene and the region upstream of orf8 are the two regions of the viral genome suggested to be essential breakpoints for viral RNA recombination; the former encodes the spike (S) protein containing the receptor-binding domain (RBD), whereas the latter encodes an accessory protein (Hon et al., 2008; Hu et al., 2017; Wu et al., 2016). Frequent RNA recombination among CoVs and their immense genetic diversity, together with the prevalence of SARS-CoV, are likely among the reasons for the emergence of the novel coronavirus variant SARS-CoV2 and the subsequent COVID-19 outbreak. Andersen et al. (2020) have tried to explain the proximal origin of SARS-CoV2 by proposing two hypotheses for its emergence: first, natural selection in an animal host prior to zoonotic transfer, and second, natural selection in humans after zoonotic transfer (Andersen et al., 2020). Natural selection is a process that allows organisms to adapt to their environment by selectively reproducing suitable variations in their genotype or genetic constitution (Gregory, 2009).

2.1.1. Natural selection in an animal host prior to zoonotic transfer: The RBD region of the S protein in SARS-CoV2 is optimized for binding to human-like ACE2 receptors, and the most probable reason for this change is natural selection (Tao et al., 2020). As bat SARS-CoV-like coronaviruses share the closest homology to SARS-CoV2, bats are thought to be the natural reservoir of SARS-CoV2 (Tao et al., 2020).
Additionally, the genome of Rhinolophus affinis bat (P.

Furthermore, no strain of bat beta-CoV has been found to have a polybasic cleavage site like the one in the genome of the novel SARS-CoV2 strain. Also, no direct progenitor of SARS-CoV2 has been identified so far. Together, this evidence suggests that mutations, insertions, and deletions at the S1/S2 junction of the spike protein of SARS-CoV2 are responsible for the evolutionary changes (Andersen et al., 2020). An animal host with a high population density would be required for a precursor virus to acquire both the mutations and the polybasic cleavage site seen in the S protein of SARS-CoV2 (Andersen et al., 2020).

Another hypothesis for the acquisition of the above-mentioned genomic features, and consequently for the origin of SARS-CoV2, is that a progenitor of SARS-CoV2 carrying these genomic characteristics (the polybasic cleavage site and the mutations in the S protein) might have jumped into humans and spread through as yet undetected human-to-human transmission (Andersen et al., 2020; Tao et al., 2020). The strains of SARS-CoV2 sequenced so far carry the genomic characteristics described above, pointing towards a common progenitor from which they inherited these features. The RBD region in SARS-CoV2 shows features very similar to those of viral strains isolated from pangolins, suggesting that the virus that jumped to humans probably already contained this region and that the polybasic cleavage site insertion might have occurred during human-to-human transmission (Andersen et al., 2020). This was the case for MERS-CoV, where all human cases resulted from repeated jumps of the virus from dromedary camels, producing single infections or short transmission chains that eventually resolved without adaptation to sustained transmission (Dudas et al., 2018).
Retrospective serological studies and examination of banked human samples can be informative and help determine whether such cryptic spread has occurred (Andersen et al., 2020; Dudas et al., 2018; Wang et al., 2018).

Most evolutionary changes in organisms occur as a consequence of mutations. A mutation is an alteration of the nucleotide sequence in the genome of an organism; these variations provide the novelty observed in the course of evolution, and natural selection can act upon them (Baer, 2008). Usually, a small number of these variations are beneficial, some are inconsequential (neutral), and most are harmful for the organism. Whether an organism has a low or a high mutation rate, harmful mutations always outnumber beneficial ones (Loewe and Hill, 2010). The largest group of molecular parasites known to infect humans, animals, and plants are RNA viruses. Environmental alterations, together with the genetic plasticity of RNA viruses, favour the emergence of new RNA viruses; consequently, new viral pathogens come into contact with potential hosts, which facilitates host jumping (Morse and Schluederberg, 1990).

CoV mutations: RNA viruses usually have higher mutation rates than DNA viruses, up to a million times higher than those of their hosts; this is the reason for their enhanced adaptability and evolvability (Duffy, 2018). The mutation rate, or rate of mis-insertion errors during RNA replication, has been suggested to fall in the range of 10^-3 to 10^-5 substitutions per nucleotide (Domingo et al., 1988). These very high mutation rates yield offspring that differ by 1-2 mutations from their parent, forming a diverse mutant cloud of descendants (Vignuzzi and Andino, 2012). The absence or inefficiency of proof-reading by RNA polymerases is one of the factors contributing to such high mutation rates in these viruses.
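As a rough worked example of the figures above, the expected number of mis-insertions per genome copy is the per-nucleotide error rate multiplied by the genome length. The sketch below assumes a uniform, independent error rate at every position, which is a simplification, and uses illustrative genome lengths (10 kb and 30 kb) chosen here for the example; the function name is hypothetical and the numbers are not measurements.

```python
def expected_mutations_per_genome(rate_per_nt: float, genome_length_nt: int) -> float:
    """Expected mis-insertions per genome copy, assuming a uniform,
    independent error rate at every nucleotide (a simplification)."""
    if rate_per_nt < 0 or genome_length_nt < 0:
        raise ValueError("rate and genome length must be non-negative")
    return rate_per_nt * genome_length_nt


if __name__ == "__main__":
    # Illustrative genome lengths in nucleotides (assumptions for this sketch).
    for label, length in [("10 kb genome", 10_000), ("30 kb genome", 30_000)]:
        low = expected_mutations_per_genome(1e-5, length)
        high = expected_mutations_per_genome(1e-3, length)
        print(f"{label}: {low:.2f} to {high:.2f} expected mutations per copy")
```

Under these assumptions, an intermediate rate of 10^-4 on a 10 kb genome gives about one mutation per copy, which is in line with the 1-2 mutations per offspring quoted above.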
In contrast, coronaviruses, the RNA viruses with the largest known genomes, show lower mutation rates than other RNA viruses. The slower mutation rate, and hence the preservation of such large genomes, is probably associated with the exceptional characteristics of the CoV replication-transcription complex (RTC), which contains a 3′-5′ exoribonuclease activity that probably provides a proof-reading function unique to CoVs among RNA viruses (Minskaia et al., 2006). The mutation rate of RNA viruses usually lies in the range of 10^-6 to 10^-4 substitutions per nucleotide site per cell infection (Peck and Lauring, 2018), whereas studies suggest that the mutation rate of SARS-CoV falls between 0.80 and 2.38 × 10^-3 nucleotide substitutions per site per year, much lower than that of other RNA viruses (Zhao et al., 2004). Whole-genome sequence alignment of CoVs revealed 54% identity among the various CoV strains; alignment of the nsp-coding region alone shows 58% similarity, and that of the structural protein-coding region shows 43% similarity. This suggests that the non-structural proteins (nsps) form the conserved region of the genome, with high percent identity, whereas the structural proteins, which need to adapt to new hosts, are more diverse.

Mutation rates during RNA virus replication are much higher than those of DNA viruses. RNA virus genomes are usually small, less than 10 kb in length; CoV genomes, however, are considerably larger at approximately 30 kb, permitting easier accommodation and modification of genes (Vega et al., 2004). The slower mutation rate, and hence the preservation of such large genomes, is probably associated with the exceptional characteristics of the CoV RTC, which contains the 3′-5′ exoribonuclease activity of nsp14 as well as various other RNA processing enzymes (Lauber et al., 2013; Minskaia et al., 2006).
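Percent identity figures such as those quoted above are derived from aligned sequences. The review does not state which convention was used, so the following is only a minimal sketch of one common definition: the share of identical residues among alignment columns in which neither sequence has a gap. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gap columns (one of several possible conventions)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy pre-aligned sequences, invented for illustration only.
    print(percent_identity("AUGGC-UAAC", "AUGCCGUA-C"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so identity values from different tools are not always directly comparable.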
The 3′-5′ exoribonuclease activity probably provides a proof-reading function that is unique to CoVs among RNA viruses (Minskaia et al., 2006).

According to recent studies, the genome of SARS-CoV2 is believed to have originated from the Orthocoronavirinae subfamily (Table 2). The results of zPicture comparative genomic analyses of SARS-CoV2 and SARS-CoV have revealed extremely high homology between the two strains at the nucleotide level. The genomes of the two strains nevertheless differ from each other in six regions. The first three regions of difference belong to partial coding sequences of ORF1a/b (448 nt, 55 nt, and 278 nt, respectively). The next two regions belong to partial coding sequences of the S gene (315 nt and 80 nt, respectively), and the last region of difference is part of the coding sequence of the orf7b and orf8 genes (214 nt) (Jiabao Xu, 2014). The spike gene of SARS-CoV2 shows more homology to bat CoV, whereas two accessory genes, 3a and 8b, show homology to SARS-CoV. Proteomic similarity analyses of SARS-CoV and SARS-CoV2 have suggested that most of the proteins are highly homologous (95%-100%). RdRp and the 3CLpro protease share over 95% sequence similarity, even though at the genome level the two strains share only 82% similarity (Chan et al., 2020; Dong et al., 2020; Lu et al., 2020; Morse et al., 2020). Additionally, the two strains share 76% sequence similarity in their S proteins, with a highly conserved receptor-binding domain (RBD), a domain of the S protein (Chan et al., 2020; Dong et al., 2020; Lu et al., 2020; Morse et al., 2020). Also, the PLpro sequences of SARS-CoV and SARS-CoV2 share 83% similarity, with a large number of similar active sites (Morse et al., 2020). Together, this evidence suggests a common evolutionary history for the two viral strains. However, SARS-CoV2 possesses two proteins (orf8 and orf10) that share no homology with the SARS-CoV strain.
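To put the six regions of difference in proportion, their lengths can be tallied and compared with the genome size. The sketch below uses the region lengths quoted above; the region labels are shorthand introduced here, and the 30,000 nt genome length is an assumed round figure for illustration, not a value from the cited analysis.

```python
# Lengths (nt) of the six regions of difference as quoted in the text.
# The labels are shorthand for this sketch only.
DIVERGENT_REGIONS = [
    ("ORF1a/b region 1", 448),
    ("ORF1a/b region 2", 55),
    ("ORF1a/b region 3", 278),
    ("S gene region 1", 315),
    ("S gene region 2", 80),
    ("orf7b/orf8 region", 214),
]


def total_divergent_length(regions):
    """Sum the lengths of all listed regions."""
    return sum(length for _, length in regions)


def fraction_of_genome(regions, genome_length_nt):
    """Share of the genome covered by the listed regions."""
    if genome_length_nt <= 0:
        raise ValueError("genome length must be positive")
    return total_divergent_length(regions) / genome_length_nt


if __name__ == "__main__":
    assumed_genome_length = 30_000  # assumption: round figure for a CoV genome
    total = total_divergent_length(DIVERGENT_REGIONS)
    share = fraction_of_genome(DIVERGENT_REGIONS, assumed_genome_length)
    print(f"{total} nt in six regions, about {share:.1%} of the assumed genome")
```

Under this assumption the six regions add up to 1,390 nt, a small share of the genome, which is compatible with high overall homology alongside localized divergence.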
The conserved sequences of orf8 in SARS-CoV differ from the amino acid sequence of orf8 derived from SARS-CoV2. Since these two proteins of SARS-CoV2 show no homology to other CoV strains, it might be therapeutically beneficial to study the biological function of orf8 and orf10 in SARS-CoV2 (Chan et al., 2020). Additionally, the spike protein, 1,273 amino acids in length, was found to carry 27 amino acid substitutions, including six substitutions in amino acid region 357-528 in the RBD and another six in amino acid region 569-655 in the underpinning subdomain (SD). Furthermore, the C-terminal region of the receptor-binding subunit S1 domain was found to have four substitutions (Q560L, S570A, F572T, and S575A) (Guo et al., 2004; Wu et al., 2020). The receptor-binding motifs that interact with the human ACE-2 receptor were found to be identical to those of the SARS-CoV strain, with no amino acid substitutions, whereas six mutations occurred in the other RBD region (Ge et al., 2013). Because knowledge about SARS-CoV2 is still limited, the reasons for these amino acid substitutions are not yet known.

SARS-CoV2 is currently among the most intensively researched pathogens worldwide, as efficient vaccines and other therapeutic interventions are sought to cope with the current pandemic, which affects a great number of people. Many research groups around the globe are working to extract as much information as possible from the viral genome and to identify the most promising target for vaccine development.

(Robson, 2020). We therefore also carried out a sequence alignment for this particular region and found it to be completely conserved: we observed a 100% match between the SARS-CoV2 strain and the SARS bat RaTG13 strain, whereas the motif differs at 3 amino acid positions in the MERS strain.
For further visual confirmation of the level of genomic divergence and similarity between SARS-CoV2, SARS bat coronavirus RaTG13, and other common beta-CoV strains, via MSA of the S and N structural proteins, please refer to Figure 7.

The binding of the virus to the host receptor via the S protein marks the initiation of viral infection. As soon as the virus attaches to the host receptor, cleavage of the viral S protein into two subunits, S1 (N-terminal receptor-binding domain) and S2 (C-terminal domain), is triggered by host proteases (Huang et al., 2006; Qiu et al., 2006). The interaction of the S1 subunit with the host receptor plays a major role in determining the host range of CoVs (Kuo et al., 2000). This interaction also brings about a conformational change in the S2 subunit of the S protein, exposing the hidden fusion peptide, which aids viral entry through the host cell membrane. The conformational change leads to the formation of a six-helix bundle fusion core that brings the viral and host cell membranes into close proximity, fusing the lipid bilayers. This fusion ensures the entry of the viral nucleocapsid into the host cell cytoplasm (Fung and Liu, 2014; Masters, 2006). Once inside the host cell, the viral nucleocapsid is uncoated, revealing the viral genomic RNA. This genomic RNA then acts as mRNA for translation of the replicase polyprotein. The replicase gene contains two open reading frames, ORF1a and ORF1b. Their translation results in polyprotein 1a (pp1a) and the larger polyprotein 1ab (pp1ab), respectively (Brierley et al., 1987). Newly synthesized pp1a and pp1ab undergo autoproteolytic cleavage. Cleavage of pp1a produces 11 non-structural proteins (nsps), nsp1-nsp11, and cleavage of pp1ab yields 15 nsps (nsp1-nsp10 and nsp12-nsp16).
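The nsp bookkeeping above can be made explicit with a small sketch that lists the cleavage products of each polyprotein as described in the text and checks the counts. The helper and variable names are hypothetical, and the lists only restate the numbering given above rather than deriving it from any sequence data.

```python
def nsp_names(numbers):
    """Build labels such as 'nsp1' from an iterable of integers."""
    return [f"nsp{n}" for n in numbers]


# Cleavage products as stated in the text.
PP1A_NSPS = nsp_names(range(1, 12))                               # nsp1-nsp11
PP1AB_NSPS = nsp_names(list(range(1, 11)) + list(range(12, 17)))  # nsp1-nsp10, nsp12-nsp16

shared = [n for n in PP1A_NSPS if n in PP1AB_NSPS]          # produced from both
only_pp1a = [n for n in PP1A_NSPS if n not in PP1AB_NSPS]   # pp1a only
only_pp1ab = [n for n in PP1AB_NSPS if n not in PP1A_NSPS]  # pp1ab only
distinct = shared + only_pp1a + only_pp1ab

if __name__ == "__main__":
    print(len(PP1A_NSPS), "nsps from pp1a;", len(PP1AB_NSPS), "nsps from pp1ab")
    print("shared:", ", ".join(shared))
    print("pp1a only:", ", ".join(only_pp1a))
    print("pp1ab only:", ", ".join(only_pp1ab))
    print("distinct nsps overall:", len(distinct))
```

Taken together, the two polyproteins account for 16 distinct nsps: nsp1-nsp10 arise from both, nsp11 only from pp1a, and nsp12-nsp16 only from pp1ab.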
The functions of many nsps are not completely understood; however, the papain-like protease in nsp3 and the main protease in nsp5 are responsible for the autoproteolytic cleavage of pp1a and pp1ab, whereas the RNA-dependent RNA polymerase (RdRp) is contained within nsp12 (Baker et al., 1993). After translation, the replicase uses the viral genomic RNA as a template to synthesize negative-sense genomic RNAs, which later serve as templates for the synthesis of progeny positive-sense genomic RNAs. Additionally, the replicase synthesizes a nested set of subgenomic RNAs (sgRNAs) via a discontinuous transcription process (Sawicki et al., 2007). The processes of

The endoplasmic reticulum (ER) is one of the main organelles involved in the synthesis and proper folding of newly synthesized proteins, and thus in the growth of viral infection (GM, 2000). However, under certain circumstances, the protein-folding capacity of the ER may not be adequate to properly fold all newly synthesized proteins, which can result in the accumulation of unfolded proteins in the ER, leading to ER stress. The unfolded protein response (UPR) is the signaling pathway that cells have evolved to combat ER stress and maintain cellular homeostasis (Shapiro et al., 2017; Ron and Walter, 2007). The accumulation of unfolded proteins in the ER triggers the activation of three ER stress transducers, namely PKR-like ER protein kinase (PERK), activating transcription factor 6 (ATF6), and inositol-requiring protein 1 (IRE1) (Tabas and Ron, 2011). The activation of these transducers in turn initiates UPR signaling.
Once activated, this signaling pathway tries to restore cellular homeostasis by increasing the rate of proper protein folding and decreasing protein synthesis in the ER; if homeostasis cannot be achieved, the UPR activates apoptotic pathways for the benefit of the organism (Fung and Liu, 2014; Tabas and Ron, 2011). Studies have reported that SARS-CoV-infected cells exhibit upregulated expression of genes associated with ER stress, namely glucose-regulated protein 94 (GRP94) and glucose-regulated protein 78 (GRP78) (Jiang et al., 2003; Yeung et al., 2008). Moreover, current evidence suggests three chief phenomena involved in the induction of ER stress by CoVs: formation of double-membrane vesicles (DMVs), glycosylation of CoV structural proteins, and depletion of ER lipids (Fung and Liu, 2014).

induce the formation of DMVs. Cellular membrane modifications are suggested to occur during the replication of various plus-strand RNA viruses, CoVs among them (David-Ferreira and Manakar, 1965). According to electron microscopy data, DMVs are probably located at sites near RTCs, in the vicinity of major CoV replicase proteins (Gosert et al., 2002; Snijder et al., 2006). The source of DMVs is not yet clear, but late endosomes, autophagosomes, and the early secretory pathway are thought to be their origin (Prentice et al., 2004; van der Meer et al., 1999; Verheije et al., 2008). A study by Reggiori et al. (2010) suggested that, to form DMVs, CoVs hijack EDEMosomes to obtain ER membrane. EDEMosomes are COPII-independent vesicles, usually found in the ER, that are responsible for maintaining the level of mannosidase alpha-like 1 (EDEM1), a regulator of ER-linked degradation (Calì et al., 2008). The above evidence thus points towards an ER origin of CoV-induced DMVs (Reggiori et al., 2010).
Protein glycosylation is an integral part of protein folding in the ER. The reaction involves the addition of a carbohydrate moiety to the protein molecule (Roth et al., 2012). With the exception of the N protein, all structural proteins are synthesized in large amounts in the ER. In CoVs, two types of protein glycosylation, distinguished by the amino-acid side-chain atoms to which the glycans are attached, take place in the ER: O-linked (in beta-CoVs) and N-linked (in alpha- and gamma-CoVs) (Cavanagh, 2007; Jacobs et al., 1986; Nal et al., 2005). The M protein is one of the most abundant proteins in CoVs, and its glycosylation is associated with the induction of alpha interferon (IFN) as well as with in vivo tissue tropism (Charley and Laude, 1988; De Haan et al., 2003; Laude et al., 1992). The S protein is also highly glycosylated (Masters, 2006); moreover, the glycans on the S protein of SARS-CoV have been shown to interact with two alternative receptors of SARS-CoV (independent of the major ACE-2 receptor), namely DC-SIGN (dendritic cell-specific intercellular adhesion molecule-3-grabbing non-integrin) and L-SIGN (liver/lymph node-specific intercellular adhesion molecule-3-grabbing non-integrin) (Han et al., 2007).

Chaperones are proteins present inside the ER that assist in proper protein folding, maturation, and assembly. S proteins are highly dependent on calnexin, an ER chaperone; studies have suggested that the S2 subunit of the SARS-CoV S glycoprotein interacts with calnexin, and a decrease in the infectivity of pseudotyped lentivirus carrying the SARS-CoV S protein was observed after knocking out calnexin (Fukushi et al., 2012). Knocking out chaperones will therefore hamper proper protein folding and, consequently, the assembly of virions, thus decreasing viral spread and infectivity (Fung and Liu, 2014).

After synthesis and folding in the ER, the proteins are transported to the ERGIC for virion assembly. Mature virions are released by exocytosis, leaving behind an ER with depleted lipid levels (Fung and Liu, 2014).
The above-mentioned factors play a vital role in the generation of ER stress, which in turn activates the UPR pathway, bringing stress response factors such as PERK, IRE1, and ATF6 into action (Figure 5). Although the detailed mechanism of molecular interactions between CoV S proteins and the stress factors (PERK/IRE1/ATF6) has not been determined, the layout of the CoV-induced UPR activation pathways is outlined below.

CoV-induced activation of the PERK pathway: PERK is the first branch of the UPR pathway to be activated after ER stress (Szegezdi et al., 2006). Activation of the PERK (protein kinase RNA-like endoplasmic reticulum kinase) pathway results in phosphorylation of the α-subunit of eukaryotic initiation factor 2 (eIF2α), which regulates the mRNA translation machinery, bringing protein translation to a halt (Ron and Walter, 2007). Additionally, the transcription factor GADD153 is responsible for the initiation of CoV-induced apoptosis, and its expression was found to be up-regulated during the late stages of CoV infection (Fung and Liu, 2014; Marciniak et al., 2004; Puthalakath et al., 2007).

CoV-induced activation of the IRE1 pathway: Inositol-requiring kinase 1 (IRE1) self-activates its free luminal domain by homodimerization and trans-autophosphorylation. The activated domain, in turn, activates the mRNA of the transcription factor XBP1 (X-box binding protein 1) by splicing out a 26 bp intron (Calfon et al., 2002; Yoshida et al., 2001). Activated XBP1 then induces UPR gene expression (Bechill et al., 2008; Versteeg et al., 2007). Besides, apoptosis (Fung and Liu, 2014; Urano et al., 2000).

CoV-induced activation of the ATF6 pathway: Activating transcription factor 6 (ATF6) is a basic leucine zipper transcription factor.
According to studies, CoV infection leads to activation of the ATF6 pathway, which in turn results in the up-regulation of ER chaperone proteins to combat ER stress (Fung and Liu, 2014; Sung et al., 2009).

A vaccine is a biological preparation designed to protect humans from viral and bacterial infections. Vaccines stimulate the production of antibodies in the human body before disease develops, in the same manner as antibodies are produced after an individual is exposed to the pathogen (Siegrist, 2013). In the last decade, vaccine development technology has evolved significantly, involving the formulation of cell-culture based vaccines (e.g., the Flucelvax Tetra cell-based vaccine (Bühler and Ramharter, 2019))

There is limited knowledge of how the human immune system reacts naturally against SARS-CoV2. Because SARS-CoV2 displays a high level of homology to SARS-CoV, the target epitopes used for SARS-CoV vaccine development might prove effective for SARS-CoV2 as well (Hoffmann et al., 2020; Letko et al., 2020; Lu et al., 2020). Despite the high level of similarity between the two, there are certain genetic variations as well. Studies reveal that only 16% of B cell epitopes and 23% of T cell epitopes of SARS-CoV map identically to SARS-CoV2 (Ahmed et al., 2020). Key SARS-CoV2 target antigenic sites for vaccine development (Table 3) are listed below:

identified surface S glycoprotein of CoVs to be the most ideal target for vaccine development (Du et al., 2009; Schindewolf and Menachery, 2019). During the interaction of the host cell receptor with the S glycoprotein of the CoV, the S protein undergoes cleavage directed by host cell proteases and divides into two subunits, S1 and S2 (Huang et al., 2006; Kuo et al., 2000).
The S2 subunit then undergoes a conformational change to reveal the hidden fusion peptide, which aids fusion with the host cell membrane and thus makes a path for viral RNA to enter the host cell cytoplasm (Masters, 2006). Since this surface glycoprotein (S protein) plays the most significant role in the initiation and spread of CoV infection inside an organism, the S protein serves as the most vulnerable target for vaccine development against CoVs (Wrapp et al., 2020). Developing vaccines against the full-length S protein is beneficial, as full-length proteins can provide more target epitopes while maintaining proper protein conformation, and consequently higher immunogenicity (Pallesen et al., 2017). To date, Clover Biopharmaceuticals is reported to have developed a SARS-CoV-2 S protein trimer (S-Trimer) vaccine using its patented Trimer-Tag© technology, which will be launched within the next 4-5 weeks (Biopharmaceuticals, 2020).

Other than full-length protein-based vaccines, specific regions of the S protein such as the S2 subunit, RBD, NTD, and FP also serve as potential targets for vaccine development (Figure 6). Studies reveal that B cell epitopes in the S2 subunit of the SARS-CoV2 S protein map identically to SARS-CoV; epitopes of this region might therefore prove to be promising candidates for inducing a protective antibody response, and thus for vaccine development. Preliminary studies to confirm the suitability of the S2 subunit for vaccine development have already been carried out, and the results indicate that these epitopes are promising in generating cross-reactive and neutralizing antibodies (Ahmed et al., 2020).

The RBD is present in the S1 subunit of SARS-CoV2 and aids in binding the host cell receptor and then, with the help of the S2 subunit, in fusing the membrane of the host cell to the virus.
SARS-CoV2 binds to the same host cell receptor, ACE-2 (angiotensin-converting enzyme-2), as SARS-CoV (Brielle et al., 2020; Lan et al., 2020). Therefore, like the S2 subunit of the spike, the RBD is also a promising target region for the development of vaccines against SARS-CoV2. According to a recent study, SARS-CoV RBD-specific antibodies were observed to cross-react with the SARS-CoV2 protein; additionally, SARS-CoV2 was found to be cross-neutralized by SARS-CoV RBD-induced antisera, suggesting that SARS-CoV RBD-based vaccines have the potential to prevent SARS-CoV2 infection (Tai et al., 2020). In addition, manipulation of the ACE-2 receptor-binding domain could prove to be an effective therapy, as it would prevent viral entry into the host cell.

In several CoV species, the NTD, like the RBD, is reported to have carbohydrate receptor-binding activity. Thus, the NTD also serves as a candidate target region for vaccine development. One study demonstrated that an antibody binding to the NTD of the MERS-CoV S1 subunit cross-neutralized the wild-type strain EMC of MERS-CoV (2012) (Chen et al., 2017), indicating the potential of the spike NTD region as an antigen target. However, little is known so far about the function of the S1-NTD of SARS-CoV2; a better understanding might reveal its interaction with certain receptors and make it a potential target for vaccine development.

The membrane (M) protein is the most abundant protein present on the surface of SARS-CoV2 and is involved in virion assembly in the cells. According to immunogenic and structural analyses, the M protein harbours a T cell epitope cluster that has the potential to generate a strong cellular immune response (van der Meer et al., 1999), indicating the potential of the M protein to serve as a target for vaccine development (Jin et al., 2005; Oh et al., 2012).
Envelope (E) glycoproteins are small proteins composed of 75 amino acids in SARS-CoV2. The E proteins of CoVs, along with the M proteins, play a crucial role in virion morphogenesis and assembly within the cell (Siu et al., 2008). The potential of live attenuated vaccines based on recombinant SARS-CoV and MERS-CoV with a mutated E protein has been explored earlier (Graham et al., 2013; Schoeman and Fielding, 2019; Shang et al., 2020). Moreover, the E protein is believed to be a chief virulence factor, as knock-out of this protein results in reduced secretion of major inflammatory factors, namely IL-1, TNF, and IL-6 (Nieto-Torres et al., 2014).

The nucleocapsid (N) protein is an integral protein of SARS-CoV2 that aids in the formation of the ribonucleoprotein complex, also known as the capsid, by packaging the viral genome inside the viral envelope. The formation of the capsid is important for viral self-assembly and replication (Jin et al., 2005; S. Li et al., 2020). According to one study, 89% of SARS-infected patients were able to produce antibodies to this antigen, indicating the high antigenicity of the N protein (Leung et al., 2004). However, previous studies report highly variable observations regarding the potential of the N protein as a target antigen for vaccine development. According to the results of a previous study, the SARS-CoV N protein DNA vaccine was able to induce the production of antibodies in vaccinated C57BL/6 mice and was thus used for the treatment of vaccinia virus (Kim et al., 2004). In contrast, a recent study on SARS-CoV2 claims that the N protein is unsuitable for vaccine development because the antibodies induced by the N protein of SARS-CoV2 were found to be incapable of providing immunity to SARS-CoV infection (Gralinski and Menachery, 2020; Shang et al., 2020).
The recent pandemic novel coronavirus, which emerged from Wuhan, China, has caused widespread fear and concern and has raised a global public health security alarm (Chan et al., 2020; Chen et al., 2020; Gralinski and Menachery, 2020; Malik et al., 2020; Ren et al., 2020; Zaki et al., 2012; S

Preliminary identification of potential vaccine targets for the COVID-19 Coronavirus (SARS-CoV-2) Based on SARS-CoV Immunological Studies
Polymerases of Coronaviruses: Structure, Function, and Inhibitors
The proximal origin of SARS-CoV-2
Does mutation rate depend on itself?
The SARS-coronavirus papain-like protease: Structure, function and inhibition by designed antiviral compounds
Identification of the catalytic sites of a papain-like cysteine proteinase of murine coronavirus
Coronavirus Infection Modulates the Unfolded Protein Response and Mediates Sustained Translational Repression
Clover Initiates Development of Recombinant Subunit-Trimer Vaccine for Wuhan Coronavirus
Genomic Characterization of a Newly Discovered Coronavirus
Coronavirus genome structure and replication
The SARS-CoV-2 exerts a distinctive strategy for interacting with the ACE2 human receptor
An efficient ribosomal frame-shifting signal in the polymerase-encoding region of the coronavirus IBV
Flucelvax Tetra: A surface antigen, inactivated, influenza vaccine prepared in cell cultures
IRE1 couples endoplasmic reticulum load to secretory capacity by processing the XBP-1 mRNA
Segregation and rapid turnover of EDEM1 by an autophagy-like mechanism modulates standard ERAD and folding activities
Coronavirus avian infectious bronchitis virus
Genomic variance of the 2019-nCoV coronavirus
Spike Protein, S, of Human Coronavirus HKU1: Role in Viral Life Cycle and Application in Antibody Detection
Genomic characterization of the 2019 novel human-pathogenic coronavirus isolated from a patient with atypical pneumonia after visiting Wuhan
Induction of alpha interferon by transmissible gastroenteritis coronavirus: role of transmembrane glycoprotein E1
Emerging coronaviruses: Genome structure, replication, and pathogenesis
Erratum: A novel neutralizing monoclonal antibody targeting the N-terminal domain of the MERS-CoV spike protein
Viral Replication in the Nasopharynx Is Associated with Diarrhea in Patients with Severe Acute Respiratory Syndrome
Severe Acute Respiratory Syndrome (SARS)
Disease Briefing: Coronaviruses. Dis. Brief. Coronaviruses
Human aminopeptidase N is a receptor for human coronavirus 229E
Evaluation of modified vaccinia virus Ankara based recombinant SARS vaccine in ferrets
An Electron Microscope Study of the Development of a Mouse Hepatitis Virus in Tissue Culture Cells
Anticipatory UPR Activation: A Protective Pathway and Target in Cancer
Monitoring of S Protein Maturation in the Endoplasmic Reticulum by Calnexin Is Important for the Infectivity of Severe Acute Respiratory Syndrome Coronavirus
Coronavirus infection, ER stress, apoptosis and innate immunity
Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor
Bovine coronaviruses associated with enteric and respiratory diseases in Canadian dairy cattle display different reactivities to anti-HE monoclonal antibodies and distinct amino acid changes in their HE, S and ns4.9 protein
The Cell: A Molecular Approach
Severe Acute Respiratory Syndrome Coronavirus Phylogeny: toward Consensus
RNA Replication of Mouse Hepatitis Virus Takes Place at Double-Membrane Vesicles
SARS-Coronavirus ancestor's foot-prints in South-East Asian bat colonies and the refuge theory
A decade after SARS: Strategies for controlling emerging coronaviruses
Return of the Coronavirus: Viruses
Understanding Natural Selection: Essential Concepts and Common Misconceptions 156-175
SARS corona virus peptides recognized by antibodies in the sera of convalescent cases
Specific Asparagine-Linked Glycosylation Sites Are Critical for DC-SIGN- and L-SIGN-Mediated Severe Acute Respiratory Syndrome Coronavirus Entry
The novel coronavirus 2019 (2019-nCoV) uses the SARS-coronavirus receptor ACE2 and the cellular protease TMPRSS2 for entry into target cells
Human coronavirus NL63 employs the severe acute respiratory syndrome coronavirus receptor for cellular entry
Evidence of the Recombinant Origin of a Bat Severe Acute Respiratory Syndrome (SARS)-Like Coronavirus and Its Implications on the Direct Ancestor of SARS Coronavirus
Discovery of a rich gene pool of bat SARS-related coronaviruses provides new insights into the origin of SARS coronavirus
SARS coronavirus, but not human coronavirus NL63, utilizes cathepsin L to infect ACE2-expressing cells
Biotechnology and the transformation of vaccine innovation: The case of the hepatitis B vaccines
The Proteolytic Regulation of Virus Cell Entry by Furin and Other Proprotein Convertases
Characterization and translation of transmissible gastroenteritis virus mRNAs
Systematic Comparison of Two Animal-to-Human Transmitted Human Coronaviruses: SARS-CoV-2 and SARS-CoV
Phosphorylation of the α Subunit of Eukaryotic Initiation Factor 2 Is Required for Activation of NF-κB in Response to Diverse Cellular Stresses
Induction of Th1 type response by DNA vaccinations with N, M, and E genes against SARS-CoV in mice
Coronaviruses and the human airway: A universal system for virus-host interaction studies
Coronaviruses: Emerging and re-emerging pathogens in humans and animals Susanna Lau Emerging viruses
COVID-19 infection: origin, transmission, and characteristics of human coronaviruses
Generation and Characterization of DNA Vaccines Targeting the Nucleocapsid Protein of Severe Acute Respiratory Syndrome Coronavirus
Characterization of the budding compartment of mouse hepatitis virus: Evidence that transport from the RER to the Golgi complex requires only one vesicular transport step
Coronavirus by Substitution of the Spike Glycoprotein Ectodomain: Crossing the Host Cell Species Barrier
Genetic diversity of coronaviruses in bats in Lao PDR and Cambodia
Coronavirus: Organization, Replication And Expression Of Genome
Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor
Severe acute respiratory syndrome coronavirus-like virus in Chinese horseshoe bats
The Footprint of Genome Architecture in the Largest Genome Expansion in RNA Viruses
Single amino acid changes in the viral glycoprotein M affect induction of alpha interferon by the coronavirus transmissible gastroenteritis virus
Functional assessment of cell entry and receptor usage for SARS-CoV-2 and other lineage B betacoronaviruses
Antibody Response of Patients with Severe Acute Respiratory Syndrome (SARS) Targets the Viral Nucleocapsid
Coronavirus infections and immune responses
The Epitope Study on the SARS-CoV Nucleocapsid Protein Shuting
Bats Are Natural Reservoirs of SARS-like Coronaviruses Published by
Evolution, antigenicity and pathogenicity of global porcine epidemic diarrhea virus strains
The population genetics of mutations: Good, bad and indifferent
Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
A Novel Coronavirus from Patients with Pneumonia in China
CHOP induces death by promoting protein synthesis and oxidation in the stressed endoplasmic reticulum
The Molecular Biology of Coronaviruses
Trypsin Treatment Unlocks Barrier for Zoonotic Bat Coronavirus Infection
Discovery of an RNA virus 3′→5′ exoribonuclease that is critically involved in coronavirus RNA synthesis
From the National Institute of Allergy and Infectious Diseases, the Fogarty International Center of the National Institutes of Health, and the Rockefeller University.
Emerging viruses: the evolution of viruses and viral diseases
Differential maturation and subcellular localization of severe acute respiratory syndrome coronavirus surface proteins
Genetic Predisposition To Acquire a Polybasic Cleavage Site for Highly Pathogenic Avian Influenza Virus Hemagglutinin Naganori
Severe Acute Respiratory Syndrome Coronavirus Envelope Protein Ion Channel Activity Promotes Virus Fitness and Pathogenesis
Understanding the T cell immune response in SARS coronavirus infection
Localization and Membrane Topology of Coronavirus Nonstructural Protein 4: Involvement of the Early Secretory Pathway in Replication
Immunogenicity and structures of a rationally designed prefusion MERS-CoV spike antigen
Complexities of Viral Mutation Rates
Coronavirus Replication Complex Formation Utilizes Components of Cellular Autophagy
ER Stress Triggers Apoptosis by Activating BH3-Only Protein Bim
Endosomal Proteolysis by Cathepsins Is Necessary for Murine Coronavirus Mouse Hepatitis Virus Type 2 Spike-Mediated Entry
Coronaviruses hijack the LC3-I-positive ER-derived vesicles exporting short-lived ERAD regulators, for replication
Identification of SARS-like coronaviruses in horseshoe bats (Rhinolophus hipposideros) in Slovenia
COVID-19 Coronavirus spike protein analysis for synthetic vaccines, a peptidomimetic antagonist, and therapeutic drugs, and analysis of a proposed achilles' heel conserved region to minimize probability of escape mutations and drug resistance
Signal integration in the endoplasmic reticulum unfolded protein response
Review Article Identification and Quantification of Protein Glycosylation
A Contemporary View of Coronavirus Transcription
Middle east respiratory syndrome vaccine candidates: Cautious optimism
Coronavirus envelope protein: Current knowledge
The outbreak of SARS-CoV-2 pneumonia calls for viral vaccines
Severe Respiratory Illness Associated Coronavirus
Vaccines -CH 2 How Do Vaccines Mediate Protection? Vaccines 16-23
The M, E, and N Structural Proteins of the Severe Acute Respiratory Syndrome Coronavirus Are Required for Efficient Assembly, Trafficking, and Release of Virus-Like Particles
Ultrastructure and Origin of Membrane Vesicles Associated with the Severe Acute Respiratory Syndrome Coronavirus Replication Complex
Identification of residues on human receptor DPP4 critical for MERS-CoV binding and entry
Family Coronaviridae
Rapid Evolution Of RNA Viruses
Epidemiology, Genetic Recombination, and Pathogenesis of Coronaviruses
The 8ab protein of SARS-CoV is a luminal ER membrane-associated protein and induces the activation of ATF6
Mediators of endoplasmic reticulum stress-induced apoptosis
Integrating the mechanisms of apoptosis induced by endoplasmic reticulum stress
Characterization of the receptor-binding domain (RBD) of 2019 novel coronavirus: implication for development of RBD protein as a viral attachment inhibitor and vaccine
A new coronavirus associated with human respiratory disease in China
Detection of novel SARS-like and other coronaviruses in bats from Kenya
Coupling of stress in the ER to activation of JNK protein kinases by transmembrane protein kinase IRE1. Science (80-. )
Localization of Mouse Hepatitis Virus Nonstructural Proteins and RNA Synthesis Indicates a Role for Late Endosomes in Viral Replication
Emerging WuHan (COVID-19) coronavirus: glycan shield and structure prediction of spike glycoprotein and its interaction with human CD26
Mutational dynamics of the SARS coronavirus in cell culture and human populations isolated in 2003
Mouse hepatitis coronavirus RNA replication depends on GBF1-mediated ARF1 activation
The Coronavirus Spike Protein Induces Endoplasmic Reticulum Stress and Upregulation of Intracellular Chemokine mRNA Concentrations
Closing the gap: The challenges in converging theoretical, computational, experimental and real-life studies in virus evolution
Human and bovine coronaviruses recognize sialic acid-containing receptors similar to those of influenza C viruses
Diversity of coronavirus in bats from Eastern Thailand Emerging viruses
Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein
Longitudinal surveillance of SARS-like coronaviruses in bats by quantitative realtime PCR
Serological Evidence of Bat SARS-Related Coronavirus Infection in Humans
Sequence Analysis Indicates that 2019-nCoV Virus Contains a Putative Furin Cleavage Site at the Boundary of S1 and S2 Domains of Spike Protein 1-9
Coronavirus genomics and bioinformatics analysis
Characterization and Complete Genome Sequence of a Novel Coronavirus, Coronavirus HKU1, from Patients with Pneumonia
Comparative Analysis of Complete Genome Avian Coronaviruses Reveals a Novel Group 3c Coronavirus
Discovery of Seven Novel Mammalian and Avian Coronaviruses in the Genus Deltacoronavirus Supports Bat Coronaviruses as the Gene Source of Alphacoronavirus and Betacoronavirus and Avian Coronaviruses as the Gene Source of Gammacoronavirus and Deltacoronavi
Coronaviruses Reveals Unique Group and Subgroup Features
Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation
Genome Composition and Divergence of the Novel Coronavirus (2019-nCoV) Originating in China
ORF8-related genetic evidence for Chinese horseshoe bats as the source of human severe acute respiratory syndrome coronavirus
Proteolytic Activation of the Spike Protein at a Novel RRRR/S Motif Is Implicated in Furin-Dependent Entry
Acquisition of cell-cell fusion activity by amino acid substitutions in spike protein determines the infectivity of a coronavirus in cultured cells
Transcriptional profiling of Vero E6 cells over-expressing SARS-CoV S2 subunit: Insights on viral regulation of apoptosis and proliferation
XBP1 mRNA is induced by ATF6 and spliced by IRE1 in response to ER stress to produce a highly active transcription factor
Receptor recognition by novel coronavirus from Wuhan: An analysis based on decade-long structural studies of SARS
Progress and Prospects on Vaccine Development against SARS-CoV-2. Vaccines
Moderate mutation rate in the SARS coronavirus genome and its implications
SARS-CoV-2: an Emerging Coronavirus that Causes a Global Threat 16
A novel bat coronavirus reveals natural insertions at the S1/S2 cleavage site of the Spike protein and a possible recombinant origin of HCoV-19
Fatal swine acute diarrhoea syndrome caused by an HKU2-related coronavirus of bat origin

The authors duly acknowledge the support of the Indian Council of Medical Research (ICMR), New Delhi, India, for providing a fellowship to Ms. Navpreet Kaur (Grant number 5/4-5/188/Neuro/2019-NCD-1), and of Panjab University, Chandigarh, for providing essential research facilities.

The authors declare no conflicts of interest.