key: cord-1030424-xdfnqb6r authors: Pagadala, Nataraj Sekhar; Landi, Abdolamir; Maturu, Paramahamsa; Tuszynski, Jack title: In silico identification of RBD subdomain of spike protein from Pro(322)-Thr(581) for applications in vaccine development against SARS-CoV2 date: 2021-04-30 journal: J Mol Struct DOI: 10.1016/j.molstruc.2021.130534 sha: e51e12864aa6cee9b7b0f88bff716df0d3548739 doc_id: 1030424 cord_uid: xdfnqb6r The three-dimensional hybrid structures of coronavirus spike proteins, including the C-terminal sequence and receptor binding motif (RBM), were remodeled and energy minimized. Further, protein-protein docking shows that the receptor binding domain (RBD) of SARS-CoV 2, Lys(457)-Pro(490), binds on the surface of the ACE2 receptor near the N-terminal helices to form the host-pathogen attachment. In this binding interface, SARS-CoV 2 shows a tighter network of hydrogen bonds than the other spike proteins from BtRsRaTG13-CoV, SARS-CoV, BtRsBeta-CoV, BtRsCoV-related, Pangolin-CoV (PCoV), human-CoV (hCoV), MERS-CoV (MCoV), Avian-CoV (ACoV) and PEDV1-CoV. Further studies show that the subdomains SARS-CoV 2 RBD Pro(322)-Thr(581), SARS-CoV RBD Pro(309)-Pro(575), BtRsRaTG13 RBD Thr(581)-Thr(323), BtRsBeta-CoV RBD Ser(311)-Thr(568), BtRsCoV-related Arg(306)-Pro(575) and PCoV RBD Gln(319)-Ser(589) show binding conformations with ACE2 like those of their full-length spike proteins. In addition, the subdomains MCoV RBD Gly(372)-Val(616), ACoV RBD Gly(372)-Val(616) and PEDV1-CoV RBD Ala(315)-Tyr(675) also bind on the surface of ACE2 similarly to their full-length spike proteins. B-cell epitope mapping also identified the main antigenic determinants, predicting that these nine subdomains are highly useful in recombinant vaccine development for inducing cross-neutralizing antibodies against the SARS-CoV 2 spike protein and inhibiting its attachment to ACE2.
[26], the interface details for Spike/ACE2 elucidated that SARS-CoV 2 transmissibility is due to efficient use of ACE2 as a key determinant at the atomic level [27] [28]. Regardless of strict health measures such as social distancing, lockdown of businesses and recreation centres, and flight, travel, and tourism bans in many parts of the world, the high transmissibility of the virus still results in a significant number of infected cases around the world, which makes a fatality rate of 2% a very significant loss. To tackle this crisis, scientists have pursued two major paths: first, developing a vaccine to control transmission and spread of the infection, and second, manufacturing antivirals to treat infected cases. As of now, more than 10 vaccines are approved for SARS-CoV 2, while over 250 teams are still working to develop vaccines against the virus using different methods (https://www.who.int/publications/m/item/draft-landscape-of-covid-19-candidate-vaccines). These include the development of inactivated/weakened virus particles, nucleic acid (DNA or RNA) vaccines, replicating and non-replicating viral vectors, and protein-based vaccines including recombinant subunit proteins or virus-like particles [29]. Although the non-protein vaccines may help with the urgent need to protect at-risk populations, previous vaccine experience suggests that a recombinant protein-based vaccine would likely be the most efficient and safest vaccine for long-term prophylactic use in the public. Current evidence almost unanimously recommends the spike protein as the best candidate for developing an optimal vaccine with respect to humoral and cellular immune responses. Since antibody-dependent enhancement is also a potential concern for a SARS-CoV 2 vaccine, it is reasonable to pick as small a part of the spike protein as possible, one that is a critical target, to be used as a vaccine.
In this study, we have studied the spike protein from 10 different coronaviruses of animals and humans, including SARS-CoV and SARS-CoV 2, to pinpoint the most critical region of the S protein to be used as an antigen for vaccine development.

models of full-length spike protein from SARS-CoV 2 were remodeled, including the missing C-terminal sequence and receptor binding motif, using the crystal structure of the 2019-nCoV chimeric receptor-binding domain (PDB ID: 6VW1) with MODELLER 9v7 on the Windows operating system [30]. The coordinates for the structurally conserved regions (SCRs) of the SARS-CoV 2 RBD sequence were assigned from the template using pairwise sequence alignment, based on the Needleman-Wunsch algorithm [31] [32]. In addition, BtRsRaTG13-CoV, SARS-CoV, BtRsBeta-CoV and BtRsCoV-related, PCoV, hCoV, MCoV, ACoV, and PEDV1-CoV

Lys 417 -508 of the exposed loop regions of the SARS-CoV 2 receptor binding motif and Ser 19 - Met 81 of the ACE2 receptor were specified in a filter feature, blocking all other residues from involvement in the binding interface with the receptor cavity of ACE2. Finally, ZRANK, a scoring algorithm that relies on the usage of a combination of three atom-based terms, i.e., Van

percentage of identity between the sequences reveals that SARS-CoV 2 has 97%, 92%, 76%, 76%, 75%, 26%, 24%, 21% and 19% identity with BtRsRaTG13-CoV, PCoV, BtRsCoV-related, BtRsBeta-CoV, SARS-CoV, MCoV, hCoV, ACoV and PEDV1-CoV, respectively. This shows that SARS-CoV 2, BtRsRaTG13-CoV and PCoV are very closely related to each other in evolution compared to the others (Fig. 1). Further structural studies show that the RMSD of full-length SARS-CoV 2 against the other species used ranges widely, from 2.6 to 17.2 Å, while the superposed structures of the CoV spike subdomain-ACE2 complexes show the least backbone RMSD difference from their full-length spike protein-ACE2 complexes, within a range of 0.
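The pairwise alignment and percent-identity comparison mentioned in this section can be illustrated with a minimal Python sketch of Needleman-Wunsch global alignment. This is not the authors' pipeline (they report using MODELLER 9v7): the function names, the simple match/mismatch/linear-gap scores, and the choice of alignment length as the identity denominator are all assumptions made for illustration, and real protein alignments normally use a substitution matrix and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of strings a and b (illustrative scoring, not MODELLER's)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a

    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1]); out_b.append(b[j - 1])
                i -= 1; j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


def percent_identity(aligned_a, aligned_b):
    """Identical non-gap columns as a percentage of alignment length (one of several conventions)."""
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


if __name__ == "__main__":
    # Toy sequences, not spike protein data.
    al_a, al_b, total = needleman_wunsch("ACGT", "ACT")
    print(al_a, al_b, total, percent_identity(al_a, al_b))
```

Other identity conventions divide by the shorter sequence length or by the number of aligned (non-gap) columns, so percentages computed this way would not necessarily reproduce the values reported in the text.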
Glu 35 and Thr 78 with ACE2 receptor surface (Fig. 4F). Conversely, the residues of hCoV form hydrogen bonds with N-acetyl-D-

and Gly 449 also shows hydrogen bonds with NAG710 oxygens (Fig. 4G). Similar to hCoV, MCoV also shows a different mode of binding and allows the residues Pro 471, Gly 483, and Thr 487 to form hydrogen bonds and π-interactions at the attachment site with the residues of the ACE2 receptor.

(Table 3). Overall, the critical residues at the receptor binding motif of the spike proteins show

identified with an unidentified cellular receptor [43]. In the same way, MCoV also shows S-mediated attachment to sialosides and entry into human airway epithelial cells [44]. In addition, studies have also shown that coronaviruses that belong to group I, namely human coronavirus-229E

epitopic regions that were predicted in the other five subdomains (Fig. 12A). Also, the epitopic

There is a health and medical emergency to control the rapid, global, and ever-growing SARS-CoV 2 transmission and infection. Since we are at the beginning of understanding the immune responses to the virus, and due to lack of knowledge, we may need to use our previous experience with coronaviruses along with in silico approaches to design vaccines as the ultimate way to protect healthy individuals. In this study, we have comprehensively compared the sequences of the spike protein from 10 different coronaviruses in the context of their interaction with ACE2 to identify the best subdomain of the spike protein to be used for vaccine development. Although a full-length S protein may be a better candidate to induce immunity, a more focused

Third, shorter peptides can be easily scaled up and are less costly to manufacture compared to longer peptides. This is a critical industrial concern when large quantities of vaccine doses are required, as in the current SARS-CoV 2 pandemic.
This is the first in silico study that comprehensively compares the RBD subdomain of the spike protein from ten closely related coronaviruses and their interaction with ACE2. Our

(Fig 2A) with BtRsRaTG13-CoV RBM (Fig 2B), SARS-CoV RBM (Fig 2C), BtRsBeta-CoV RBM (Fig 2D), BtRsCoV-related RBM (Fig 2E), PCoV RBM (Fig 2F), hCoV RBM (Fig 2G), MCoV RBM (Fig 2H), ACoV RBM (Fig 2I) and with PEDV1-CoV RBM (Fig 2J) with its receptor ACE2 predicted using MOE

Origin and evolution of pathogenic coronaviruses
Coronavirus genomics and bioinformatics analysis
Nervous system involvement after infection with COVID-19 and other coronaviruses
Middle East Respiratory Syndrome: Emergence of a Pathogenic Human Coronavirus
SARS and MERS: recent insights into emerging coronaviruses
Isolation and characterization of viruses related to the SARS coronavirus from animals in southern China
Clinical features of patients infected with 2019 novel coronavirus in
A novel coronavirus outbreak of global health concern
A pneumonia outbreak associated with a new coronavirus of probable bat origin
Mechanisms of coronavirus cell entry mediated by the viral spike protein
Structural basis for coronavirus-mediated membrane fusion. Crystal structure of mouse hepatitis virus spike protein fusion core
The coronavirus spike protein is a class I virus fusion protein: structural and functional characterization of the fusion core complex
Human coronavirus NL63 employs the severe acute respiratory syndrome coronavirus receptor for cellular entry
Angiotensin-converting enzyme 2 is a functional receptor for the SARS coronavirus
Expression cloning of functional receptor used by SARS coronavirus
Devil and angel in the renin-angiotensin system: ACE-angiotensin II-AT1 receptor axis vs. ACE2-angiotensin-(1-7)-Mas receptor axis
Angiotensin-converting enzyme 2 and new insights into the renin-angiotensin system
Evidence that TMPRSS2 activates the severe acute respiratory syndrome coronavirus spike protein for membrane fusion and reduces viral control by the humoral immune response
Efficient activation of the severe acute respiratory syndrome coronavirus spike protein by the transmembrane protease TMPRSS2
A transmembrane serine protease is linked to the severe acute respiratory syndrome coronavirus receptor and activates virus entry
SARS-CoV-2 and Coronavirus Disease 2019: What We Know So Far
Emergence of SARS-CoV-2 through Recombination and Strong Purifying Selection
The proximal origin of SARS-CoV-2
Structural basis for the recognition of SARS-CoV-2 by full-length human ACE2
Coronavirus Emerging in China - Key Questions for Impact Assessment
Structure of SARS coronavirus spike receptor-binding domain complexed with receptor
Receptor and viral determinants of SARS-coronavirus adaptation to human ACE2
The race for coronavirus vaccines: a graphical guide
Comparative protein modelling by satisfaction of spatial restraints
A general method applicable to the search for similarities in the amino acid sequence of two proteins
CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice
ZDOCK server: interactive docking prediction of protein-protein complexes and symmetric multimers
Accelerating protein docking in ZDOCK using an advanced 3D convolution library
Integrating statistical pair potentials into protein complex prediction
ZDOCK: an initial-stage protein-docking algorithm
Pushing structural information into the yeast interactome by high-throughput protein docking experiments
ZRANK: reranking protein docking predictions with an optimized energy function
Prediction of continuous B-cell epitopes in an antigen using recurrent neural network
Prediction to Four-Dimensional Description of Antigenic Specificity
Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor
Characterization of the receptor-binding domain (RBD) of 2019 novel coronavirus: implication for development of RBD protein as a viral attachment inhibitor and vaccine
Human Coronavirus HKU1 Spike Protein Uses O-Acetylated Sialic Acid as an Attachment Receptor Determinant and Employs Protein as a Receptor-Destroying Enzyme
Structures of MERS-CoV spike glycoprotein in complex with sialoside attachment receptors
Aminopeptidase N is a major receptor for the entero-pathogenic coronavirus TGEV
Molecular analysis of the coronavirus-receptor function of aminopeptidase N
Porcine aminopeptidase N is a functional receptor for the PEDV coronavirus
Feline aminopeptidase N is a receptor for all group I coronaviruses
A 193-amino acid fragment of the SARS coronavirus S protein efficiently binds angiotensin-converting enzyme 2
The spike protein of SARS-CoV--a target for vaccine and therapeutic development
Evaluation of candidate vaccine approaches for MERS-CoV
Two-way antigenic cross-reactivity between severe acute respiratory syndrome coronavirus (SARS-CoV) and group 1 animal CoVs is mediated through an antigenic site in the N-terminal region of the SARS-CoV nucleoprotein
Structural and Functional Basis of SARS-CoV-2 Entry by Using Human ACE2
Cross-neutralization of SARS coronavirus-specific antibodies against bat SARS-like coronaviruses
Cross-neutralization of SARS-CoV-2 by a human monoclonal SARS-CoV antibody
SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically
Determination and application of immunodominant regions of SARS coronavirus spike and nucleocapsid proteins recognized by sera from different animal species
Lack of cross-neutralization by SARS patient sera towards SARS-CoV-2
CoV-2 and SARS-CoV Infections
Lack of antibody affinity maturation due to poor Toll-like receptor stimulation leads to enhanced respiratory syncytial virus disease
Receptor Recognition by the Novel Coronavirus from Wuhan: an Analysis Based on Decade-Long Structural Studies of SARS

Author's contributions
Nataraj Sekhar Pagadala performed the complete study, processed information
Amir Landi interpreted the results and wrote the
Jack Tuszynski interpreted the results and wrote the

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work

BtRsCoV-related RBD Arg 306 -Pro 575 (E) MCoV RBD Gly 372 -Val 616 (H), ACoV RBD Gly 372 -Val 616 (I) RBD Ala 315 -Tyr 675 (J) with its receptor ACE2 predicted using MOE software suite (Molecular Operating Environment (MOE) Spike protein is represented in maroon ribbons with residues in cyan color while the ACE2 receptor is