key: cord-0998984-ip38x8bu
authors: Malaiyan, Jeevan; Arumugam, Suresh; Mohan, Kamalraj; Gomathi Radhakrishnan, Gokul
title: An update on the origin of SARS-CoV-2: Despite closest identity, bat (RaTG13) and pangolin derived coronaviruses varied in the critical binding site and O-linked glycan residues
date: 2020-07-14
journal: J Med Virol
DOI: 10.1002/jmv.26261
sha: 453ba2461bfb376912ab7e7002bd49b0648a1e02
doc_id: 998984
cord_uid: ip38x8bu

The initial cases of severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) occurred in Wuhan, China, in December 2019, and by 23 June 2020 the virus had swept the world, with 8 993 659 active cases and 469 587 deaths across 216 countries, areas, or territories. This strongly implies that global transmission occurred before the lockdown of China. However, the initial source and transmission routes of SARS-CoV-2 remain obscure and controversial. Research data suggest that bat (RaTG13) and pangolin-carried CoVs were the proximal source of SARS-CoV-2. In this study, we performed a systematic phylogenetic analysis of the Coronavirinae subfamily together with wild-type human SARS-CoV, MERS-CoV, and SARS-CoV-2 strains. The key residues of the receptor-binding domain (RBD) and O-linked glycan were compared. SARS-CoV-2 strains clustered with RaTG13 (97.41% identity), Pangolin-CoV (92.22% identity), and Bat-SL-CoV (80.36% identity), forming a new clade 2 in lineage B of beta-CoV. Alignment of the RBD residues that contact ACE2 showed that the SARS-CoV-2 strain sequences were 100% identical to each other but varied significantly from RaTG13 and pangolin-CoV. SARS-CoV-2 has a polybasic cleavage site with an inserted sequence of PRRA relative to RaTG13 and only PRR relative to pangolin-CoV. Only serine (Ser) O-linked glycan residues were seen in pangolin-CoV, whereas both threonine (Thr) and serine (Ser) were seen in RaTG13, suggesting that a detailed study of pangolin (Manis javanica) and bat (Rhinolophus affinis) related CoVs is needed.

syndrome coronavirus-2 (SARS-CoV-2) remain obscure.
7, 8 Resolving the mystery of the intermediate host will help to prevent further spread and to develop targeted vaccines and antiviral drugs. Recent studies document that the Rhinolophus affinis bat-CoV (RaTG13) and the Manis javanica pangolin-CoV are proximal to SARS-CoV-2. [9] [10] [11] Here, we aimed to update the origin of SARS-CoV-2 by systematic phylogenetic classification and comparison of spike glycoprotein (S protein) amino acid sequences.

representative viruses of the Coronavirinae subfamily (α-CoV, β-CoV, γ-CoV, and δ-CoV). 12 For phylogenetic analysis, the full-length S protein sequences of SARS-CoV-2 from 11 countries were compared with those of SARS-CoV, MERS-CoV, bat-CoV (RaTG13), pangolin-CoV, bat-SL-CoV, and previously published representative viruses of the Coronavirinae subfamily using the BLAST-EXPLORER program, which uses the neighbor-joining method with 1000 bootstrap replicates. 13 The resulting dendrograms were used to verify previously proposed genera assignments and to identify areas for clarification. Alignments of the RBD and O-linked glycan residue sequences of the SARS-CoV-2 strains, RaTG13, pangolin-CoV, bat-SL-CoV, and SARS-CoV were analyzed with MEGA-10. 14

Efforts to identify the reservoir of human CoV led to the discovery of diverse CoVs that are genetically closely related. For the first time, we constructed an S protein sequence-based phylogenetic tree of all known Coronavirinae subfamily viruses to better understand the clustering of current SARS-CoV-2 strains, and classified the viruses into the genera α-, β-, γ-, and δ-CoV. To cross-check which viruses are proximal to SARS-CoV-2, we compared wild-type human CoV spike protein sequences with those of all CoV species, including the recently documented closest CoVs (RaTG13 and pangolin-CoV) (Figure 1).
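Neighbor-joining trees are built from a matrix of pairwise distances, and bootstrap support comes from rebuilding the tree on alignments whose columns have been resampled with replacement. The following is a minimal Python sketch of those two ingredients only; it is not the BLAST-EXPLORER implementation and does not build a tree. The function names, the gap-handling rule (columns with a gap in either sequence are skipped), and the toy sequences are all assumptions made for illustration and are not data from this study.

```python
import random
from itertools import combinations

# Hypothetical toy alignment (invented sequences, NOT data from this study).
TOY_ALIGNMENT = {
    "virus_A": "MFVFLVLLPLVSSQ",
    "virus_B": "MFVFLVLLPLVSSQ",
    "virus_C": "MFIFLLFL-LTSGS",
}


def p_distance(seq_a, seq_b):
    """Fraction of differing residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def percent_identity(seq_a, seq_b):
    """Percent identity, defined here (an assumption) as 100 * (1 - p-distance)."""
    return 100.0 * (1.0 - p_distance(seq_a, seq_b))


def distance_matrix(alignment):
    """Pairwise p-distances keyed by (name_1, name_2) for every unordered pair."""
    return {
        (n1, n2): p_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(alignment, 2)
    }


def bootstrap_replicates(alignment, n_replicates, seed=None):
    """Yield alignments whose columns are resampled with replacement."""
    rng = random.Random(seed)
    length = len(next(iter(alignment.values())))
    for _ in range(n_replicates):
        columns = [rng.randrange(length) for _ in range(length)]
        yield {
            name: "".join(seq[c] for c in columns)
            for name, seq in alignment.items()
        }
```

In a full analysis, each replicate's distance matrix would be passed to a neighbor-joining routine, and the bootstrap value of a clade would be the fraction of replicate trees in which that clade appears. Identity figures depend on how gaps are counted, so values computed this way need not match those reported by any particular program.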
The S protein sequences of the eleven isolates were nearly identical, with sequence identity above 99.70%, indicative of a very recent emergence into the human population; this is why we selected these 11 isolates rather than the mutated and variant strains being reported globally. The phylogenetic analysis showed that the eleven SARS-CoV-2 isolates were closely clustered to inner joint neighbor Table 1 . Wuhan had 88% to 89% nucleotide identity with bat-SL-CoV (bat-SL-CoVZC45 and bat-SL-CoVZXC21), 79% to 89% nucleotide identity with human SARS-CoV, and was more distant from MERS-CoV (50%). 1, 22, 24, 25 Although the SARS-CoV-2 epidemic was linked to the Wuhan seafood market, Huang et al 26 reported that 14 of a total of 41 patients had no link to the seafood market, and no trace of bats has been found, so the exact place of origin needs to be studied in detail. 26

permissive Vero-E6 cells leads to the loss of this adaptive function. 41

Overall, we demonstrate that the key residues of the RBD (455, 486, 493, 494, 501, and 505) and the polybasic cleavage site vary significantly; these need to be studied in detail for a better understanding of cross-species transmission. A PubMed search showed that only three bat (Rhinolophus affinis) and five pangolin CoV sequences were available; more CoV isolates are needed to verify the origin of RaTG13. Although RaTG13 and pangolin-derived CoVs are very proximal to SARS-CoV-2, the key receptor-binding and O-linked glycan residues vary significantly, except in one Malayan pangolin isolate (PRJNA573298), which has 100% identity. The polybasic cleavage site (PRRA insertion) was absent in RaTG13 and pangolin (PRJNA573298), whereas it is only PRR in other pangolin isolates, with unique amino acid changes within. Thus, animal studies and the isolation of CoVs from pangolin (Manis javanica) and bat (Rhinolophus affinis) are necessary to understand the origin and intermediate transmission of SARS-CoV-2.
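The site-by-site comparison described above (residues at selected RBD positions, and an inserted segment at the cleavage site) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the MEGA-10 workflow used in the study: the function names are hypothetical, positions are 1-based and counted along the ungapped reference sequence, and the toy alignment is invented and far shorter than a real S protein.

```python
def column_of_position(aligned_ref, position):
    """Return the 0-based alignment column of a 1-based ungapped reference position."""
    count = 0
    for column, residue in enumerate(aligned_ref):
        if residue != "-":
            count += 1
            if count == position:
                return column
    raise ValueError(f"reference has no position {position}")


def compare_sites(alignment, ref_name, positions):
    """Map each reference position to the residue every sequence carries there."""
    ref = alignment[ref_name]
    table = {}
    for position in positions:
        column = column_of_position(ref, position)
        table[position] = {name: seq[column] for name, seq in alignment.items()}
    return table


def variable_sites(table):
    """Positions at which the sequences do not all carry the same residue."""
    return [pos for pos, residues in table.items()
            if len(set(residues.values())) > 1]


def inserted_segments(aligned_ref, aligned_other):
    """Runs of reference residues aligned to gaps in the other sequence.

    Returns (start_position, segment) pairs; start_position is 1-based in the
    ungapped reference. Columns where the reference itself has a gap are ignored.
    """
    segments, current, start, position = [], "", 0, 0
    for ref_res, other_res in zip(aligned_ref, aligned_other):
        if ref_res == "-":
            continue
        position += 1
        if other_res == "-":
            if not current:
                start = position
            current += ref_res
        elif current:
            segments.append((start, current))
            current = ""
    if current:
        segments.append((start, current))
    return segments


# Hypothetical toy alignment (invented sequences, NOT data from this study).
TOY = {
    "ref":   "MK-LNPRRARS",
    "other": "MKALN----RS",
}
```

For the toy alignment, `variable_sites(compare_sites(TOY, "ref", [3, 5, 9]))` flags only position 5, and `inserted_segments(TOY["ref"], TOY["other"])` reports a four-residue segment starting at reference position 5. With a real alignment, the positions would be the RBD contact residues listed above, numbered along whichever sequence is chosen as the reference.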
The authors thank the global doctors and scientists who identified the SARS-CoV-2, RaTG13, pangolin-CoV, and related gene sequences.

TABLE 1 3D structural difference found in receptor-binding domain ACE2 contact residues and O-linked glycan residues among SARS-CoV

1. Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
2. Epidemiological identification of a novel pathogen in real time: analysis of the atypical pneumonia outbreak in Wuhan, China, 2019-2020
3. A genomic perspective on the origin and emergence of SARS-CoV-2
4. The origin, transmission and clinical therapies on coronavirus disease 2019 (COVID-19) outbreak - an update on the status
5. A Review of Coronavirus Disease-2019 (COVID-19)
6. WHO Coronavirus Disease
7. The deadly coronaviruses: The 2003 SARS pandemic and the 2020 novel coronavirus epidemic in China
8. Viral metagenomics revealed sendai virus and coronavirus infection of Malayan pangolins (Manis javanica)
9. A pneumonia outbreak associated with a new coronavirus of probable bat origin
10. Probable pangolin origin of SARS-CoV-2 associated with the COVID-19 outbreak
11. The proximal origin of SARS-CoV-2
12. Evolutionary trajectory for the emergence of novel coronavirus SARS-CoV-2
13. BLAST-EXPLORER helps you building datasets for phylogenetic analysis
14. Molecular evolutionary genetics analysis across computing platforms
15. CDD/SPARCLE: functional classification of proteins via subfamily domain architectures
16. CDD/SPARCLE: the conserved domain database in 2020
17. Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor
18. Structure of severe acute respiratory syndrome coronavirus receptor-binding domain complexed with neutralizing antibody
19. Structural basis of receptor recognition by SARS-CoV-2
20. Epidemiology, genetic recombination, and pathogenesis of coronaviruses
21. SARS and other coronaviruses as causes of pneumonia
22. A novel coronavirus from patients with pneumonia in China
23. Characterization and complete genome sequence of a novel coronavirus, coronavirus HKU1, from patients with pneumonia
24. A new coronavirus associated with human respiratory disease in China
25. Genomic characterization of the 2019 novel human-pathogenic coronavirus isolated from a patient with atypical pneumonia after visiting Wuhan
26. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. The Lancet
27. Intraspecies diversity of SARS-like coronaviruses in Rhinolophus sinicus and its implications for the origin of SARS coronaviruses in humans
28. Mystery deepens over animal source of coronavirus
29. Identifying SARS-CoV-2 related coronaviruses in Malayan pangolins
30. Protein structure and sequence reanalysis of 2019-nCoV genome refutes snakes as its intermediate host and the unique similarity between its spike protein insertions and HIV-1
31. Evolutionary history, potential intermediate animal host, and cross-species analyses of SARS-CoV-2
32. Are pangolins the intermediate host of the 2019 novel coronavirus (SARS-CoV-2)?
33. Characterization of the receptor-binding domain (RBD) of 2019 novel coronavirus: implication for development of RBD protein as a viral attachment inhibitor and vaccine
34. Comparison of SARS-CoV-2 spike protein binding to ACE2 receptors from human, pets, farm animals, and putative intermediate hosts
35. Predicting the angiotensin converting enzyme 2 (ACE2) utilizing capability as the receptor of SARS-CoV-2. Microbes Infect
36. Spike protein recognition of mammalian ACE2 predicts the host range and an optimized ACE2 for SARS-CoV-2 infection
37. Composition and divergence of coronavirus spike proteins and host ACE2 receptors predict potential intermediate hosts of SARS-CoV-2
38. In silico studies on the comparative characterization of the interactions of SARS-CoV-2 spike glycoprotein with ACE-2 receptor homologs and human TLRs
39. Emergence of SARS-CoV-2 through recombination and strong purifying selection
40. Bioinformatic analysis indicates that SARS-CoV-2 is unrelated to known artificial coronaviruses
41. Attenuated SARS-CoV-2 variants with deletions at the S1/S2 junction. Emerging Microbes & Infections

The authors declare that there are no conflict of interests.

JM contributed to the conceptualization, study design, critical review of the content and approved the final version of the manuscript. SA contributed to study design, data analysis, and approved the final version of the manuscript. KM contributed to data analysis and approved the final version of the manuscript. GGR contributed to data analysis and approved the final version of the manuscript.

http://orcid.org/0000-0001-6466-308X
Suresh Arumugam http://orcid.org/0000-0001-6247-1156