key: cord-1039266-75of9ys6 authors: Shah, Abdullah; Rashid, Farooq; Aziz, Abdul; Jan, Amin Ullah; Suleman, Muhammad title: Genetic characterization of structural and open reading Fram-8 proteins of SARS-CoV-2 isolates from different countries date: 2020-09-14 journal: Gene Rep DOI: 10.1016/j.genrep.2020.100886 sha: ce90ba5219a2a82eb3f41193d1854c140ac8ec31 doc_id: 1039266 cord_uid: 75of9ys6
Since December 2019, a severe pandemic of pneumonia, COVID-19, associated with a novel coronavirus (SARS-CoV-2), has emerged in Wuhan, China and is spreading throughout the world. Because RNA viruses have a high mutation rate, we wanted to determine whether this virus is also prone to mutations. We therefore selected the four major structural proteins (spike protein (S), envelope protein (E), membrane glycoprotein (M) and nucleocapsid phosphoprotein (N)) and the ORF8 protein of 100 different SARS-CoV-2 isolates from fifteen countries in the NCBI database and compared them to the reference sequence, Wuhan NC_045512.2, which was the first SARS-CoV-2 isolate to be sequenced. By multiple sequence alignment of amino acids, we observed substitutions and deletion in the S protein at 13 different sites in isolates from five countries (China, USA, Finland, India and Australia) relative to the reference sequence. Similarly, alignment of the N protein revealed substitutions at three different sites in isolates from China, Spain and Japan. The M protein exhibited a substitution in only one isolate, from the USA, whereas no mutation was observed in the E protein of any isolate. Interestingly, in ORF8 a substitution of leucine, a nonpolar amino acid, by serine, a polar amino acid, at the same position (aa84 L to S) was observed in 23 isolates from five countries (China, USA, Spain, Taiwan and India), which may affect the conformation of the peptide.
Thus, we observed several mutations in isolates sequenced after the first SARS-CoV-2 isolate, NC_045512.2, which suggests that this virus might be a threat to the whole world; further studies are therefore needed to characterize how these mutations in different proteins affect the functionality and pathogenesis of SARS-CoV-2.
Another outbreak was witnessed in 2012 in the form of MERS, a severe respiratory disease in the Middle East, and since December 2019 the COVID-19 outbreak, which began in China, has taken the world by storm [11, 12]. It is believed that SARS-CoV and MERS-CoV were transmitted from bats to palm civets or dromedary camels, respectively, and finally to humans; both are considered highly pathogenic [13] [14] [15]. The genome of SARS-CoV-2 has been sequenced. It is ~29.8 kb long and contains 14 ORFs that encode 27 proteins. Two genes, ORF1ab and ORF1a, located at the 5′-terminus of the genome, encode two long polypeptides, pp1ab and pp1a, respectively. These two polypeptides are proteolytically processed into 15 nonstructural proteins (nsps), namely nsp1 to nsp10 and nsp12 to nsp16. The four structural proteins constituting the main envelope of the virus and the eight accessory proteins are encoded by genes located at the 3′-terminus. The four structural proteins are the spike surface glycoprotein (S), the small envelope protein (E), the membrane protein (M) and the nucleocapsid protein (N). The eight accessory proteins are 3a, 3b, p6, 7a, 7b, 8b, 9b and ORF14 [16]. Binding of the virus to receptors on the host cell and fusion with the cell membrane are mediated by the spike surface glycoprotein [9, 17], while the N protein interacts with the viral RNA to form the ribonucleoprotein. The E protein, together with the M protein, helps in virion assembly.
The E protein is also known to have ion channel activity [17] [18] [19] [20]. The ORF8 gene encodes 121 amino acid residues [18].
The analysis of the N protein from different isolates revealed three different amino acid substitutions at three positions. Three Spanish isolates have the substitution S to L at position aa197, one Chinese isolate has H to Q at aa289, and one Japanese and one Chinese isolate each have P to S at aa344 (Fig. 2). Likewise, the present study identified a mutation in the M protein of only one isolate, from the USA (MT163721) (Fig. 3). However, no mutation was found in the envelope protein, which is supported by previous studies indicating that the E protein of SARS-CoV-2 is relatively conserved [25]. Interestingly, in ORF8 we found a recurring substitution of leucine, a nonpolar amino acid, by serine, a polar amino acid, at the same position (aa84 L to S) in 23 different isolates as compared to the reference sequence (NC_045512.2) (Fig. 4). The observed mutations in these isolates, especially L to S, could theoretically create a novel phosphorylation target for the serine/threonine kinases of the mammalian host and thus affect the conformation of the peptide. The results of the present study suggest that this virus is prone to mutations, like most other RNA viruses, and that with the passage of time it may become a greater potential threat to the world. Thus, further studies are required to characterize how these amino acid substitutions in different proteins affect the functionality and pathogenesis of SARS-CoV-2.
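The comparison described above, in which each isolate's aligned protein sequence is read against the reference and differences are reported in reference coordinates (for example aa84 L to S), can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' pipeline: the function names, the coarse side-chain grouping and the toy sequences are all assumptions made for the example, and the rows are assumed to come from an existing multiple sequence alignment with `-` marking gaps.

```python
from typing import Dict, List

# Coarse textbook side-chain grouping (an assumption for illustration only).
NONPOLAR = set("GAVLIMFWP")
POLAR = set("STCYNQ")
POSITIVE = set("KRH")
NEGATIVE = set("DE")


def side_chain_class(aa: str) -> str:
    """Return a coarse side-chain class for a one-letter amino acid code."""
    for name, group in (("nonpolar", NONPOLAR), ("polar", POLAR),
                        ("positive", POSITIVE), ("negative", NEGATIVE)):
        if aa in group:
            return name
    return "unknown"


def call_variants(ref_row: str, iso_row: str) -> List[Dict[str, object]]:
    """Compare two rows of a protein alignment.

    Positions are 1-based and counted along the reference, so gap columns
    in the reference row do not advance the position counter.
    """
    if len(ref_row) != len(iso_row):
        raise ValueError("alignment rows must have equal length")
    variants: List[Dict[str, object]] = []
    pos = 0
    for ref_aa, iso_aa in zip(ref_row.upper(), iso_row.upper()):
        if ref_aa == "-":
            continue  # insertion relative to the reference; ignored in this sketch
        pos += 1
        if iso_aa == ref_aa or iso_aa == "X":
            continue  # identical residue, or unresolved residue in the isolate
        if iso_aa == "-":
            variants.append({"label": f"{ref_aa}{pos}del", "kind": "deletion"})
        else:
            variants.append({
                "label": f"{ref_aa}{pos}{iso_aa}",
                "kind": "substitution",
                "change": f"{side_chain_class(ref_aa)} -> {side_chain_class(iso_aa)}",
                # True when the change introduces a Ser/Thr, i.e. a residue
                # that Ser/Thr kinases could in principle phosphorylate.
                "gains_ser_thr": iso_aa in "ST" and ref_aa not in "ST",
            })
    return variants


# Toy alignment rows (made up for the example, not real SARS-CoV-2 sequence).
reference = "MKFLVFLG"
isolate = "MKFSVF-G"
for variant in call_variants(reference, isolate):
    print(variant)
```

On real data the two rows would be taken from the alignment of an isolate against NC_045512.2; a leucine to serine change would then be reported as a nonpolar to polar substitution that gains a Ser/Thr residue, which is the property the discussion above highlights for ORF8.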
Moreover, a
A Novel Coronavirus from Patients with Pneumonia in China
WHO (2020) Coronavirus Disease
Clinical features of patients infected with 2019 novel coronavirus in Wuhan
Genomic characterization and epidemiology of 2019 novel coronavirus: implications of virus origins and receptor binding
Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study
Clinical characteristics of 2019 novel coronavirus infection in China
Coronavirus pathogenesis
Middle East respiratory syndrome coronavirus: another zoonotic betacoronavirus causing SARS-like disease
Origin and evolution of pathogenic coronaviruses
Isolation and characterization of viruses related to the SARS coronavirus from animals in southern China
Evidence for camel-to-human transmission of MERS coronavirus
A new coronavirus associated with human respiratory disease in China
Molecular dynamics of Middle East Respiratory Syndrome Coronavirus (MERS CoV) fusion heptad repeat trimers
Genome Composition and Divergence of the Novel Coronavirus (2019-nCoV) Originating in China
Genomic characterization of the 2019 novel human-pathogenic coronavirus isolated from a patient with atypical pneumonia after visiting Wuhan
Genomic characterisation and epidemiology of