key: cord-0864765-idu4j78a
authors: Sohrab, Sayed S.; Azhar, Esam I.
title: Genetic diversity of MERS-CoV spike protein gene in Saudi Arabia
date: 2019-12-09
journal: J Infect Public Health
DOI: 10.1016/j.jiph.2019.11.007
sha: 575eb53a312d827a688f809be60e1eb0bc52d99d
doc_id: 864765
cord_uid: idu4j78a

BACKGROUND: Middle East respiratory syndrome coronavirus (MERS-CoV) was first detected in 2012 and is still causing disease in humans and camels. Camels and bats have been identified as potential sources of the virus for spread to humans. Although significant information on MERS-CoV disease, spread, infection, epidemiology, and clinical features has been published, little information is available on the sequence diversity of the Spike protein gene. The Spike protein plays a significant role in virus attachment to host cells. Recently, information about a recombinant MERS-CoV has been published. This work was therefore designed to identify the emergence of any other recombinant virus in Jeddah, Saudi Arabia.
METHODS: In this study, samples were collected from both humans and camels, and the Spike protein gene was amplified and sequenced. The nucleotide and amino acid sequences of the MERS-CoV Spike protein gene were used to analyze recombination, genetic diversity, and the phylogenetic relationship with selected sequences from Saudi Arabia.
RESULTS: The nucleotide sequence identity ranged from 65.7% to 99.8% among all the samples collected from humans and camels from various locations in the Kingdom. The lowest similarity (65.7%) was observed in samples from Madinah and Dammam. The phylogenetic analysis showed different clusters with multiple isolates from various locations. The sample collected from a human in a Jeddah hospital clustered closely with human samples collected from Buraydah, while the camel sample clustered closely with Hufuf isolates. The phylogenetic tree based on amino acid sequences showed close clustering with Dammam, Makkah and Duba isolates.
Amino acid sequence variations were observed in 28 of 35 samples, and two unique amino acid variations were observed in all samples analyzed, while a total of 19 nucleotide sequence variations were observed in the Spike protein gene. Minor recombination events were identified in eight different sequences at various hotspots in both human and camel samples using the Recombination Detection Program.
CONCLUSION: The information generated in this study is valuable and will be used to design and develop therapeutic compounds and vaccines to control the spread of MERS-CoV disease not only in the Kingdom but also globally.

Coronaviruses (CoVs) belong to the family Coronaviridae. They have a single-stranded, positive-sense RNA genome of ∼26-32 kb [1, 2]. The genomic organization and expression pattern are similar in all coronaviruses. It is well known that multiple CoVs are found naturally and their genetic recombination hap- ... contact with camels, as well as community settings [6] [7] [8]. MERS-CoV causes lower respiratory infections with fever and cough, followed by shortness of breath and organ failure in severe cases with comorbidities [9] [10] [11]. Earlier it was believed that inter-human transmission is limited, but nosocomial infection was reported in healthcare facilities due to inadequate infection control, which resulted in larger MERS-CoV outbreaks and confirmed human-to-human infection [12]. MERS-CoV has been isolated and sequenced from a camel and its infected owner, as well as from an air sample collected from the barn that sheltered the infected camel; the sequences from camel and human were identical, which confirms direct transmission from camel to human [7, 13]. MERS-CoV has been detected in upper and lower respiratory secretions at relatively high viral load and in fecal samples [14].
MERS-CoV, neutralizing antibodies against it, and similar coronaviruses have already been identified in camels and bats from multiple locations: Ghana, Europe, South Africa, Oman, the Canary Islands, UAE, Korea, and Egypt [12, 14, 15]. Recently, a mutant MERS-CoV was identified in South Korea. The mutation was observed in the receptor-binding domain (RBD) of the Spike protein gene [16, 17]. Additionally, MERS-CoV neutralizing antibodies were recently detected in young goats and sheep from Jordan and Egypt [15, 18]. Recently, a large outbreak due to an unusual presentation of MERS-CoV was observed in Riyadh, Saudi Arabia [19]. CoVs have high sequence variation, which favors recombination, mutation, and the emergence of novel and recombinant viruses with variable characteristics and extended host ranges. The emergence from camels of a dominant, recombinant MERS-CoV that caused a human outbreak in 2015 has been reported [3]. Zoonotic introduction was suspected after MERS-CoV was identified [18]. In a recent study from the UAE, the genetic diversity of full-length MERS-CoV genomes from both humans and dromedary camels was analyzed, and very close sequence similarity was observed, which confirms the zoonotic introductions [20, 21]. Additionally, based on analysis of the distribution of human outbreak cluster sizes, it has been demonstrated that the time and season of zoonotic introduction play a significant role in human outbreaks driven by MERS-CoV in the Arabian Peninsula [22]. MERS-CoV is known to be genetically diverse. The Spike gene plays a significant role in host cell attachment and entry of the virus into host cells [23].
The RBD of the Spike protein mediates the interaction of the virus with the host cell by binding dipeptidyl peptidase 4 (DPP4, CD26), the cellular receptor that enables viral entry into the cell. The RBD is immunodominant and induces neutralizing antibodies [24] [25] [26]. Based on the literature, it is well known that the MERS-CoV Spike protein gene isolated from both humans and camels has significant genetic variability. In South Korea, a recent outbreak occurred with a high fatality rate. Spike gene diversity was identified in many samples and showed interspecific variation among MERS-CoV isolates from South Korea [27]. A novel recombinant MERS-CoV has already been identified in Saudi Arabia [28]. In another recent study, a deletion of 530 nucleotides was observed in the Spike gene from serum samples collected in Taif, Saudi Arabia, and this novel genetic variant of MERS-CoV was designated as a quasispecies [29]. Multiple amino acid substitutions were observed in the RBD of the Spike gene from a bat sample collected in Uganda, and recombination was observed in the S1 subunit of the Spike gene; this variation is likely to play an important role in the emergence of MERS-CoV causing disease in humans [30]. Recently, MERS-CoV has been genetically and phenotypically characterized from Africa [31] and South Korea [32]. Nucleotide substitutions and amino acid variation in the Spike gene have significantly affected virus transmission, disease spread to extended hosts, and virus evolution in different geographical locations. Based on the published information, detailed information is lacking about the Spike gene sequence variability of MERS-CoV infecting humans and camels in Saudi Arabia. There is therefore an urgent need to identify the genetic variability of the Spike protein gene in order to establish how the virus moves from infected camels to humans.
Based on the above information, this study was conducted. Detecting MERS-CoV in humans and camels and determining the genetic diversity of the Spike gene will further help researchers and health authorities to design and develop effective disease management and control strategies in the Kingdom of Saudi Arabia.

The MERS-CoV samples were collected from both humans and camels and stored at the Special Infectious Agents Unit (SIAU), King Fahd Medical Research Centre (KFMRC), King Abdulaziz University (KAU), Jeddah, Saudi Arabia. The samples (blood and nasal swabs) were collected from six patients one day after admission to the hospital. All nasal swabs were immersed in viral transport medium and kept in a cold container. After processing, all collected samples were stored for further analysis at the BSL-3 laboratory in SIAU, KFMRC, KAU, Jeddah, Saudi Arabia.

Freshly prepared Vero cells (ATCC CCL-81) were inoculated with 100 µl of nasal swab sample and maintained in complete DMEM following the described protocol [7]. The inoculated cells were incubated at 37 °C with 5% CO2 and monitored for virus infection and development of cytopathic effect. After complete cytopathic effect, the cell-culture supernatant was collected.

The complete sequences of MERS-CoV were retrieved from the National Center for Biotechnology Information (NCBI/PubMed) database. The retrieved sequences were aligned using BioEdit 7.2 software (http://www.mbio.ncsu.edu/BioEdit/bioedit.html). MERS-CoV Spike protein gene-specific primers were designed from the selected sequences to amplify the complete Spike protein gene.

The viral RNA was purified from culture supernatants and nasal swabs with the QIAamp Viral RNA Mini Kit (Qiagen). MERS-CoV was detected by real-time RT-PCR targeting the upE gene and further confirmed with ORF1a primers, as described previously.
Briefly, a 25 µl reaction was set up containing 5 µl of RNA, 12.5 µl of 2× reaction buffer from the SuperScript III One-Step RT-PCR System with Platinum Taq Polymerase, 1 µl of reverse transcriptase/Taq mixture from the kit, 0.4 µl of a 50 mM MgCl2 solution (Invitrogen), 1 µg of non-acetylated bovine serum albumin (Sigma), 400 nM of primers EMC-Orf1a-Fwd and EMC-Orf1a-Rev, as well as 200 nM of probe EMC-Orf1a-Prb (6-carboxyfluorescein (FAM)-TTGCAAATTGGCTTGCCCCCACT-6-carboxy-N,N,N,N-tetramethylrhodamine (TAMRA)). Thermal cycling was performed at 55 °C for 20 min for the RT step, followed by 95 °C for 3 min and then 45 cycles of 95 °C for 15 s and 58 °C for 30 s [7, 46].

The purified viral RNA was reverse transcribed, and the Spike protein gene was amplified by RT-PCR. The amplified product was visualized on a 1% agarose gel. The PCR amplicon was eluted from the gel, purified with a gel purification kit (Qiagen), and sequenced at SIAU on an ABI Veriti thermal cycler (Applied Biosystems) using Spike protein gene-specific primers.

The sequence alignment was performed using BioEdit, version 7.2.5, and the genetic diversity was determined by analyzing the sequence identity matrix with selected MERS-CoV sequences from Saudi Arabia. To explore the phylogenetic relationship between the generated MERS-CoV sequences and the selected sequences, they were analyzed with the MEGA X software and a phylogenetic tree was constructed [33].

To analyze the pattern of recombination among the Spike protein gene sequences, we used the Recombination Detection Program (RDP4, v.4.70) [34]. The multiple sequence alignment file was imported into the RDP software for recombination detection.
The major and minor parents, the possible recombinants, and the recombination breakpoints and hot spots with their start and end positions were identified with the default settings, which include GENECONV, BootScan, MaxChi, Chimaera, SiScan, and 3Seq, to detect putative recombination events in the Spike gene sequence of MERS-CoV. Putative recombination events were considered significant at a P-value cut-off of ≤0.05 with standard Bonferroni correction.

The protein structure was predicted with SWISS-MODEL (https://swissmodel.expasy.org/) using the Spike protein sequences of both human and camel samples together with selected MERS-CoV sequences. Because recombinant MERS-CoV has already been reported, we selected a few recombinant virus sequences to compare their protein structure with that of the samples collected from humans and camels.

The nasal swabs collected from both humans and camels were found positive by RT-PCR. Vero cells inoculated with samples obtained from humans and camels showed a cytopathic effect 3 days after inoculation. The culture supernatants were used to detect the virus by RT-PCR for the upE, ORF1a, and ORF1b regions. The Spike protein gene was amplified using Spike gene-specific primers and visualized on a 1% agarose gel. The full-length Spike protein gene was sequenced bidirectionally from both human and camel samples at SIAU, KFMRC, KAU, Jeddah, and the sequences were tentatively designated sample1390-Hu-Jed and sample-31-Cam-Jed. The generated sequences have been submitted to GenBank under accession numbers MN403101 and MN403102 and were used for diversity analysis with previously published sequences.
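The sequence identity matrix used for the diversity analysis can be illustrated with a minimal sketch. The function names and the toy sequences below are hypothetical, and the gap handling (columns where both sequences have a gap are skipped) is only one common convention; it is an assumption here and may differ from what BioEdit computes.

```python
from itertools import combinations


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two sequences taken from the same alignment.

    Assumption: columns where both sequences carry a gap ('-') are ignored;
    a gap opposite a residue counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" and y == "-":
            continue  # shared gap column carries no information
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(aligned: dict) -> dict:
    """Pairwise percent identity for every pair of named aligned sequences."""
    return {
        (name_a, name_b): percent_identity(aligned[name_a], aligned[name_b])
        for name_a, name_b in combinations(aligned, 2)
    }


if __name__ == "__main__":
    # Toy alignment for illustration only; these are not MERS-CoV sequences.
    toy = {
        "isolate_A": "ATGTTCA-GG",
        "isolate_B": "ATGTTTA-GG",
        "isolate_C": "ATGATCACGG",
    }
    for pair, value in identity_matrix(toy).items():
        print(pair, round(value, 1))
```

The lowest and highest off-diagonal values of such a matrix correspond to the kind of identity range reported for the Spike gene sequences.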
The sample1390-Hu-Jed sequence was compared with other sequences; the highest nucleotide (98.4%) and amino acid (99.8%) similarity was observed with many isolates of MERS-CoV, and the lowest nucleotide (64.7%) and amino acid ...

The phylogenetic analysis was performed with Spike protein gene sequences (nucleotide and amino acid) of selected MERS-CoV sequences deposited in GenBank. The phylogenetic tree based on nucleotide sequences showed various clusters. The sample1390-Hu-Jed clustered closely with isolates from Buraydah (KT806006-2014) and Taif (KR912196), while the sample-31-Cam-Jed clustered most closely with a camel isolate from Hufuf (KFU-HKU-KY706247-2014). Mixed clustering of human and camel isolates was also observed in the phylogenetic relationship (Fig. 1). The phylogenetic relationship was also analyzed using amino acid sequences. The phylogenetic tree showed an almost similar pattern of clustering with the selected virus isolates (Fig. 2).

Amino acid sequence variations were observed at multiple locations. A total of 28 sample sequences showed amino acid substitutions. Interestingly, two common variations were also observed in all 35 samples analyzed as compared with the human and camel samples collected and used in this study (Table 2). The nucleotide and amino acid sequence variations of selected recombinant MERS-CoV were also analyzed in comparison with sample 31-camel-Jeddah. A total of nineteen nucleotide sequence variations were observed scattered throughout the Spike protein gene, with only three amino acid variations in the camel sample and two in the human sample (Fig. 3A, B). Our findings are supported by earlier reports, in which a unique amino acid substitution was observed in the RBD that affects the binding efficiency (Kossyvakis et al., 2015).
Additionally, in another study, only five mutations were detected in consensus sequences, while 473 intrahost single nucleotide variants were identified (Borucki et al., 2016).

The Spike protein gene sequences were used to analyze the pattern of recombination in selected MERS-CoV from Saudi Arabia. Putative recombination events were identified using the Recombination Detection Program (RDP4, v.4.70) with the default settings [34]. Based on the RDP parameters, no significant recombination was observed. However, when the recombination pattern was analyzed using the BootScan, 3Seq, and ... parameters, recombination was observed in eight, seven and one sequences at multiple positions, with average P-values of 6.870 × 10⁻², 8.031 × 10⁻³ and 4.589 × 10⁻², respectively, across all the sequences analyzed (Fig. 4). Recently, a genetic diversity analysis was conducted in the UAE and 10 recombination events were observed in the camel samples. The most frequent recombination breakpoint was the junction between ORF1b and the Spike protein gene [20]. Our data are also supported by the above reports that most of the recombination and sequence diversity has been observed in the Spike protein gene.

The structure of the Spike protein was predicted using the amino acid sequences generated in this study from both human and camel samples and was compared with the protein structure of recently reported recombinant MERS-CoV. The predicted structures of both the recombinant sequences and our sequences from human and camel samples are presented in Fig. 5. During modeling, no significant variations were observed among the sequences, and 99.4% similarity was observed with the available template in the protein database, but 3 amino acid variations were observed in the human sample and 2 variations in the camel sample in this study. Mutation in the RBD of the Spike protein gene affects the interaction with the human receptor, CD26.
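For the recombination analysis described above, the significance rule (P ≤ 0.05 with standard Bonferroni correction) can be sketched as follows. This is a generic illustration of the Bonferroni adjustment, not a reproduction of how RDP4 applies it internally; the function names, the number of comparisons, and the example P-values are hypothetical.

```python
def bonferroni_adjust(p_values, n_tests=None):
    """Standard Bonferroni adjustment: multiply each raw P-value by the
    number of tests and cap the result at 1.0.

    n_tests defaults to the number of P-values supplied (an assumption;
    a program may count comparisons differently).
    """
    m = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * m) for p in p_values]


def is_significant(p_values, alpha=0.05, n_tests=None):
    """Flag each event whose Bonferroni-adjusted P-value is <= alpha."""
    return [p <= alpha for p in bonferroni_adjust(p_values, n_tests)]


if __name__ == "__main__":
    # Illustrative raw P-values for three putative events (not study data).
    raw = [0.01, 0.02, 0.5]
    print(bonferroni_adjust(raw))
    print(is_significant(raw))
```

Under this rule an event whose raw P-value looks small can still fail the cut-off once the number of comparisons is taken into account.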
Similar mutations and amino acid changes have been reported earlier, and the resulting structural changes have been presented [16, 17].

MERS-CoV was identified in Jeddah, Saudi Arabia, in 2012 and causes respiratory disease in humans. This virus has spread to 27 countries. Genetic diversity has been reported among CoVs. MERS-CoV has also been reported to have genetic diversity across the whole genome and especially in the Spike protein gene. Our study describes the genetic diversity of the Spike protein gene isolated from both camel and human samples from Jeddah, Kingdom of Saudi Arabia. The highest sequence identity was observed with previously reported sequences submitted to GenBank. In the phylogenetic trees, our sequences also clustered closely with earlier known viruses from various locations in the Kingdom. The nucleotide and amino acid variations were scattered throughout the full Spike protein gene. Interestingly, two common amino acid changes were observed in all the 35 samples analyzed. The emergence of a recombinant virus in Saudi Arabia [19, 28], a mutant virus in South Korea [16, 17], and other intrahost variants [35] has already been reported. Substitution of three amino acids was observed in our sample collected from a human, and two amino acid variations were observed in the camel sample, as compared with selected recombinant virus sequences from Saudi Arabia. The comparative structure of the Spike protein has been predicted and presented in this study to show how the structural changes appear as compared with the recombinant virus. The effect of a mutation in the Spike protein gene on viral entry into the host cell through CD26 binding requires further detailed study.
These changes may affect the structure of the RBD and its interactions with the cognate human receptor, CD26, as has been reported earlier [17]. The most important step for the emergence of a novel virus is recombination during the life cycle of the virus. Recombination happens among co-circulating CoVs in multiple hosts, which further increases the rate of recombinant virus emergence. The estimated mutation rate of CoVs is moderate to high compared with other known ssRNA viruses. The average substitution rate for CoVs has been reported as ∼10⁻⁴ substitutions per site per year, while the average substitution rate of the S gene in 229E was observed to be ∼3 × 10⁻⁴ per site per year [36], and in SARS-CoV it was estimated to be 0.80-2.38 × 10⁻³ per site per year. In the case of MERS-CoV, the average substitution rate was found to be 1.12 × 10⁻³ per site per year [37]. Recently, the co-circulation of multiple CoV species in camels along with MERS-CoV has been reported from Saudi Arabia, resulting in the emergence of a recombinant virus that was responsible for an outbreak in 2015 [28, 38]. These results show that many CoVs originating from humans and animals are circulating in wild animals, which favors genetic recombination, evolution, and the emergence of recombinant viruses potentially lethal to humans [3]. Considering the variations in the MERS-CoV Spike protein gene reported so far, it is very important to consider the frequent appearance and conservation of RBD alterations in human infections. It is well reported that the interspecies transmission of CoVs is mainly mediated by mutations in the Spike protein gene that confer high affinity toward human receptors [28, 39, 40]. RBD mutations with reduced affinity to human CD26 emerged unexpectedly during the South Korean MERS-CoV outbreak.
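To give a sense of scale for the substitution rates quoted above, the expected number of substitutions is simply rate × number of sites × time. The sketch below applies the reported MERS-CoV average rate to the Spike gene; the function name is hypothetical, the gene length of 4,062 nucleotides is an assumption not stated in this paper, and applying an average rate to the Spike gene alone is itself a simplification.

```python
def expected_substitutions(rate_per_site_per_year, n_sites, years=1.0):
    """Expected substitutions = rate (per site per year) x sites x years."""
    return rate_per_site_per_year * n_sites * years


# Rates quoted in the text (substitutions per site per year).
MERS_COV_RATE = 1.12e-3
SARS_COV_RATE_RANGE = (0.80e-3, 2.38e-3)

# Assumption: approximate length of the MERS-CoV Spike gene in nucleotides.
SPIKE_GENE_LENGTH_NT = 4062

if __name__ == "__main__":
    per_year = expected_substitutions(MERS_COV_RATE, SPIKE_GENE_LENGTH_NT)
    print(f"MERS-CoV Spike, expected substitutions per year: {per_year:.2f}")
    low, high = (
        expected_substitutions(r, SPIKE_GENE_LENGTH_NT) for r in SARS_COV_RATE_RANGE
    )
    print(f"Same length at the SARS-CoV rate range: {low:.2f} to {high:.2f}")
```

Under these assumptions the MERS-CoV rate corresponds to a handful of expected substitutions per year across the gene, which is the same order of magnitude as the nineteen nucleotide variations reported in this study.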
The identification of a recombinant virus in Saudi Arabia [28], as well as the unusual presentation and emergence of MERS-CoV that resulted in an outbreak in Riyadh [19], raises several critical questions. Based on epidemiologic features, zoonotic transmission of MERS-CoV from an animal reservoir through an intermediate animal host has been suggested. Evidence for the dromedary camel as a reservoir of MERS-CoV has already been reported, based on identical sequences obtained from a camel and from a patient in close contact with camel nasal secretions, which indicates direct cross-species transmission without any intermediate host [7, 13]. Indeed, it has been reported that differences in the interfacial residues of the receptors of various mammalian host species are very important for interspecies transmission of CoVs [40]. Sequence diversity and homology play an important role in various functions of virus and host cells, such as cellular fusion and attachment [28, 41]. Viruses are able to cause various types of diseases, including neurological disorders [42, 43]. Human genome sequence diversity due to gene flow from Southeast Asia to other locations also plays an important role in disease transmission and the spread of pathogens from one location to another, as well as in intrahost transmission. It has been shown that the ancient East Asian mtDNA HG-M and ... exhibit the highest nucleotide diversity [44, 45]. However, our data showed no significant amino acid sequence variations in the Spike protein gene. Finally, little information is available about the role of RBD mutations linked with reduced affinity to the host receptor CD26 in the transmissibility of MERS-CoV to humans.

Significant genetic diversity was observed in both the nucleotide and amino acid sequences of the Spike protein gene collected from Saudi Arabia.
The information generated in this study provides insight into the pattern of sequence diversity, which plays an important role in viral disease spread and in the movement of the virus from one host to another. This diversity information will play an important role in the design and development of vaccines and antiviral drugs and compounds against MERS-CoV. Based on the data generated in this study, it is concluded that the genetic diversity of the Spike protein gene plays an important role in the interaction with CD26. Genetic diversity arising from the high frequency of recombination events in CoVs has resulted in the emergence of novel recombinant viruses with unpredictable changes in virulence during human infection. Finally, to elucidate the evolutionary pathway, more detailed mutation analysis of the Spike protein gene needs to be done using more sequences from Saudi Arabia. We should take lessons from the outbreaks caused by MERS-CoV and SARS-CoV and take the necessary measures to combat any further outbreak.

SSS collected and processed samples, designed the study, analyzed the data, and conceived the idea. EIA supervised the research and reviewed the manuscript.

Deanship of Scientific Research (DSR), King Abdulaziz University, Jeddah, Saudi Arabia.
Coronavirus genome structure and replication
Coronavirus pathogenesis and the emerging pathogen severe acute respiratory syndrome coronavirus
Epidemiology, genetic recombination, and pathogenesis of coronaviruses
Isolation of a novel coronavirus from a man with pneumonia in Saudi Arabia
Middle East respiratory syndrome coronavirus in bats, Saudi Arabia
Evidence for camel-to-human transmission of MERS coronavirus
MERS-CoV outbreak in Jeddah - a link to health care facilities
Epidemiological, demographic, and clinical characteristics of 47 cases of Middle East respiratory syndrome coronavirus disease from Saudi Arabia: a descriptive study
Clinical features and virological analysis of a case of Middle East respiratory syndrome coronavirus infection
Transmission of MERS-coronavirus in household contacts
Probable transmission chains of MERS-CoV and the multiple generations of secondary infections in South Korea
Detection of the Middle East respiratory syndrome coronavirus genome in an air sample originating from a camel barn owned by an infected patient
A sensitive and specific antigen detection assay for Middle East respiratory syndrome coronavirus
Cross-sectional surveillance of Middle East respiratory syndrome coronavirus (MERS-CoV) in dromedary camels and other mammals in Egypt
Variations in spike glycoprotein gene of MERS-CoV
Spread of mutant Middle East respiratory syndrome coronavirus with reduced affinity to human CD26 during the South Korean outbreak
Middle East respiratory syndrome coronavirus (MERS-CoV): animal to human interaction
Unusual presentation of Middle East respiratory syndrome coronavirus leading to a large outbreak in Riyadh during 2017
Diversity of Middle East respiratory syndrome coronaviruses in 109 dromedary camels based on full-genome sequencing
Zoonotic origin and transmission of Middle East respiratory syndrome coronavirus in the UAE
MERS-CoV spillover at the camel-human interface
Mechanisms of coronavirus cell entry mediated by the viral Spike protein
Dipeptidyl peptidase 4 is a functional receptor for the emerging human coronavirus-EMC
Structure of MERS-CoV Spike receptor-binding domain complexed with human receptor DPP4
MERS-CoV Spike nanoparticles protect mice from MERS-CoV infection
Variations in spike glycoprotein gene of MERS-CoV
Epidemiology of a novel recombinant MERS-CoV in humans in Saudi Arabia
Spike gene deletion quasispecies in serum of patient with acute MERS-CoV infection
Further evidence for bats as the evolutionary source of Middle East respiratory syndrome coronavirus
MERS-CoV from camels in Africa exhibit region-dependent genetic diversity
Genetic characterization of Middle East respiratory syndrome coronavirus
MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets
RDP4: detection and analysis of recombination patterns in virus genomes
Middle East respiratory syndrome coronavirus intra-host populations are characterized by numerous high frequency variants
Mosaic structure of human coronavirus NL63, one thousand years of evolution
Spread, circulation, and evolution of the Middle East respiratory syndrome coronavirus
Co-circulation of three camel coronavirus species and recombination of MERS-CoVs in Saudi Arabia
Coronavirus diversity, phylogeny and interspecies jumping
Bat-to-human: spike features determining "host jump" of coronaviruses SARS-CoV, MERS-CoV, and beyond
Low levels of HIV-1 envelope-mediated fusion are associated with long-term survival of an infected CCR5-/- patient
ZIKV leads to microcephaly
Inhibition of Neurogenesis by Zika virus infection
Austro Asiatic tribes are original native inhabitants of India
Paleolithic spread of Y-chromosomal lineage of tribes in Eastern and Northeastern India
Assays for laboratory confirmation of novel human coronavirus

None declared.
Not required.