key: cord-0884797-zagb0fwf authors: Hernández‐Huerta, María Teresa; Pérez‐Campos Mayoral, Laura; Romero Díaz, Carlos; Martínez Cruz, Margarito; Mayoral‐Andrade, Gabriel; Sánchez Navarro, Luis Manuel; Pina‐Canseco, María Del Socorro; Cruz Parada, Eli; Martínez Cruz, Ruth; Pérez‐Campos Mayoral, Eduardo; Pérez Santiago, Alma Dolores; Vásquez Martínez, Gabriela; Pérez‐Campos, Eduardo; Matias‐Cervantes, Carlos Alberto title: Analysis of SARS‐CoV‐2 mutations in Mexico, Belize, and isolated regions of Guatemala and its implication in the diagnosis date: 2020-11-01 journal: J Med Virol DOI: 10.1002/jmv.26591 sha: 66868bb77db96853a887f58eccee5f873f12a3ca doc_id: 884797 cord_uid: zagb0fwf
Genomic sequences of severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2) from around the world are publicly available, and their number has grown with the increase in the number of cases. The study of mutations is important because of their possible relation to the virulence and diagnosis of SARS‐CoV‐2. We aimed to identify circulating mutations present in SARS‐CoV‐2 genomic sequences from Mexico, Belize, and Guatemala, to find out whether the same strain spread to the south, and to analyze the specificity of the primers used for diagnosis in these samples. Twenty-three complete SARS‐CoV‐2 genomic sequences, available in the GISAID database from May 8 to September 11, 2020, were analyzed and aligned against the genomic sequence reported in Wuhan, China (NC_045512.2), using Clustal Omega. Open reading frames were translated using the ExPASy Translate Tool, and UCSF Chimera (v.1.12) was used for the analysis of amino acid substitutions. Finally, the sequences were aligned against the primers used in the diagnosis of COVID‐19. One hundred and eighty-seven distinct variants were identified, of which 102 are missense, 66 synonymous, and 19 noncoding. The P4715L and P5828L substitutions in the replicase polyprotein were found, as well as D614G in the spike protein and L84S in ORF8, in Mexico, Belize, and Guatemala.
The primers designed by the CDC of the United States showed a positive E value. The genomic sequences of SARS‐CoV‐2 in Mexico, Belize, and Guatemala present similar mutations related to a virulent strain of greater infectivity, which could mean a greater capacity for inclusion in the host genome and be related to an increased spread of the virus in these countries; furthermore, its diagnosis would be affected.
The first cases of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) were reported in December 2019 in Wuhan city, Hubei province, China, 1, 2 thus initiating the coronavirus pandemic. 3 According to the information released by scientists around the world and the GISAID consortium, as of September 15, 2020, SARS-CoV-2 had caused 29,445,572 cases and 931,454 deaths worldwide. 4 According to predictions, 5 the total number of deaths will increase to 2,778,330 by January 1, 2021. To identify how the virus spread, cross-sectional studies with phylogenetic analysis and markers that identified mutations were implemented. 6 Epidemiological reports indicate that the first cases in Mexico arrived from the East, particularly from the United States, Spain, France, Germany, Singapore, and especially from Bergamo, Italy. 7, 8 In addition, we think that the virus dispersed from Mexico to Belize and Guatemala and that the strains could therefore share the same molecular characteristics; for this reason, we included these three countries in our study.
East respiratory syndrome coronavirus. 9 Its structure contains a single-stranded RNA (ssRNA) genome with a length of 29,903 bp. It comprises a 5ʹ-untranslated region (5ʹ-UTR), a conserved replicase domain (ORF1ab) cleaved into 16 nonstructural proteins (NSPs) that participate in virus transcription and genome replication, four structural proteins (S, E, M, and N), several accessory proteins (ORF3a, ORF6, ORF7a, ORF7b, ORF8, and ORF10), and a 3ʹ-UTR 1, 10, 11 (Table 1) that is highly conserved among other coronaviruses. 12, 13
The ORF1ab gene encodes replicase polyprotein 1ab (pp1ab), which is made up of the NSPs (NSP1, NSP2…NSP16). Of these, NSP12 corresponds to the RNA-dependent RNA polymerase (RdRp) and consists of 932 amino acids (residues 4393-5324). The spike (S) protein has been described as responsible for the interaction with the human receptor angiotensin-converting enzyme 2 (hACE2). 14 It consists of two domains: the S1 domain, responsible for binding, and the S2 domain, which mediates the fusion of the viral and cellular membranes. 15 Moreover, S1 shows variation, whereas S2 is highly conserved. 16 Nonsynonymous substitutions, which change the protein sequence, have been reported in the functional domains of ORF3a in SARS-CoV-2. 17 Issa et al. 17 reported that these substitutions are related to virulence, infectivity, ion channel formation, and virus release. ORF3a mutations have also been found in other countries, such as India. 18
We analyze and identify the characteristics of circulating SARS-CoV-2 mutations present in genomic sequences from Mexico, Belize, and Guatemala to find out whether they have the same molecular characteristics; we also evaluate how these mutations affect the primer
The results indicated the presence of similar mutations in ORF1ab, S (S1, S2 or S2'), ORF3a, ORF7a, ORF8, and N, as well as in the noncoding (5ʹ-UTR and 3ʹ-UTR) and intergenic regions (between the ORF3a and E genes) in strains from Mexico, Belize, and Guatemala. We also found that primers from the US Centers for Disease Control and Prevention (CDC) could present low specificity.
We analyzed 457 SARS-CoV-2 genomic sequences from Mexico, Belize, (Figure 2A). Also, Figure 2B shows an overall view of the trimeric spike
Figure 3 illustrates the spatial distribution of the L84S mutation along with the R48, G50, L57, P56, Q72, and Y73 residues, which could be a glycerol binding site. S97 and L98 could form a region that binds an Hg+ ion; likewise, ORF8 could be related to pathogenesis. 14, 27
The in silico analysis covered the primers used in RT-qPCR for the detection of SARS-CoV-2 (Table 5). The results reported by Yan et al. 28 and Udugama et al. 29 show that most of the primers give a product using Primer3Plus and, similarly, in multiple alignments (Table 5). This indicates a high sensitivity of these primers, however,
reported that C>T was the most frequent mutation observed and also found the C8782T (S2839) mutation in the ORF1ab gene and T28144C (L84S) in the ORF8 gene. We did not identify the C29095T (F274) mutation in the N gene. 39 Moreover, current evidence suggests that the mutation of an aspartate (D) at position 614 to glycine (G) in the spike protein is possibly related to increased infectivity, 41 and may also give rise to a more pathogenic strain. The D614G mutation alters fusion with the cell membrane, and the data reveal that it is located in a highly glycosylated region; it also allows the identification of two viral clades. 39, 42 The aspartate strain has been found in cases reported on the West Coast of the United States, while the glycine strain has been reported on the East Coast. 43
We found a mutation in the noncoding 5ʹ-UTR region (C241T). This type of mutation in the UTRs of SARS-CoV-2 has been studied recently, and the findings suggest that C241T in the 5ʹ-UTR appeared early during the outbreak and could be key in virus replication and RNA folding, 46 affecting stem-loop 5b (SL5b) 47, 48 and the host defense. 49 The intergenic mutation A29700G, located between the ORF3a and E genes, might emerge through adenosine deaminase acting on RNA (ADAR) and could be important in the antiviral response, 28, 35, 50 reducing the stability of the RNA fold. 29
Activation of the SARS-CoV-2 spike protein via sequential proteolytic cleavage can occur at two distinct sites.
For many CoVs, the spike protein is cleaved at the boundary between the S1 and S2 subunits (residues 685 and 686), which remain noncovalently bound in the prefusion conformation, while for all CoVs, the spike is further cleaved by host proteases at the so-called S2' site located immediately upstream of the fusion peptide (residues 788-806). 51 Also, the receptor-binding domain (RBD) comprises residues 333-527 and belongs to the region that attaches to hACE2; it contains a highly conserved cryptic epitope shared by the receptor-binding domains of SARS-CoV-2 and SARS-CoV. 52 As can be seen, the D614G mutation does not lie within any of these known important regions, but it has recently been associated with a high prevalence, rising from <1% in January to 69% in March. The global spread of the SARS-CoV-2 subtype with the spike protein D614G mutation is shaped by human genomic variations that regulate the expression of the TMPRSS2 and MX1 genes, although the mechanism by which this occurs is not yet clear. 53
The ORF8 protein is 121 residues long, and very little is known about its function. Nevertheless, Zhang et al. 22 have proposed a 3D model along with its binding sites. The L84S mutation in genomic sequences from Mexico may indicate that the circulating strain shows a different characteristic, like the Wuhan strain. However, the number of analyzed samples is a limitation.
Due to several reports of low sensitivity in the RT-PCR test, which is not considered the gold standard for the diagnosis of COVID-19, 54, 55 we performed a predictive evaluation of the sensitivity of the primers used (Table 5). The N gene has a high degree of conservation in coronaviruses; 56 however, in our study, the N gene had the third-highest number of mutations, after the ORF1ab and S genes. As time passes, mutations in the genomic sequences of SARS-CoV-2 could appear in the highly conserved regions, and the effectiveness of the diagnostic methods could be compromised.
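The idea behind a predictive primer evaluation can be illustrated with a minimal sketch: slide a primer along a genome fragment and report the window with the fewest mismatches, so that a mutation under the primer shows up as a nonzero mismatch count. This is only an illustration under stated assumptions, not the procedure used in this study (which relied on Primer3Plus and multiple alignments). The sequences `toy_genome`, `toy_mutated`, and `toy_primer` and all function names are hypothetical and invented for the example; degenerate bases, gaps, and melting temperature are not handled.

```python
from typing import Tuple

# Complement table for unambiguous DNA bases only (assumption: no IUPAC codes).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, e.g. to test a reverse primer on the plus strand."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_mismatches(primer: str, window: str) -> int:
    """Count positions at which the primer and an equal-length window differ."""
    if len(primer) != len(window):
        raise ValueError("primer and window must have the same length")
    return sum(1 for p, w in zip(primer.upper(), window.upper()) if p != w)


def best_primer_site(primer: str, genome: str) -> Tuple[int, int]:
    """Return (0-based start, mismatches) of the best-matching ungapped window.

    Ties are resolved in favour of the leftmost window.
    """
    n = len(primer)
    if n == 0 or n > len(genome):
        raise ValueError("primer must be non-empty and no longer than the genome")
    scored = (
        (count_mismatches(primer, genome[i:i + n]), i)
        for i in range(len(genome) - n + 1)
    )
    mismatches, start = min(scored)
    return start, mismatches


# Hypothetical toy data: the second fragment carries a single C>T change
# inside the primer-binding site.
toy_genome = "ACGTTGCAAGGCTTACCGATGCA"
toy_mutated = "ACGTTGCAAGGTTTACCGATGCA"
toy_primer = "GGCTTACC"

print(best_primer_site(toy_primer, toy_genome))   # (9, 0): perfect match
print(best_primer_site(toy_primer, toy_mutated))  # (9, 1): one mismatch under the primer
```

In practice, the position of a mismatch matters as much as the count: a mismatch near the 3ʹ end of a primer is generally more disruptive than one near the 5ʹ end, which this sketch does not weigh.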
Factors such as correct sampling, conservation and transport of the sample, extraction, 61 quality and integrity of RNA, 37 calibration of the thermocycler, and optimal amplification conditions may influence the results. Likewise, the design of primers in conserved regions is essential, and experimental studies are required for a wider understanding. Finally, several questions related to the mutations remain; a very important one is whether these mutations are related to the observed case-fatality rate. As of September 13, 2020, Mexico had a very high case-fatality rate of 10.6%, compared with 1.3% in Belize and 3.6% in Guatemala, in addition to a high number of people with comorbidities. 62
A new coronavirus associated with human respiratory disease in China
A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster
A novel coronavirus from patients with pneumonia in China
disease and diplomacy: GISAID's innovative contribution to global health
Institute for Health Metrics and Evaluation. COVID-19 projections. University of Washington
Molecular epidemiology of infectious diseases. Electron Physician
El nuevo coronavirus que llegó de Oriente: análisis de la epidemia inicial en México
Full genome sequence of the first SARS-CoV-2 detected in Mexico
Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
Origin and evolution of pathogenic coronaviruses
Genomic characterization of the 2019 novel human-pathogenic coronavirus isolated from a patient with atypical pneumonia after visiting Wuhan
Accessory proteins of SARS-CoV and other coronaviruses
Programmed ribosomal frameshifting in decoding the SARS-CoV genome
De novo design of protein peptides to block association of the SARS-CoV-2 spike protein with human ACE2
Receptor-binding domain of SARS-CoV spike protein induces highly potent neutralizing antibodies: implication for developing subunit vaccine
Exploring the genomic and proteomic variations of SARS-CoV-2 spike glycoprotein: a computational biology approach
SARS-CoV-2 and ORF3a: nonsynonymous mutations, functional domains, and viral pathogenesis. mSystems
Molecular conservation and differential mutation on ORF3a gene in Indian SARS-CoV2 genomes
Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation
SARS-Cov-2 RNA-dependent RNA polymerase in complex with cofactors
Detection of 2019 novel coronavirus (2019-nCoV) by real-time RT-PCR
Development and evaluation of an efficient 3'-noncoding region based SARS coronavirus (SARS-CoV) RT-PCR assay for detection of SARS-CoV infections
A melting curve-based multiplex RT-qPCR assay for simultaneous detection of four human coronaviruses
Improved TaqMan real-time assays for detecting hepatitis A virus
Extended ORF8 gene region is valuable in the epidemiological investigation of severe acute respiratory syndrome-similar coronavirus
Laboratory testing of SARS-CoV, MERS-CoV, and SARS-CoV-2 (2019-nCoV): current status, challenges, and countermeasures
Diagnosing COVID-19: the disease and tools for detection
CDC 2019-Novel coronavirus (2019-nCoV) real-time RT-PCR diagnostic panel. Division of Viral Diseases
Specific primers and probes for detection 2019 novel coronavirus. China National Institute For Viral Disease Control and Prevention
Detection of 2019 novel coronavirus (2019-nCoV) in suspected human cases by RT-PCR. School of Public Health
Detection of second case of 2019-nCoV infection in Japan
Diagnostic detection of novel coronavirus 2019 by real time RT-PCR. Department of Medical Sciences
Return of the coronavirus: 2019-nCoV
Evidence for the selective basis of transition-to-transversion substitution bias in two RNA viruses
Variant analysis of COVID-19 genomes
Mutation landscape of SARS-CoV-2 reveals three mutually exclusive clusters of leading and trailing single nucleotide substitutions
Protein-ligand binding site recognition using complementary binding-specific substructure comparison and sequence profile alignment
Genomic characterization of a novel SARS-CoV-2
Making sense of mutation: what D614G means for the COVID-19 pandemic remains unclear
SARS-CoV-2 viral spike G614 mutation exhibits higher case fatality rate
Distinct viral clades of SARS-CoV-2: implications for modeling of viral spread
A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology
Genomic analysis of early SARS-CoV-2 variants introduced in Mexico
RNA genome conservation and secondary structure in SARS-CoV-2 and SARS-related viruses: a first look
Protein structure and sequence reanalysis of 2019-nCoV genome refutes snakes as its intermediate host and the unique similarity between its spike protein insertions and HIV-1
An in silico map of the SARS-CoV-2 RNA structurome
Analysis of rapidly emerging variants in structured regions of the SARS-CoV-2 genome
ADARs and the balance game between virus infection and innate immune cell response
Should RT-PCR be considered a gold standard in the diagnosis of COVID-19?
A highly conserved cryptic epitope in the receptor binding domains of SARS-CoV-2 and SARS-CoV
Global spread of SARS-CoV-2 subtype with spike protein mutation D614G is shaped by human genomic variations that regulate expression of TMPRSS2 and MX1 genes
Activation of the SARS coronavirus spike protein via sequential proteolytic cleavage at two distinct sites
Fact sheet: comparison of national RT-PCR primers, probes, and protocols for SARS-CoV-2 diagnostics
Analytical sensitivity and efficiency comparisons of SARS-CoV-2 RT-qPCR primer-probe sets
Non-specific primers reveal false-negative risk in detection of COVID-19 infections
Chest CT for typical coronavirus disease 2019 (COVID-19) pneumonia: relationship to negative RT-PCR testing
Recent progress on the diagnosis of 2019 novel coronavirus
Extraction-free COVID-19 (SARS-CoV-2) diagnosis by RT-PCR to increase capacity for national testing programme during a pandemic
The authors acknowledge all the people and laboratories responsible for the sequencing and submitting to the GISAID and NCBI databases for public consultation. Without them, this study could not have been