key: cord-0786562-jg0in4q1 authors: Priyadarshi, Himanshu; Das, Rekha title: Complexities in viral replication strategies as a potential explanation for prevalence of asymptomatic carriers in Covid-19 infections: analytical observation on SARS-Cov2 genome characteristics date: 2021-06-10 journal: Theory Biosci DOI: 10.1007/s12064-021-00349-3 sha: d7d74f72abd12e5a0421cc4bf7926c13422bb13c doc_id: 786562 cord_uid: jg0in4q1
Analytical observations (in silico) indicate molecular features of the SARS-Cov2 genome that potentially explain the high prevalence of asymptomatic cases in the Covid-19 pandemic. We observed that the virus maintains a low preference for the ‘GGG’ codon for glycine (3%) in its genome. We also observed multiple putative introns of 26–44 nucleotide (nt) length in the genomic region between the coding regions of Nsp10 and RPol in the viral ORF1ab, as in several other beta-coronaviruses of similar infectivity levels. It appears that the virus employs a dual strategy to ensure unhindered replication within the host. One strategy employs a −1 frameshift translation event through programmed ribosomal slippage at the ribosomal slippage site in ORF1ab. The alternate strategy relies on intron excision to generate a read-through frame. The presence of ‘GGG’ in this conserved ribosomal slippage site ensures adequate tRNA in the cytoplasm to match the codon, implying no additional frameshift translation due to ribosomal stalling. With fewer replication events, the viral load remains low, resulting in asymptomatic cases. We suggest that this strategy is the primary reason for the prevalence of asymptomatic cases in the disease, enabling the virus to spread rapidly. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s12064-021-00349-3.
The Covid-19 pandemic caused by SARS-Cov2, which has already infected 151 million people and claimed more than 3.1 million lives across the world in a span of 15 months, is undeniably among the greatest crises facing mankind this century. This novel coronavirus continues to challenge the health care and administrative systems of countries worldwide with its high rate of infectivity (spread) (Aguilar et al. 2020; He et al. 2020; Petersen et al. 2020). As is typical of influenza virus infections, the majority of fatalities in SARS-Cov2 infections are observed in people of the higher age group (over 65 years) with weakened immune systems. Among persons under 21 years of age, fatality was higher in individuals with preexisting medical conditions as well as in very young children (Bixler et al. 2020). Similar to influenza viruses such as human influenza A (H5N1), SARS-CoV2 infection also induces a cytokine storm in the host, which may cause severe acute respiratory syndrome, multiple organ failure and death (Ratajczak and Kucia 2020; Song et al. 2020). For influenza viruses, these inflammatory responses are typical of infections with heavy viral load in the hosts (Boon et al. 2011; De Jong et al. 2006). Interestingly, a large proportion of the individuals infected with SARS-Cov2 are asymptomatic, harboring relatively lower viral loads (Zhou et al. 2020) while still being capable of spreading the infection. It is the strong prevalence of such asymptomatic carriers that makes containment measures difficult in the Covid-19 pandemic (Yu and Yang 2020). In a study of people evacuated from China to Japan, the asymptomatic ratio was nearly 30% (Nishiura et al. 2020). Though the SARS-Cov2 genome sequence, its mutants and their potential impact on disease management have been investigated (Leung et al. 2020; Li et al. 2021; Starr et al. 2021), the molecular mechanism behind the prevalence of low viral load and asymptomatic cases is largely unexplored.
Here we attempted an in silico dissection of the molecular peculiarities of the SARS-Cov2 viral genome using bioinformatic tools to develop a theoretical hypothesis behind the prevalence of asymptomatic cases in Covid-19.
In silico analysis of the molecular architecture of ORF1ab was carried out for 27 coronaviruses including Middle East respiratory syndrome coronavirus, SARS coronavirus and multiple novel coronavirus isolates (Table 1). The ORF1ab of each coronavirus was submitted to the Simple Modular Architecture Research Tool (SMART) for identification and annotation of protein domains and architectures (Letunic et al. 2015). The online web server of the Sequence Manipulation Suite (https://www.bioinformatics.org/sms2/codon_usage.html) was used to estimate codon usage frequency for each amino acid in each coronavirus genome. Putative introns between ORF1a and ORF1b were identified based on the standard GT-AG rule and the presence of a branch site (Wu and Krainer 1996). Generunner software (http://www.generunner.net/) was used for in silico sequence analyses, namely to check the translation frame, identify putative introns and find their length. Excision of the identified putative introns and verification of the reading frame after rejoining ORF1a and ORF1b were also performed in silico using Generunner. Protein Homology/analogy Recognition Engine V 2.0 (Phyre2) is a free web-based service for protein structure modeling, prediction and analysis (Kelley et al. 2015). The in silico protein sequence derived from ORF1ab of SARS-Cov2 was submitted to Phyre2 for identification of putative enzymes for RNA splicing encoded in the genome.
This study attempted an in silico exploration of the novel coronavirus genomic features underlying the high prevalence of asymptomatic carriers. A basic feature of the ORF1ab of coronaviruses appears to be the presence of a conserved ribosomal slippage site.
Closer examination also reveals that the ribosomal slippage junction of all the studied coronaviruses consistently features a 'GGG' codon (Table 1). Although this GGG codon at the ribosomal slippage site lies within the correct frame, the translating machinery reading through it would invariably encounter a premature termination codon (PTC). In other words, reading through the 'GGG' at the ribosomal slippage site disrupts the translation of key proteins such as viral RNA polymerase (RPol), RNA-dependent RNA polymerase (RdRP), helicase, non-structural protein 11 (Nsp11) and Nsp13 located downstream of this junction in ORF1b (Fig. 1). In addition, while introns in ORF1ab are not reported in coronaviruses, we observed multiple putative introns in silico between the coding regions of Nsp10 and RPol based on the standard GT-AG rule (Wu and Krainer 1996). In silico excision of the observed putative introns in this region (which would also remove this 'GGG' codon from the ribosomal slippage site) could place ORF1b in the correct frame with ORF1a without affecting the size of the preceding and succeeding domains (Nsp10 and RPol). Intriguingly, reading the viral RNA in a −1 frame at this ribosomal slippage site also produces the same result. This molecular position binds the virus to exercise one of two options for successful translation of ORF1ab for replication in the host: either intron excision by RNA splicing, or reading the template from a −1 frame at the ribosomal slippage site to generate a read-through ORF (Fig. 2). An interesting twist to this simplistic model is that SARS coronaviruses are known to replicate in the host cell cytoplasm (Klein et al. 2020; Knoops et al. 2008; Snijder et al. 2006; Stertz et al. 2007), while the spliceosome complex required for intron removal resides inside the cell nucleus (Pessa et al. 2008).
However, proteins homologous to enzymes of intron excision pathways have been identified from coronaviruses including SARS-Cov (Snijder et al. 2003). Curiously, in silico protein folding prediction models for the ORF1b polypeptide segment of SARS-Cov2 (Accession number: MN908947) trained on 2'-O-MT, intron binding protein and pre-mRNA splicing factors also indicate 100% probability of homology (Suppl. file 1, 2, 3 & 4). Read together, the bioinformatic analyses leave scope to speculate that these viral genomes encode their own splicing enzyme, albeit with limited experimental evidence. Frameshift translation in eukaryotic systems occurs either by programmed ribosomal slippage or due to stalling of the ribosomes during a translation event when the specific tRNA matching the RNA template codon is unavailable. Programmed ribosomal slippage in association with an RNA pseudoknot has been reported in coronaviruses (Brierley et al. 1989). Curiously, we also observed that coronavirus genomes have a low frequency (10%) of GGG codon usage for glycine (Table 1) compared to other common human viruses (Table 2). The GGG codon usage frequency was especially low for the SARS group of viruses, with the lowest in SARS-Cov2 (3%). This would imply that the tRNA corresponding to the 'GGG' codon in the viral genome would be abundant in the tRNA pool of the host cell, leading to an extremely low probability of ribosomal slippage events. Thus the viral replication in the host would continue to remain at low levels. Intense inflammatory response to influenza-like viral infections leading to clinical disease manifestations is significantly correlated with viral load in the hosts (Boon et al. 2011; De Jong et al. 2006). Thus basal level replication would ensure that the virus triggers a negligibly low immune reaction in otherwise healthy hosts, resulting in asymptomatic cases.
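The per-amino-acid codon usage figure discussed above (the share of 'GGG' among all glycine codons) can be illustrated with a minimal sketch. This is not the Sequence Manipulation Suite implementation; the function name `glycine_codon_usage` and the toy sequence are hypothetical, and the input is assumed to be a coding sequence already in the correct reading frame.

```python
from collections import Counter

# The four standard codons for glycine (DNA alphabet).
GLYCINE_CODONS = ("GGA", "GGC", "GGG", "GGT")


def glycine_codon_usage(cds: str) -> dict:
    """Return the fraction of each glycine codon among all glycine codons.

    Assumes `cds` is an in-frame coding sequence; RNA input (U) is
    converted to the DNA alphabet (T). Trailing partial codons are ignored.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in GLYCINE_CODONS)
    total = sum(counts.values())
    if total == 0:
        return {c: 0.0 for c in GLYCINE_CODONS}
    return {c: counts[c] / total for c in GLYCINE_CODONS}


# Toy sequence (hypothetical, not viral): ATG GGT GGA GGC GGT GGG GGT TAA
toy_cds = "ATGGGTGGAGGCGGTGGGGGTTAA"
usage = glycine_codon_usage(toy_cds)
print({codon: round(frac, 3) for codon, frac in usage.items()})
```

Applied to a full ORF1ab coding sequence, the `"GGG"` entry of the returned dictionary would correspond to the kind of percentage reported in Table 1, under the assumption that the tool's frequencies are computed per amino acid in the same way.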
Indeed, SARS-CoV2 viral load in nasopharyngeal swabs has been observed to be several fold lower in 'asymptomatic patients' than in 'asymptomatic patients in the incubation period' (Zhou et al. 2020). At the same time, these asymptomatic patients also demonstrate a period of viral shedding (Zhou et al. 2020), during which viral transmission is a strong possibility and complicates containment (Yu and Yang 2020). A similar strategy is observed in Rous sarcoma virus (RSV), where the frameshift site features a stop codon (Jacks et al. 1988). However, by placing at the frameshift site a functional codon that is used sparsely in the genome, the probability of frameshift translation is further reduced, as in the case of SARS-Cov2. We speculate that this is the reason for the high prevalence of asymptomatic carriers for SARS-CoV2. Strengthening our hypothesis, the closely related MERS beta-coronavirus (GGG codon usage 7%) exhibits quicker progression of disease in infected individuals (Hilgenfeld and Peiris 2013). In a yeast model, natural modification by addition of methyl derivatives to uridines at the wobble position promotes decoding of G-ending codons (Johansson et al. 2008). In silico analysis of the ORF1b polypeptide segment of SARS-Cov2 (Accession number: MN908947) predicts the presence of an S-adenosyl-L-methionine-dependent methyltransferase domain in the viral genome. Assuming a phenomenon similar to that in yeast occurs in human cells, this could potentially help in unhindered decoding of other GGG codons in the SARS coronavirus genome despite poor abundance of cytoplasmic tRNA corresponding to the 'GGG' codon. On the other hand, the same mechanism could also help SARS coronaviruses avoid ribosomal slippage, producing more asymptomatic cases. Coronavirus replication in vitro is inhibited after supplementation with 'D, L-lysine acetylsalicylate and glycine' (Muller et al. 2016).
These two studies invite further investigations to understand the evolution of molecular mechanisms for coronavirus replication strategies and their relation with the prevalence of asymptomatic carriers. Multiple shorter putative introns (26 and 44 nucleotides) in this genomic region were also characteristic of coronaviruses with lower GGG codon usage preference such as SARS and SARS-Cov2. In addition, the intron sizes also appeared to be conserved among several viruses in our study, suggesting a definite selective basis to these molecular features. On the other hand, we could observe only a single, 89 nucleotide-long putative intron in MERS. In fact, in silico excision of even this putative intron in MERS using SMART resulted in the disruption of either the Nsp10 or the RPol domain. Notably, SARS-Cov2 is more infective (transmissible) than SARS and MERS (Chu et al. 2020; Petersen et al. 2020). Since multiple introns offer a wider probability for generation of correct reading frames, it may be argued that the SARS and SARS-Cov2 viruses should preferably resort to the intron excision method over frameshift translation for rapid replication. In fact, influenza viruses are known to hijack host splicing machinery to process some of their own RNA (Dubois et al. 2014) as well as to possess features aiding in programmed ribosomal frameshifting (Firth et al. 2012). Quantification of subgenomic RNAs (sgRNA) from SARS-CoV2 infected people has shown that transcription is repressed in asymptomatic cases compared to symptomatic cases (Wong et al. 2021). The study also revealed a higher prevalence of structural deletions in SARS-CoV2 RNAs in symptomatic cases. Together, these two observations support our hypothesis of more active transcription and splicing of the viral RNA in symptomatic cases. However, it needs to be remembered that the ssRNA genome of coronaviruses is the positive sense strand for viral protein translation.
Therefore, excision of the introns from the initial viral particles would literally destroy the true copies of the original genetic material in the host system, thus eliminating raw material for further mutation and evolution. With this logic it is tempting to suggest the presence of a molecular switch that dictates which of the two mechanisms would be adopted by the virus for replication at a given time or tissue location. We also suggest that the presence of these combined hindrances to replication is in fact the major selective advantage to the SARS-Cov2 virus, resulting in the rapid spread of the disease. Indeed, viruses that replicate rapidly, sending the host immune system into overdrive in a short duration, are at a disadvantage, since rapid development of symptoms helps elimination of infected individuals before the virus has a chance to spread in the population (Fig. 2).
Based on bioinformatic analyses of the SARS-Cov2 genome, we suggest that SARS-Cov2 viral replication in host cells is strongly dependent on either a programmed frameshift translation at a specific ribosomal slippage site in the ORF1ab region or excision of introns within this region. The inherent presence of these two hindrances to viral replication appears to be the reason for its slower pace of replication, resulting in a high prevalence of asymptomatic carriers in the host population. Though our study provides an insight into the molecular peculiarities of SARS-Cov2 underlying the high prevalence of asymptomatic cases, our observations are exclusively in silico and require experimental testing and validation.
The online version contains supplementary material available at https://doi.org/10.1007/s12064-021-00349-3.
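The reading-frame bookkeeping behind the two routes discussed in this article (a −1 slip versus excision of a short GT-AG intron) can be sketched as follows. This is an illustrative sketch only, not the Generunner workflow: the function names, the toy sequence and the length window are hypothetical, and the branch-site and premature-stop-codon checks used in the study are deliberately omitted. The arithmetic point is that removing L nucleotides puts the downstream region in the same frame as a −1 slip exactly when L leaves remainder 2 on division by 3, which holds for the lengths 26, 44 and 89 mentioned in the text.

```python
def frame_after_slip(frame: int, slip: int = -1) -> int:
    """Downstream reading frame (codon start index mod 3, original
    coordinates) after the ribosome shifts by `slip` nucleotides."""
    return (frame + slip) % 3


def frame_after_excision(frame: int, intron_len: int) -> int:
    """Downstream reading frame, in original coordinates, when an intron of
    `intron_len` nucleotides is removed and the ribosome stays in `frame`."""
    return (frame + intron_len) % 3


def putative_introns(seq: str, min_len: int = 26, max_len: int = 89) -> list:
    """List (start, end) spans that begin with GT, end with AG, and whose
    length restores the same downstream frame as a -1 slip (length % 3 == 2).

    Simplification (assumption): no branch-site or stop-codon checks.
    """
    seq = seq.upper().replace("U", "T")
    hits = []
    for start in range(len(seq) - 1):
        if seq[start:start + 2] != "GT":
            continue
        for length in range(min_len, max_len + 1):
            end = start + length
            if end > len(seq):
                break
            if seq[end - 2:end] == "AG" and length % 3 == 2:
                hits.append((start, end))
    return hits


# Lengths reported in the text: all are congruent to 2 modulo 3.
reported_lengths = (26, 44, 89)
same_frame = [frame_after_excision(0, n) == frame_after_slip(0) for n in reported_lengths]

# Toy sequence (hypothetical): AAA + 26-nt GT...AG span + TTT
toy = "AAA" + "GT" + "C" * 22 + "AG" + "TTT"
toy_hits = putative_introns(toy, min_len=26, max_len=44)
print(same_frame, toy_hits)
```

A candidate span returned by `putative_introns` would still need the branch-site and domain-integrity checks described in the methods before being treated as a putative intron.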
Investigating the impact of asymptomatic carriers on COVID-19 transmission
SARS-CoV-2-associated deaths among persons aged <21 years-United States
H5N1 influenza virus pathogenesis in genetically diverse mice is mediated at the level of viral load
Characterization of an efficient coronavirus ribosomal frameshifting signal: requirement for an RNA pseudoknot
Comparative tropism, replication kinetics, and cell damage profiling of SARS-CoV-2 and SARS-CoV with implications for clinical manifestations, transmissibility, and laboratory studies of COVID-19: an observational study
Fatal outcome of human influenza A (H5N1) is associated with high viral load and hypercytokinemia
Influenza viruses and mRNA splicing: doing more with less
Ribosomal frameshifting used in influenza A virus expression occurs within the sequence UCC_UUU_CGU and is in the +1 direction
The clinical feature of silent infections of novel coronavirus infection (COVID-19) in Wenzhou
From SARS to MERS: 10 years of research on highly pathogenic human coronaviruses
Signals for ribosomal frameshifting in the Rous sarcoma virus gag-pol region
Eukaryotic wobble uridine modifications promote a functionally redundant decoding system
The Phyre2 web portal for protein modeling, prediction and analysis
SARS-CoV-2 structure and replication characterized by in situ cryo-electron tomography
SARS-coronavirus replication is supported by a reticulovesicular network of modified endoplasmic reticulum
SMART: recent updates, new developments and status in 2015
Empirical transmission advantage of the D614G mutant strain of SARS-CoV-2. medRxiv
Differential efficiencies to neutralize the novel mutants B.1.1.7 and 501Y.V2 by collected sera from convalescent COVID-19 patients and RBD nanoparticle-vaccinated rhesus macaques
D, L-lysine acetylsalicylate + glycine impairs coronavirus replication
Estimation of the asymptomatic ratio of novel coronavirus infections (COVID-19)
Minor spliceosome components are predominantly localized in the nucleus
Comparing SARS-CoV-2 with SARS-CoV and influenza pandemics
SARS-CoV-2 infection and overactivation of Nlrp3 inflammasome as a trigger of cytokine "storm" and risk factor for damage of hematopoietic stem cells
Unique and conserved features of genome and proteome of SARS-coronavirus, an early split-off from the coronavirus group 2 lineage
Ultrastructure and origin of membrane vesicles associated with the severe acute respiratory syndrome coronavirus replication complex
Cytokine storm induced by SARS-CoV-2
Prospective mapping of viral mutations that escape antibodies used to treat COVID-19
The intracellular sites of early replication and budding of SARS-coronavirus
COVID-19 transmission through asymptomatic carriers is a challenge to containment
Subgenomic RNAs as molecular indicators of asymptomatic SARS-CoV-2 infection
A new coronavirus associated with human respiratory disease in China
U1-mediated exon definition interactions between AT-AC and GT-AG introns
Viral dynamics in asymptomatic patients with COVID-19
Acknowledgements The authors express their humble gratitude to all warriors of the Covid-19 pandemic the world over.
Author contributions HP: Concept and Sequence Analysis; RD: Analysis, Literature survey, Manuscript Writing.
Conflict of interest This manuscript has been drafted purely based on theoretical bioinformatic analyses and has no experimental and/or clinical basis. The authors assume full responsibility and liability for the ideas and opinions expressed in this article. There are no conflicts of interest.