key: cord-0986355-x631datu
authors: Martignano, F.; Di Giorgio, S.; Mattiuz, G.; Conticello, S. G.
title: Commentary on “Poor evidence for host-dependent regular RNA editing in the transcriptome of SARS-CoV-2”
date: 2022-03-12
journal: J Appl Genet
DOI: 10.1007/s13353-022-00688-x
sha: 9017ffb7a06aac5aee3564667788f99986fcca1e
doc_id: 986355
cord_uid: x631datu

Analysis of the SARS-CoV-2 transcriptome has revealed a background of low-frequency intra-host genetic changes with a strong bias towards transitions. A similar pattern is also observed when inter-host variability is considered. We and others have shown that the cellular RNA editing machinery based on ADAR and APOBEC host-deaminases could be involved in the onset of SARS-CoV-2 genetic variability. Our hypothesis is based both on similarities with other known forms of viral genome editing and on the excess of transition changes, which is difficult to explain with errors during viral replication. Zong et al. criticize our analysis on both conceptual and technical grounds. While ultimate proof of an involvement of host deaminases in viral RNA editing will depend on experimental validation, here, we address the criticism to suggest that viral RNA editing is the most reasonable explanation for the observed intra- and inter-host variability.

We and others have hypothesized that the genetic information of SARS-CoV-2 could be affected by the cellular RNA editing machinery based on the ADAR and APOBEC proteins. The hypothesis originates from two observations: (a) several viral genomes show mutational patterns that could be ascribed to the activity of host deaminases (Cattaneo et al. 1988; Vartanian et al. 1991; Harris et al. 2003; Mahieux et al. 2005; Noguchi et al. 2005; Taylor et al. 2005; Zahn et al. 2007; Phuphuakrat et al. 2008; Vartanian 2008; Carpenter et al. 2009; Doria et al. 2009, 2011; George et al. 2009; Clerzius et al. 2011; Pfaller et al. 2011; Suspene et al. 
2011; Samuel 2012; Cuevas et al. 2015; Tomaselli et al. 2015; Peretti et al. 2018; Rosani et al. 2019), which has been confirmed experimentally in several cases, and (b) there is an excess of transition changes occurring at both the intra-host (Di Giorgio et al. 2020; Farkas et al. 2020; Popa et al. 2020; Wang et al. 2021; Graudenzi et al. 2021; Gregori et al. 2021; Lythgoe et al. 2021; Song et al. 2021; Tonkin-Hill et al. 2021; Voloch et al. 2021; Picardi and Mansi 2022) and inter-host scales (Simmonds 2020; van Dorp et al. 2020; Jacob-Hirsch et al. 2020; Klimczak et al. 2020; Kosuge et al. 2020; Mourier et al. 2020; Popa et al. 2020; Azgari et al. 2021; Deng et al. 2021; Rice et al. 2021; Sadykov et al. 2021; Simmonds and Azim Ansari 2021; Tasakis et al. 2021; Matyášek et al. 2021; Pang et al. 2021; Ramazzotti et al. 2021). This excess cannot be explained by the known mutational pattern introduced by the RNA-dependent RNA polymerase or by the SARS-CoV-2 proofreading machinery (Smith et al. 2013; Kabinger et al. 2021). It is still not clear whether the involvement of the cellular RNA editing machinery affects the onset or progression of COVID-19. On the other hand, the outcomes of such editing heavily skew the evolutionary trajectory of the virus (Simmonds 2020; van Dorp et al. 2020; Jacob-Hirsch et al. 2020; Klimczak et al. 2020; Kosuge et al. 2020; Mourier et al. 2020; Popa et al. 2020; Azgari et al. 2021; Deng et al. 2021; Sadykov et al. 2021; Simmonds and Azim Ansari 2021; Tasakis et al. 2021; Matyášek et al. 2021; Pang et al. 2021; Ramazzotti et al. 2021). As currently available studies are all based on mutational analyses of the viral genome/transcriptome, we wholeheartedly agree with Zong et al. (2022) that final proof of ADAR and APOBEC involvement will only be obtained through genetic analysis, by infecting ADAR- and APOBEC-deficient cells with the SARS-CoV-2 virus. Zong et al. 
(2022) criticize our analysis on both conceptual and technical grounds, and we think it is useful to address the criticism in order to clarify the issues for future studies. Here, we address their criticism following the order outlined in their abstract:

(1) No prediction of what the mutation profile should be. The authors suggest we did not have a null hypothesis against which to test the possibility of RNA editing in the viral transcriptomes. This probably stems from a misunderstanding of our analysis: they compare our claims to those presented in a report that hypothesized that every possible type of RNA-DNA mismatch could represent a post-transcriptional modification in the human transcriptome (Li et al. 2012). Our scope was far more limited, as we just wanted to see whether ADAR and APOBEC editing, the only well-characterized RNA modifications easily identifiable by direct sequencing, could represent a source of variability in the virus. With this aim, our expectations were quite clear: if RNA editing by ADARs and/or APOBECs plays a role in intra-host diversification of the virus, A > G / T > C and C > T / G > A substitutions should outnumber all other types. This is quite evident from the mutational patterns observed, both in our and in other follow-up analyses (Farkas et al. 2020; Simmonds 2020; van Dorp et al. 2020; Klimczak et al. 2020; Kosuge et al. 2020; Azgari et al. 2021; Pathak et al. 2021; Ramazzotti et al. 2021; Rice et al. 2021; Sadykov et al. 2021; Simmonds and Azim Ansari 2021; Song et al. 2021; Tasakis et al. 2021; Tonkin-Hill et al. 2021; Friedman et al. 2021; Voloch et al. 2021; Wang et al. 2021; Graudenzi et al. 2021; Gregori et al. 2021; Lythgoe et al. 2021; Matyášek et al. 2021; Pang et al. 2021; Picardi and Mansi 2022).
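The expectation stated above can be illustrated with a minimal sketch that sorts SNV calls into ADAR-like (A > G / T > C), APOBEC-like (C > T / G > A), and all other substitution types, then reports the fraction in each class. The function names, the toy calls, and the uniform baseline (4 transitions out of 12 possible substitutions) are illustrative assumptions only; they are not the pipeline or the null model used in the original analysis.

```python
from collections import Counter

# Substitution classes as read on the positive-strand reference (DNA alphabet).
ADAR_LIKE = {("A", "G"), ("T", "C")}    # A-to-I editing, read as A>G or T>C
APOBEC_LIKE = {("C", "T"), ("G", "A")}  # C-to-U editing, read as C>T or G>A
CLASSES = ("ADAR-like", "APOBEC-like", "other")


def classify(ref: str, alt: str) -> str:
    """Assign a single-nucleotide substitution to one of three classes."""
    pair = (ref.upper(), alt.upper())
    if pair in ADAR_LIKE:
        return "ADAR-like"
    if pair in APOBEC_LIKE:
        return "APOBEC-like"
    return "other"


def spectrum(snvs):
    """Return the fraction of SNVs falling in each class."""
    counts = Counter(classify(ref, alt) for ref, alt in snvs)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no SNVs supplied")
    return {name: counts[name] / total for name in CLASSES}


# Hypothetical toy calls, NOT real data.
toy_snvs = [("A", "G"), ("T", "C"), ("C", "T"), ("C", "T"), ("G", "A"), ("G", "T")]
fractions = spectrum(toy_snvs)

# Illustrative assumption: if all 12 substitution types were equally likely,
# the 4 transition types would account for one third of the calls.
UNIFORM_TRANSITION_FRACTION = 4 / 12
observed_transitions = fractions["ADAR-like"] + fractions["APOBEC-like"]

print(fractions)
print("transitions exceed uniform baseline:",
      observed_transitions > UNIFORM_TRANSITION_FRACTION)
```

A real comparison would replace the uniform baseline with an empirically measured error or replication profile.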
Furthermore, even though the intra-host single-nucleotide variant (SNV) pattern might be questionable due to sequencing errors, we observed a similar enrichment in inter-host SNVs from SARS-CoV-2 genomic sequences, which has been confirmed by further studies, including those using more refined approaches (i.e., mutational signatures; Popa et al. 2020; Graudenzi et al. 2021). Based on this parallelism, it is sensible to hypothesize that some edited RNA sites eventually become fixed in the viral genome. Considering that similar results, obtained with different pipelines and datasets, have been found in all intra- and inter-host analyses published so far, the authors' claim that we "obtained a mutation spectrum similar to the random mismatch spectrum" is quite wrong. It is especially wrong since the authors take our argument about "transitions and transversions" out of context and do not realize that we actually had a mutational pattern against which to compare our analysis: past mutational analyses of SARS-CoV (Fig. 4A-B in Smith et al. 2013) show that the viral replication machinery (both the transcription and the proofreading systems) has a mutational profile quite different from what we observed in the COVID-19 samples, notably with a bias towards transversions. The infected cells used in these analyses were Vero cells, a line where APOBECs are not active (ADARs and APOBECs induce transitions). While we could not expect a defined amount of adenine-to-inosine or cytosine-to-uracil editing beforehand, we think that our expectations were somewhat more realistic than those advanced by the authors themselves with their prediction that "a successful RNA editing paper would find an extremely high A > G% (usually > 80%) after multiple steps of filters". Such an expectation relies entirely on the assumption that the effects of RNA editing on the viral transcriptome should strictly mirror those observed in the human one.
It also completely disregards the potential effects of other RNA editing enzymes such as the APOBECs, or of replication-dependent errors. For example, the mutational pattern in HIV is biased towards G > A mutations as its genome is targeted mainly by APOBECs during the first cDNA synthesis stage of reverse transcription (e.g., Harris et al. 2003). Analogously, any mutational pattern observed will depend on the different mutational processes acting on the virus.

(2) Filtering steps and cutoffs in their pipeline failed to increase the A > G percentage, neither did they change the mismatch profile. As the authors state, in a metatranscriptomic context, the only viable strategy for discriminating sequencing errors from real SNVs is to apply post-filtering approaches. Indeed, the filter on the allelic fraction (> 0.005), not mentioned by the authors, significantly increases the signal-to-noise ratio, raising the fraction of A > G / T > C and C > T / G > A substitutions. We agree that this can be further optimized and, indeed, subsequent studies proposed more refined approaches to reduce the number of false positive calls (e.g., Picardi and Mansi 2022). It is reassuring that different pipelines converged on the same conclusions (Di Giorgio et al. 2020; Farkas et al. 2020; Popa et al. 2020; Wang et al. 2021; Graudenzi et al. 2021; Gregori et al. 2021; Lythgoe et al. 2021; Pathak et al. 2021; Song et al. 2021; Tonkin-Hill et al. 2021; Voloch et al. 2021). While we agree that the mutational spectrum is not significantly affected by the filters reanalyzed by Zong et al. (2022), these filters were nonetheless necessary to avoid introducing bias in the estimation of the allelic fraction of the observed changes.
Low-coverage positions (< 20) with few supporting reads (< 4) would artificially inflate the allelic frequency of these low-confidence substitutions (e.g., at a position with a coverage of 20 reads, even a single mutated read would result in a 0.05 allelic fraction, far higher than the threshold we set to consider an SNV). Considering the high heterogeneity in coverage, within and among samples, we used this filter to get a better view of the distribution of allelic frequencies. This is important for guiding the interpretation of the data, as different thresholds can be used to arbitrarily discriminate 'RNA editing' positions from 'germline' variants based on allelic fraction. This allowed us to find the similarity between the proposed editing of the viral transcriptome and what happens in cellular RNA editing: not high levels of editing on a few selected residues, which would quickly affect the evolution of the virus, but low-frequency editing, reminiscent of what happens mostly on Alu sequences (0.6% average frequency, what the authors call hyperediting) (Porath et al. 2014; Picardi and Pesole 2020). Sequencing errors at the read ends typically introduce false positive SNVs, as the authors state, and are the reason for applying the 15 bp padding filter to the read ends. While not substantially changing the mutational profile, the filter was needed to avoid introducing SNVs from recurrent errors at the read ends. These false positives would have skewed our analysis of the consequences of editing at synonymous/nonsynonymous sites. Of course, a proper analysis should be carried out not on intra-host changes, which are not selected, but on inter-host ones, as they are the result of selection. Although the authors claim we did not, we did carry out this analysis (our Supplementary Table 2), but we did not give it much weight because the dataset was too small.
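The arithmetic behind the coverage and supporting-read filters can be sketched as follows. The three thresholds are the ones quoted in the text; the function names are hypothetical, and treating either a low coverage or a low supporting-read count as sufficient to reject a call is an assumption of this sketch, not a statement of how the original pipeline combined the filters.

```python
# Thresholds quoted in the text.
MIN_COVERAGE = 20             # positions covered by fewer than 20 reads are dropped
MIN_SUPPORTING_READS = 4      # calls supported by fewer than 4 reads are dropped
MIN_ALLELIC_FRACTION = 0.005  # allelic fraction must exceed 0.005


def allelic_fraction(supporting_reads: int, coverage: int) -> float:
    """Fraction of reads at a position that carry the variant base."""
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return supporting_reads / coverage


def passes_filters(supporting_reads: int, coverage: int) -> bool:
    """Assumption: a call is rejected if EITHER count is below its threshold."""
    if coverage < MIN_COVERAGE or supporting_reads < MIN_SUPPORTING_READS:
        return False
    return allelic_fraction(supporting_reads, coverage) > MIN_ALLELIC_FRACTION


# The example from the text: one mutated read at 20x coverage gives an
# allelic fraction of 0.05, ten times the 0.005 threshold, from a single read.
print(allelic_fraction(1, 20))
print(passes_filters(1, 20))     # rejected: only one supporting read
print(passes_filters(10, 1000))  # kept: 10 reads, allelic fraction 0.01
```

Without the read-count floor, the single-read call in the first example would pass the allelic-fraction threshold on its own, which is the inflation the filters are meant to prevent.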
Indeed, subsequent studies, very thorough and expertly argued, have been carried out by others, and all of them concurred that RNA editing is likely to be a major factor in determining the evolutionary trajectory of SARS-CoV-2 (e.g., Simmonds 2020; Rice et al. 2021; Simmonds and Azim Ansari 2021). It should also be taken into account that, besides the evolutionary significance of the outcomes of RNA editing, the biological significance of the editing of the viral transcriptome might lie more in the cellular response to the infection than in the recoding effects of the editing.

(3) Failure in interpreting the equally-abundant T > C substitutions. (4) ADAR motifs and RNA structures contradicted with the antisense editing proposed by themselves. In our paper, we hypothesize that both A > G and T > C polymorphisms derive from ADAR editing during viral replication, when positive and negative strands coexist, by necessity, as double-stranded RNA: since ADARs induce adenine-to-inosine changes, these will appear as A > G when the positive strand is targeted and as T > C when the negative strand is targeted and then transcribed back to positive. We think it is a fairly reasonable model, as the authors themselves admit after their explanation of how true T > C polymorphisms in the human transcriptome cannot originate from ADAR editing (which targets only adenines). Indeed, we never propose that ADAR editing could target thymines in the viral transcriptome, which would be problematic, as there are no thymines in RNA. We have also been very careful in differentiating the biological phenomenon (A-to-I, C-to-U) from the analytical details (A > G, T > C, C > T, and G > A) of the bioinformatic analysis. It is true that we did not discuss a possible clash between the transcription complex and the ADARs' ability to target dsRNA, but there was no reason for it: ADARs could target double-stranded RNA at any time after the transcription complex leaves the scene.
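The strand logic of this model can be checked with a small sketch: an adenine deaminated on the negative strand is read as guanine, and once that strand is copied back to positive the change shows up as T > C against the positive-strand reference. The sequence, the helper names, and the use of the DNA alphabet (T in place of U, as in sequencing reads) are illustrative assumptions.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA-alphabet sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def edit_negative_strand(positive: str, neg_index: int) -> str:
    """Deaminate the A at neg_index on the negative strand and return the
    positive strand copied back from the edited template."""
    negative = list(reverse_complement(positive))
    if negative[neg_index] != "A":
        raise ValueError("ADAR targets adenines only")
    negative[neg_index] = "G"  # inosine base-pairs like guanine
    return reverse_complement("".join(negative))


def substitutions(ref: str, alt: str):
    """List (position, ref_base, alt_base) for every mismatch."""
    return [(i, r, a) for i, (r, a) in enumerate(zip(ref, alt)) if r != a]


positive = "GATTC"                        # hypothetical positive-strand fragment
negative = reverse_complement(positive)   # "GAATC"
edited = edit_negative_strand(positive, neg_index=1)
print(negative, edited, substitutions(positive, edited))
```

Here the edit hits an adenine on the negative strand, yet the only mismatch reported against the positive-strand reference is a T > C, without any thymine (or uracil) ever being the deamination target.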
On the other hand, we apologize for not specifying something we thought was obvious: when analyzing the editing sequence contexts, we used the reverse complement to analyze T > C changes together with A > G ones, as is evident from the sequence logos, where the central position is always an A.

(5) ADARs should be poorly expressed in cytosol compared to APOBECs. RNA editing of the human transcriptome typically takes place in the nucleus. This has been experimentally established for all well-characterized RNA editing events, be they dependent on ADARs (e.g., Patterson and Samuel 1995; Behm et al. 2017) or on APOBEC1 (e.g., Lau et al. 1991; Sowden et al. 1996). Yet, localization of both APOBECs and ADARs is not limited to the nucleus (e.g., Patterson and Samuel 1995; Yang et al. 2001; Bennett et al. 2006; Land et al. 2013), and even nuclear proteins are synthesized in the cytoplasm and shuttle through it. While the significance of ADAR editing in the cytoplasm has never been experimentally characterized, its cytoplasmic localization alters the fate of short interfering RNAs (Yang et al. 2005). Based on this, cytoplasmic editing is conceivable also for ADARs. Ultimately, the amount of cytoplasmic editing and its effects are probably determined by the cellular milieu (presence/absence of the cytoplasmic ADAR1 isoform, APOBEC expression, etc.), and we agree that wet-lab studies are needed to understand the relevance of RNA editing in viral infections, including those originating from SARS-CoV-2. Nonetheless, we find that the statement "we only propose that those editing sites are unreliable (likely to be replication errors) but we do not claim that none of the sites they found are true editing sites" is quite a weak foreword for a manuscript that ends with the "findings and conclusions made by Di Giorgio et al. 
were erroneous and misleading and should be correctly in instant"…

Ethics approval: This article does not contain any studies with human participants or animals performed by any of the authors.

The authors S.G.C, F.M, S.DG, and G.M declare that they have no conflict of interest.

The mutation profile of SARS-CoV-2 is primarily shaped by the host antiviral defense
Accumulation of nuclear ADAR2 regulates adenosine-to-inosine RNA editing during neuronal development
APOBEC-1 and AID are nucleo-cytoplasmic trafficking proteins but APOBEC3G cannot traffic
Evidence for ADAR-induced hypermutation of the Drosophila sigma virus (Rhabdoviridae)
Biased hypermutation and other genetic changes in defective measles viruses in human brain infections
Multiple levels of PKR inhibition during HIV-1 replication
Extremely High Mutation Rate of HIV-1 In Vivo
Mutation signatures inform the natural host of SARS-CoV-2
Evidence for host-dependent RNA editing in the transcriptome of SARS-CoV-2
Editing of HIV-1 RNA by the double-stranded RNA deaminase ADAR1 stimulates viral infection
ADAR2 editing enzyme is a novel human immunodeficiency virus-1 proviral factor
Large-scale population analysis of SARS-CoV-2 whole genome sequences reveals host-mediated viral evolution with emergence of mutations in the viral Spike protein associated with elevated mortality rates
Transcriptomic profiling and genomic mutational analysis of human coronavirus (HCoV)-229E-infected human cells
Tipping the balance: antagonism of PKR kinase and ADAR1 deaminase functions by virus gene products
Mutational signatures and heterogeneous host response revealed via large-scale characterization of SARS-CoV-2 genomic diversity
Host-dependent editing of SARS-CoV-2 in COVID-19 patients
DNA deamination mediates innate immunity to retroviral infection
Transcriptomic profiling of human corona virus (HCoV)-229E-infected human cells and genomic mutational analysis of HCoV-229E and SARS
Mechanism of molnupiravir-induced SARS-CoV-2 mutagenesis
Similarity between mutation spectra in hypermutated genomes of rubella virus and in SARS-CoV-2 genomes accumulated during the COVID-19 pandemic
Point mutation bias in SARS-CoV-2 variants results in increased ability to stimulate inflammatory responses
Apolipoprotein B mRNA editing is an intranuclear event that occurs posttranscriptionally coincident with splicing and polyadenylation
Widespread RNA and DNA sequence differences in the human transcriptome
SARS-CoV-2 within-host diversity and transmission
Extensive editing of a small fraction of human T-cell leukemia virus type 1 genomes by four APOBEC3 cytidine deaminases
Mutational asymmetries in the SARS-CoV-2 genome may lead to increased hydrophobicity of virus proteins
Host-directed editing of the SARS-CoV-2 genome
G to A hypermutation of hepatitis B virus
Emerging severe acute respiratory syndrome coronavirus 2 mutation hotspots associated with clinical outcomes and transmission
Spatio-temporal dynamics of intra-host variability in SARS-CoV-2 genomes
Expression and regulation by interferon of a double-stranded-RNA-specific adenosine deaminase from human cells: evidence for two forms of the deaminase
Characterization of BK polyomaviruses from kidney transplant recipients suggests a role for APOBEC3 in driving in-host virus evolution
Protein kinase PKR and RNA adenosine deaminase ADAR1: new roles for old players as modulators of the interferon response
Detection of A-to-I RNA editing in SARS-COV-2
Genomic epidemiology of superspreading events in Austria reveals mutational dynamics and transmission properties of SARS-CoV-2
A genome-wide map of hyper-edited RNA reveals numerous new sites
Large-scale analysis of synonymous viral variants reveals global adaptation of the SARS-CoV-2 to the human codon usage
Evidence for strong mutation bias toward, and selection against, U content in SARS-CoV-2: implications for vaccine design
A-to-I editing of Malacoherpesviridae RNAs supports the antiviral role of ADAR1 in mollusks
Short sequence motif dynamics in the SARS-CoV-2 genome suggest a role for cytosine deamination in CpG reduction
ADARs, viruses and innate immunity
Extensive C->U transition biases in the genomes of a wide range of mammalian RNA viruses; potential associations with transcriptional mutations, damage- or host-mediated editing of viral RNA
Rampant C→U hypermutation in the genomes of SARS-CoV-2 and other coronaviruses: causes and consequences for their short- and long-term evolutionary trajectories
Coronaviruses lacking exoribonuclease activity are susceptible to lethal mutagenesis: evidence for proofreading and potential therapeutics
ADAR mediated A-to-I RNA editing affects SARS-CoV-2 characteristics and fuels its evolution
Determinants involved in regulating the proportion of edited apolipoprotein B RNAs
Genetic editing of herpes simplex virus 1 and Epstein-Barr herpesvirus genomes by human APOBEC3 cytidine deaminases in culture and in vivo
SARS-CoV-2 variant evolution in the United States: high accumulation of viral mutations over time likely through serial Founder Events and mutational bursts
New antiviral pathway that mediates hepatitis C virus replicon interferon sensitivity through ADAR1
ADARs and the balance game between virus infection and innate immune cell response
Patterns of within-host genetic diversity in SARS-COV-2
No evidence for increased transmissibility from recurrent mutations in SARS-CoV-2
Selection, recombination, and G→A hypermutation of human immunodeficiency virus type 1 genomes
Evidence for editing of human papillomavirus DNA by APOBEC3 in benign and precancerous lesions
Intra-host evolution during SARS-CoV-2 prolonged infection
Intra-host variation and evolutionary dynamics of SARS-CoV-2 populations in COVID-19 patients
Intracellular trafficking determinants in APOBEC-1, the catalytic subunit for cytidine to uridine editing of apolipoprotein B mRNA
ADAR1 RNA deaminase limits short interfering RNA efficacy in mammalian cells
A-to-G hypermutation in the genome of lymphocytic choriomeningitis virus
Poor evidence for host-dependent regular RNA editing in the transcriptome of SARS-CoV-2