key: cord-0865150-lkp9iyd7
authors: Anreiter, Ina; Mir, Quoseena; Simpson, Jared T.; Janga, Sarath C.; Soller, Matthias
title: New Twists in Detecting mRNA Modification Dynamics
date: 2020-07-01
journal: Trends Biotechnol
DOI: 10.1016/j.tibtech.2020.06.002
sha: 87bf6dab4e857dd6d834fa31085708956164d3f7
doc_id: 865150
cord_uid: lkp9iyd7

Modified nucleotides in mRNA are an essential addition to the standard genetic code of four nucleotides in animals, plants, and their viruses. The emerging field of epitranscriptomics examines nucleotide modifications in mRNA and their impact on gene expression. The low abundance of nucleotide modifications and technical limitations, however, have hampered systematic analysis of their occurrence and functions. Selective chemical and immunological identification of modified nucleotides has revealed global candidate topology maps for many modifications in mRNA, but further technical advances to increase confidence will be necessary. Single-molecule sequencing introduced by Oxford Nanopore now promises to overcome such limitations, and we summarize current progress with a particular focus on the bioinformatic challenges of this novel sequencing technology.

Ina Anreiter, Quoseena Mir, Jared T. Simpson, Sarath C. Janga, and Matthias Soller

Chemical modifications on RNA are well-established and evolutionarily conserved features of structural RNAs such as rRNA and tRNA [1] [2] [3] . In the past few years the occurrence of these modifications on protein-coding mRNAs, long noncoding RNAs (lncRNAs), and small regulatory RNAs (srRNAs) has received renewed attention to determine their role in regulating gene expression, development, and health and disease (Box 1). The rapid evolution of transcriptome sequencing technologies has made it possible to develop methodologies that interrogate the topography of RNA modifications transcriptome-wide. This new field, termed epitranscriptomics (see Glossary), seeks to elucidate the role of RNA modifications in regulating gene expression, with a special focus on their biological functions in mRNA.

Major recent methodological advances in mass spectrometry (MS) and next-generation sequencing combined with immunoprecipitation and chemical or enzymatic conversion methods have increased the catalog of known mRNA modifications, and have led to new insights into the role of these modifications in regulating gene expression in humans and model organisms such as yeast, plants, Drosophila, and mice. However, the field still faces considerable methodological challenges because modifications in mRNA generally are not abundant. Moreover, many current methods for probing RNA modifications are hampered by high error rates, low specificity, and poor reproducibility. We summarize below current approaches and emerging technologies for assessing mRNA modifications. We aim to highlight the strengths and limitations of current methods regarding specificity, sensitivity, and reproducibility, with a particular focus on emerging single-molecule direct RNA sequencing by Nanopore.

To date, 13 different chemical modifications have been identified in mRNA transcripts, and these can be divided into modifications of cap-adjacent nucleotides and internal modifications (Figure 1). These modifications are added by a variety of dedicated enzymes (Box 2 and Table 1). Modifications of cap-adjacent nucleotides are added to the 5′-ends of RNAs transcribed by RNA polymerase II [mRNA, primary (pri-)miRNA transcript, lncRNA, small nucleolar (sno)RNA, and small nuclear (sn)RNA] [4]. The composition of the cap varies with the type of RNA molecule, and typically consists of a 7-methylguanosine (m 7 G) moiety added in a characteristic 5′-5′ triphosphate linkage to the first transcribed nucleotide (Figure 1A). In some snRNAs and snoRNAs the cap guanosine is trimethylated to m 2,2,7 G, and further alternative cap structures including NAD + have recently been identified [4] [5] [6]. The first and second nucleotides adjacent to the cap can be 2′-O-methylated at the ribose (cOMe) in animals, viruses, and protists. If the cap-adjacent nucleotide is an adenosine, it can be further methylated to N 6 ,2′-O-dimethyladenosine (m 6 Am, Figure 1A) [4, 7-11]. Of further special note are the extensive modifications in trypanosomes of cap-adjacent nucleotides in the splice leader RNAs that are trans-spliced onto the body of the mRNAs (Figure 1A) [12]. Modifications of cap-adjacent nucleotides are common, differ among tissues and transcripts, and regulate mRNA stability and translation [4, 5, 7, 8, 10, 11, 13].

Highlights

Writers, readers, and erasers have now been discovered for many mRNA modifications.

Global topographic candidate maps have been generated for many modifications, but high error rates need to be addressed by technical improvements in detection and validation using orthogonal methods that apply rigid selection criteria.

Nanopore single-molecule direct RNA sequencing is progressing towards reliable detection of modified nucleotides in mRNA.

Internal modifications occur in 5′ and 3′ untranslated regions (UTRs), coding regions, and introns of mRNAs ( Figure 1B ). Mass spectrometric (MS) quantification indicates that N 6 -methyladenosine (m 6 A) and adenosine to inosine (A to I) editing are the most abundant internal mRNA modifications [14] . Less abundant internal mRNA modifications are 5-methylcytosine (m 5 C) [15] , pseudouridine (Ψ) [16] [17] [18] [19] , N 1 -methyladenosine (m 1 A) [20, 21] , N 4 -acetylcytidine (ac 4 C) [22] , hydroxymethylcytosine (hm 5 C) [23] , 3-methylcytidine (m 3 C) [24] , cytosine to uridine (C to U) editing [25] , m 7 G [26, 27] , Nm [28] , and 7,8-dihydro-8-oxoguanosine (8-oxoG, as a result of guanosine oxidation by reactive oxygen species) [29] .

Internal mRNA modifications are involved in a variety of gene-regulatory functions in the cell, where m 6 A is thought to be the most diverse because of the presence of multiple bona fide readers of the YTH protein domain family (Box 2). Internal m 6 A can regulate all aspects of gene expression of an mRNA, including mRNA splicing, 3′-end processing, export, stability, localization, and translation [30] [31] [32] [33] [34] [35] . Other internal modifications tend to be more process-specific and can affect translation efficiency (e.g., inosine [36] , m 1 A [37] , ac 4 C [22] , m 6 Am [9] , m 3 C [24] ), codon recoding (e.g., inosine [36] , 8-oxoG [29] ), mRNA stability (e.g., ac 4 C [22] ), and nuclear export (e.g., m 6 A [38] ). Among these mRNA modifications, m 6 A, m 6 Am, and m 1 A have been proposed to function dynamically because they can be reversed by eraser proteins [13, 30, 39, 40] (Box 2).

Although the abundance of mRNA modifications is generally low, with a prevalence below 0.5% of all nucleotides [14, 24, 26], they play important functional roles in mRNA regulation, cell function, development, the immune system, and homeostasis (Box 1). In addition, many mRNA modifications play key roles in viral infection and replication, and in response to stress [4, 25, 34, 35]. For instance, m 6 A and/or m 6 Am levels are altered in specific brain regions of acutely stressed mice and by glucocorticoid administration [41]. Human patients with major depressive disorder have lower m 6 A and/or m 6 Am levels in their blood and do not respond to glucocorticoid stimulation [41]. Human cells under serum starvation and peroxide treatment show stress-specific induced m 1 A sites [20]. Internal mRNA m 7 G is enriched in coding sequences (CDS) and 3′-UTRs, and is depleted in 5′-UTRs in mammalian cells and brain tissues in response to heat shock and oxidative stress [42]. Similar responses to stress have been described for Ψ [17, 19], 8-oxoG [29], and C to U editing [43].

Glossary
AlkAniline-Seq: RNA alkaline hydrolysis and aniline cleavage sequencing.
ALKBH5: ALKB homolog 5, an m 6 A mRNA demethylase that removes the methyl group associated with N 6 -methylated adenosine.
Epitranscriptomics: the study of transcriptome-wide distribution and function of RNA modifications.
FTO protein: fat mass and obesity associated protein, an m 6 A mRNA demethylase that removes the methyl group of N 6 -methylated adenosine at the first position after the cap.
LAIC-seq: m 6 A-level and isoform-characterization sequencing.
m 1 A-ID-seq: meRIP for m 1 A that leverages the inherent stalling of reverse transcriptases at modified nucleotides to give improved resolution.
m 1 A-MAP: misincorporation-assisted profiling of m 1 A.
m 6 A-REF-seq: m 6 A-sensitive RNA endoRNase-facilitated sequencing.
MAZTER-seq: RNA digestion via m 6 A-sensitive RNase and sequencing.
meRIP-seq: methylated RNA immunoprecipitation and sequencing.
METTL3: methyltransferase-like 3, the catalytic subunit of the m 6 A methyltransferase complex that is involved in m 6 A deposition on mRNA, also named MTA-70, MTA, and IME4.
METTL14: methyltransferase-like 14, a catalytically inactive component of the m 6 A methyltransferase complex that serves as a support protein for METTL3.
miCLIP-seq: methylated individual nucleotide resolution crosslinking immunoprecipitation and sequencing.
NsunC271A-CLIP: miCLIP for 5-methylcytosine (m 5 C) using covalent binding of the methyltransferase Nsun2 to its RNA targets.
PA-m 6 A-seq: photo-crosslinking-assisted m 6 A sequencing.
RiboMethSeq: ribose methylation sequencing.
RibOxi-seq: site-specific identification of 2′-O-methylation sites using ribose oxidation sequencing.
RNMT: mRNA cap guanine-7 methyltransferase, an enzyme that methylates the inverted guanosine at the mRNA cap to m 7 G.
TRIBE: targets of RNA-binding proteins identified by editing.
YTH proteins: a class of proteins that specifically recognize m 6 A in mRNA.

In 1957, Davis and Allen discovered pseudouridine as the first known chemically modified RNA nucleotide [121]. Since then, 162 more RNA modifications have been described and catalogued (RNA Modification Databases: http://mods.rna.albany.edu and http://modomics.genesilico.pl [115]). RNA modifications occur in all domains of life and in all types of RNA (tRNA, rRNA, mRNA, srRNA, snoRNA, snRNA, and lncRNA), and have been extensively linked to development, health, and disease. The high copy number of rRNAs and tRNAs in cells has greatly facilitated the study of modifications in these RNA species. rRNAs and tRNAs have complex 3D structures, which mediate ribosome function and protein translation. Modified nucleotides in rRNAs and tRNAs play essential roles in ribosome assembly and dynamics, and disruption of these modifications has been associated with lethality, severe growth defects, intellectual disability, diabetes, and cancer [32, 33, 122]. Although nucleotide modifications in mRNA have been known for >40 years, their functional interrogation has only become possible in the past decade owing to high-throughput sequencing-based mapping methodologies that overcome the low abundance of these modifications, as well as to the discovery of writer, reader, and eraser proteins (Box 2). Since then, many biological functions in development and disease have been associated with mRNA modifications, including a variety of cancers, mental disorders, and fertility and metabolic phenotypes [4, 30, 32, 33].

Another important function of mRNA modifications resides in the immune system. Of particular importance here is A to I editing, which in vertebrates tunes down the autoimmune response to double-stranded RNA [44, 45]. Furthermore, 2′-O-ribose methylation of cap-adjacent nucleotides (cOMe) has been proposed as a mechanism to distinguish self from non-self mRNAs because it repels binding of interferon-induced proteins with tetratricopeptide repeats (IFITs) and RIG-I, the main inducers of antiviral and inflammatory responses [46] [47] [48] [49]. However, cOMe likely has other functions because only a fraction of host mRNAs carry cOMe, and levels differ between tissues and transcripts in mice [8, 13]. Moreover, IFIT1 in some viruses does not inhibit translation of non-ribose methylated transcripts [50].

Figure 1 (legend, partial). … transcribed base (base 1) is linked first to an inverted guanosine by a 5′-5′-triphosphate linkage and subsequently becomes methylated to 7-methylguanosine (m 7 G) by RNMT (RNA guanine 7-methyltransferase). The first and second nucleotides of the mRNA can be ribose 2′-O-methylated (cOMe) by the cap methyltransferases CMTr1 and CMTr2. If base 1 is a ribose 2′-O-methylated adenosine it can be further methylated to N 6 ,2′-O-dimethyladenosine (m 6 Am) by PCIF1 (phosphorylated CTD-interacting factor 1). In the splice leader sequence added to all mRNAs in trypanosomes (shaded in grey) the third and fourth nucleotides are also 2′-O-ribose methylated, the first adenosine is trimethylated to N 6 ,N 6 ,2′-O-trimethyladenosine (m 6 2 Am) instead of m 6 Am, and the fourth base is methylated to 3,2′-O-dimethyluridine. (B) Internal mRNA nucleotides carry different modifications depending on the nucleotide (Box 2 for enzymes). Adenosine can be converted to inosine by adenosine deaminases (ADARs), or methylated to N 1 -methyladenosine (m 1 A), N 6 -methyladenosine (m 6 A), or N 6 ,N 6 -dimethyladenosine. Cytidine can be converted to uridine (U), acetylated to N 4 -acetylcytidine (ac 4 C), or methylated to 3-methylcytidine (m 3 C) or 5-methylcytidine (m 5 C). m 5 C can be converted to 5-hydroxymethylcytidine (hm 5 C). Guanosine can be methylated to 7-methylguanosine (m 7 G) or oxidized to 7,8-dihydro-8-oxoguanosine (8-oxoG). Uridine can be converted to pseudouridine (Ψ). Finally, the ribose sugars of all nucleotides can be 2′-O-methylated (Nm).

The enzymes that deposit, remove, and bind to mRNA modifications have been termed 'writers', 'erasers', and 'readers', respectively [31, 34, 35] (see Table 1 in main text). Writers bind to short consensus sequence motifs in target mRNAs; however, these motifs are far more common than the modifications themselves, suggesting that additional factors determine writer binding. Other writers, such as CMTrs, PUS1, and METTL16, bind to mRNA based on mRNA structure motifs [31, 123]. Erasers and readers recognize the modifications themselves; however, given the large variety of different reader proteins, additional factors are probably involved in targeting readers to their mRNA targets. The machinery of writers, erasers, and readers is complex, often consisting of large protein complexes with a variety of cofactors. The writer complex for m 6 A, for instance, consists of a ~900 kDa complex with two core methyltransferases (METTL3 and METTL14) and several auxiliary cofactors [31]. Although loss of the methyltransferases does not result in lethality in Drosophila, loss of several of the cofactors does, suggesting that these enzymes exercise alternative functions unrelated to m 6 A-methylation. Some writer proteins can also act as reader proteins, as is the case for METTL16 under particular cellular conditions [31]. In addition, some modifications have alternative writers, erasers, and readers. METTL3/METTL14, METTL4, and METTL16 writers for m 6 A act in different complexes, and both FTO and ALKBH5 have been shown to erase m 6 A [31]. A varying number of m 6 A readers have been identified across organisms, with a record of 13 different readers in plants [31]. There is also crosstalk between modifications, and some enzymes target more than one type of modification. For instance, FTO was first described as an eraser for internal m 6 A [124], but was later shown to also demethylate m 6 Am at the cap [125]. Furthermore, mRNA modifications might influence the ability of mRNA binding proteins to bind to mRNA targets. For instance, interferon-induced proteins with tetratricopeptide repeats (IFITs), mRNA binding proteins that are involved in innate immunity and viral response, show low affinity for mRNA transcripts with ribose methylation at the cap [47] [48] [49]. Whether alternative writers, erasers, and readers of mRNA modifications act in different tissues, in different cellular/physiological contexts, or on different mRNA targets remains mostly an open question. Furthermore, although the machinery acting on m 6 A has been well characterized in recent years, the factors acting on other modifications are less well explored, and new enzymes or functions could be discovered in years to come. Further identification of these enzymes will aid our understanding of the biological functions of mRNA modifications and yield prime new targets for pharmaceutical targeting of mRNA modification-associated diseases such as cancer and obesity.

Although many recent studies outline the importance of mRNA modifications, transcriptome-wide mapping of modifications still faces significant challenges. Because the abundance of mRNA modifications is generally low and variable (in the range 5-88% at a given site or transcript [51, 52], and only 2-5% of cellular RNA is mRNA), current methods (Table 2) require large amounts of mRNAs and/or very deep sequencing. Furthermore, current methods have yielded highly variable results, and suffer from many false positives and lack of specificity and/or single-base resolution.

Assessing the Presence of Modifications Using MS

Liquid chromatography coupled to tandem MS (LC-MS/MS) has been widely applied to detect and quantify the relative abundances of RNA modifications [5, 14, 26, 27, 53, 54]. Generally, the approach involves digesting total RNA or purified mRNA into individual nucleotides, followed by separation by LC and quantification by MS. The presence and amount of all nucleotide modifications in an RNA sample can then be assessed by comparison of the MS peaks from the sample with the MS peaks of standards [5, 14, 26, 53]. Provided that the nucleotides can be sufficiently separated, LC-MS/MS measurements are quantitative and concordant between studies even if the levels of the modified nucleotides are low [14, 26, 27, 53]. This approach has also been combined with enzymatic digestion by nucleases which have differential activity towards cap and internal m 7 G modifications to demonstrate the prevalence of internal m 7 G in mRNA [27] or enrichment of caps by ion-exchange (CapQuant and CAP-MAP) [5, 12, 55]. However, LC-MS/MS requires large amounts of input RNA (>0.65 μg polyA mRNA for internal modifications, and >20 μg polyA mRNA for modifications of cap-adjacent nucleotides) and is limited for low-abundance nucleotides such as caps (100- to 10 000-fold less sensitive than radioactive labeling) [5, 12, 14, 27]. In addition, oligo-dT-purified polyA mRNA carries over rRNA, leading to misannotations [56].

Table 1 (final rows and footnotes). Nm: FTSJ3 c [147]. A to I: ADAR 1-3 [148]. C to U: APOBEC family [25]. a Requires ribose methylation introduced by CMTr to methylate adenosine at the N 6 position. b Part of a 900 kDa holoenzyme. c Remains to be confirmed for mRNA. d Requires m 5 C.

LC-MS/MS does not provide information about the location of the modification in a transcript, but this can be circumvented by using oligonucleotides obtained by digestion with different RNases or from tRNAs [54, 57, 58]. Digestion with RNase, however, can introduce errors in mapping fragments to the genome. For example, the m 7 G site in let-7 miRNA is in fact a fragment of rRNA with a well-known Gm site [59, 60]. Although the complement of modifications has been mapped in cellular tRNAs, the amount of input and the computational challenges are significant [54, 57].

Table 2 (fragment). [88, 89] a Only applied to unicellular organisms so far.

Antibody-Based Methods To Detect Modifications

The most widely used methods for transcriptome-wide mapping of mRNA modifications rely on RNA immunoprecipitation (RIP) by commercially available antibodies that recognize modified nucleotides (Box 2), but the specificity of these antibodies varies. For profiling transcripts carrying modifications, RIP is generally followed by whole-transcriptome sequencing (RIP-seq or meRIP-seq) to identify modified regions ~100-200 nt in length. To refine these regions to single-base resolution, methods have been developed that take advantage of nucleotide mismatches or truncation signatures induced by crosslinking antibodies to modified nucleotides before reverse transcription and sequencing (e.g., PA-m 6 A-seq [61], miCLIP-seq [11, 62], m 1 A-MAP [63], and m 7 G-MeRIP-Seq [26]). LAIC-seq assesses m 6 A modifications without prior fragmentation to distinguish methylation levels between mRNA isoforms, but at the cost of lost positional information [52].

Results from antibody-based mRNA modification profiling are highly variable because, for example, antibodies raised against m 6 A can also recognize adenosine, and thus IP will not enrich only for m 6 A-carrying transcripts [64]. Early studies using meRIP-seq reported ~7000 mRNA targets [40, 65], whereas studies applying miCLIP for single-base resolution mapping reported ~3500 m 6 A-containing mRNA targets [62]. It is unclear whether these discrepancies are due to high false-positive rates in meRIP-seq, high false-negative rates in miCLIP (owing to weak misincorporation and truncation signatures), or reflect true biological variance in different cells and conditions. However, reproducibility across m 6 A meRIP-seq datasets has been found to be low (~30-60%), even in the same cell type and between biological replicates [61, 66]. Furthermore, commonly used m 6 A antibodies crossreact with m 6 Am [62]. Thus, the discordance among antibody-based identification methods is largely a result of high false-positive rates. Higher confidence in antibody-based identification methods can be gained by including a mutant control. In yeast, 1308 high-confidence m 6 A sites in 1183 transcripts were identified by comparing wild-type with METTL3 mutant (lacking m 6 A) cells [67]. A recent study used metabolic labeling (m 6 A-label-seq) to substitute m 6 A with N 6 -allyladenosine (a 6 A) to map m 6 A sites, and obtained a number of mRNA targets similar to that obtained with miCLIP (2480-4512) [68].
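To make the notion of reproducibility between datasets concrete, the following minimal sketch computes the fraction of peaks in one replicate that overlap a peak in another. The peak representation (transcript, start, end), the function name, and the example coordinates are illustrative assumptions, not the procedure used in the cited studies.

```python
def fraction_overlapping(peaks_a, peaks_b):
    """Fraction of peaks in A that overlap at least one peak in B.

    Peaks are (transcript, start, end) tuples with half-open coordinates.
    This is a hypothetical overlap metric for illustration only.
    """
    if not peaks_a:
        return 0.0
    # Group the peaks of B by transcript for quick lookup.
    by_transcript = {}
    for tx, start, end in peaks_b:
        by_transcript.setdefault(tx, []).append((start, end))
    hits = 0
    for tx, start, end in peaks_a:
        # Two half-open intervals overlap if each starts before the other ends.
        if any(start < b_end and b_start < end
               for b_start, b_end in by_transcript.get(tx, [])):
            hits += 1
    return hits / len(peaks_a)


# Made-up peaks for two replicates (coordinates are not real data).
rep1 = [("EEF2", 100, 250), ("ACTB", 500, 650), ("MYC", 10, 160)]
rep2 = [("EEF2", 200, 350), ("ACTB", 700, 850), ("GAPDH", 10, 160)]
overlap = fraction_overlapping(rep1, rep2)  # one of three peaks overlaps
```

Note that this measure is asymmetric: the fraction of replicate 1 peaks found in replicate 2 need not equal the reverse, so both directions are usually worth reporting.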

Antibody-based detection of m 1 A initially reported >4000 mRNA targets [21]. Reanalysis of m 1 A-induced misincorporation and truncation signatures from stalling of the reverse transcriptase in later studies reduced the number of targets to ~450-600 (m 1 A-ID-seq [20] and m 1 A-MAP [63]). Further, reanalysis of the 474 m 1 A sites previously identified by m 1 A-MAP [63] showed that 89% of the identified sites were false positives originating from genetic variation, sequencing errors, and misannotations [69]. In fact, the most recent studies using m 1 A-seq or LC-MS/MS suggest that virtually all mapping of m 1 A is caused by crossreactivity of commonly used m 1 A antibodies to the m 7 G cap, and that m 1 A sites in mRNA are in fact exceedingly rare [14, 37, 70], although they have been validated in transcripts of the mitochondrial ND5 gene [37, 70].

For several other modifications, the available mapping methods remain very limited. Two studies using meRIP and miCLIP for m 7 G identified several thousands of internal m 7 G sites in different mouse and human cell lines [26, 42] . The overlap of targets identified in the different cell lines was relatively high in one study (73.5%) [42] , but low in the other (27%) [26] . Only one study has used antibody-based methods to map ac 4 C [22] . It is thus not clear whether RIP for these modifications is hampered by similar false-positive rates and antibody crossreactivities to other modifications. Orthogonal methods to map m 7 G and m 3 C at single-base resolution by alkaline hydrolysis (AlkAniline-Seq) [71] have so far failed to detect these modifications in mRNA [71] .

Chemical Methods To Detect Modifications

Chemical reactions specific to a given RNA modification, followed by short-read sequencing, provide an alternative to antibody-based detection of RNA modifications. In the case of A to I and C to U mRNA editing by deamination, no conversion step is necessary because nucleotide changes (I reads as G) can be directly assessed from RNA sequencing data by applying carefully controlled variant analysis [72].
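As a minimal sketch of this variant-analysis idea, candidate A to I sites can be flagged where the reference base is A and a sufficient fraction of reads show G. The input structures (per-site base counts, a set of known genomic variant positions), the function name, and the thresholds are hypothetical and chosen only for illustration.

```python
def call_a_to_i_candidates(ref_bases, pileup_counts, known_snps=frozenset(),
                           min_depth=10, min_fraction=0.1):
    """Return {position: G fraction} for reference-A sites with A-to-G mismatches.

    ref_bases:     dict mapping position -> reference base
    pileup_counts: dict mapping position -> {base: read count}
    known_snps:    positions of known genomic variants to exclude
    Thresholds are illustrative assumptions, not published cut-offs.
    """
    candidates = {}
    for pos, counts in pileup_counts.items():
        # Only reference adenosines that are not known genomic variants.
        if ref_bases.get(pos) != "A" or pos in known_snps:
            continue
        depth = sum(counts.values())
        if depth < min_depth:
            continue
        g_fraction = counts.get("G", 0) / depth
        if g_fraction >= min_fraction:
            candidates[pos] = g_fraction
    return candidates


# Made-up example data.
ref = {100: "A", 101: "A", 102: "C", 103: "A"}
pileup = {
    100: {"A": 30, "G": 10},  # 25% G at a reference A: candidate
    101: {"A": 39, "G": 1},   # 2.5% G: below the fraction threshold
    102: {"C": 20, "T": 20},  # reference is not A: ignored here
    103: {"A": 5, "G": 15},   # listed as a known SNP: excluded
}
edited = call_a_to_i_candidates(ref, pileup, known_snps={103})
```

Excluding known genomic variants is the step that distinguishes editing from polymorphism in this sketch; a real analysis would also need to account for strand and for sequencing and alignment errors.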

The most common chemical method for mapping m 5 C, bisulfite sequencing (Bs-seq or RNA-BisSeq), relies on chemical deamination of cytidines, but not of m 5 C, to uridine by sodium bisulfite [15, 73, 74]. Bs-seq is straightforward and cost-effective, but the efficiency of chemical conversion sets a limit to the detection of rare modifications [73, 75].
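The limit set by conversion efficiency can be illustrated with a small sketch: at each cytidine, reads that still show C are either methylated or unconverted, so a site is only credible if the number of C reads exceeds what incomplete conversion alone would produce. The one-sided binomial test, the assumed conversion failure rate of 1%, and the significance cut-off below are illustrative assumptions rather than parameters from the cited studies.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def m5c_call(c_reads, t_reads, failure_rate=0.01, alpha=0.001):
    """Estimate the methylation level at one cytidine and test it against
    the background of incomplete bisulfite conversion.

    c_reads: reads still showing C (methylated or unconverted)
    t_reads: reads showing T (converted, i.e. unmethylated)
    Returns (non-conversion level, p-value, is_candidate).
    """
    n = c_reads + t_reads
    if n == 0:
        return 0.0, 1.0, False
    level = c_reads / n
    p_value = binomial_tail(c_reads, n, failure_rate)
    return level, p_value, p_value < alpha


level_a, p_a, call_a = m5c_call(12, 28)  # 30% non-conversion at 40x depth
level_b, p_b, call_b = m5c_call(1, 99)   # 1% non-conversion at 100x depth
```

In the second example a single unconverted read among 100 is fully compatible with a 1% conversion failure rate, which is why rare or low-stoichiometry sites are hard to separate from background.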

The catalytic mechanism of m 5 C methylation involves transient covalent binding of a cytosine to the enzyme. This can be exploited by incorporation of azacytidine into nascent mRNA followed by IP with an antibody against the methyltransferase and sequencing of the target RNA [75, 76] . Alternatively, mutating one of the two cysteines in the catalytic center of the methyltransferase (C271A for Nsun2) will covalently bind the methyltransferase to its RNA targets [77] . Intriguingly, the overlap between Bs-seq, 5-azacytidine immunoprecipitation (Aza-IP), and NsunC271A-CLIP is surprisingly low [75] . Because most m 5 C sites in tRNAs were detected by these approaches, m 5 C potentially differs widely between cell types or is highly dynamic, but the lack of overlap could also be due to rapid degradation of protein-RNA adducts.

Chemical methods to map m 7 G and m 3 C have used alkaline hydrolysis of structural RNAs (AlkAniline-Seq) [71] and reduction to abasic sites with sodium borohydride (m 7 G-MaP-seq) [60] , but this approach did not detect these modifications in mRNA owing to the limits of chemical conversion for detecting rare modifications [71] .

To map internal Nm, nonmodified nucleotides are removed from the 3′ end of RNA with iterative oxidation-elimination-dephosphorylation cycles, but Nm blocks this process. After ligation of linkers to the Nm-modified nucleotide at the 3′-end, these sites can be mapped by sequencing (Nm-seq or RibOxi-seq) [28, 78]. In mammalian mRNA, Nm sites in 1267 transcripts were identified [28]. Determination of Nm levels in mRNA by MS suggested that ~3% of nucleotides are ribose-methylated (18 Nm/mRNA), but this is likely a massive overestimate as a result of rRNA carry-over [14, 64].

Although not yet used for mRNA, Nm can be mapped based on resistance to alkaline hydrolysis resulting in Nm nucleotides being depleted from the start of sequencing reads (RiboMethSeq) [79]. In addition, Nm stalls reverse transcriptase under low dNTP concentrations, resulting in Nm-specific truncation signatures during reverse transcription (2′-O-methylation sequencing, 2′-OMe-Seq [80]).

Ψ can be mapped by chemical conversion with N-cyclohexyl-N′-(2-morpholinoethyl) carbodiimide metho-p-toluenesulphonate (CMC), which then blocks reverse transcription (Ψ-seq or pseudo-seq) [16, 19] . Ψ-seq detected 89 [16] and 353 [67] modified mRNA transcripts in different human cell lines. As with Nm-seq, the number of sites identified with Ψ-seq is low compared with the prevalence of Ψ reported by MS [14, 17] , and the overlap between the different studies is very modest [81] . Furthermore, a method employing a chemical pulldown enrichment step for Ψ (N3-CMC-enriched pseudouridine sequencing, CeU-seq) identified 1929 modified mRNAs [17], but whether most of these sites were missed by Ψ-seq or are false positives remains to be determined.

Enzymatic Methods To Detect Modifications

The earliest methods to map modifications in mRNA relied on the cleavage specificity of RNases, in particular RNase T1, which cleaves after guanosine [82, 83]. Using this approach in combination with radioactive labeling and separation of individual nucleotides, m 6 A was first mapped in prolactin and Rous sarcoma virus transcripts [84] [85] [86]. This approach remains the most sensitive and can distinguish between mRNA and rRNA because m 6 A is not in a GA context in rRNA [71]. Moreover, cOMe can be analyzed in a similar sensitive way [8].

Recently, endoRNases have been discovered that are blocked by methylation [87]. In MAZTER-seq [88] and m 6 A-REF-seq [89], the endoRNases MazF and ChpBK cut unmethylated RNA at ACA and UAC motifs, respectively, leaving m 6 A methylated RNA intact, resulting in transcriptome-wide m 6 A-dependent restriction profiles [88, 89]. Although this method requires little input RNA and is highly specific, the specificity for ACA restricts MAZTER-seq to ~16% of all m 6 A sites in mammals [88].
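The logic of a methylation-dependent restriction profile can be sketched as follows: locate every ACA motif in a transcript, then compare at each motif the number of reads that terminate there (cut) with the number that read through (uncut). The definition of cleavage efficiency used here, the function names, and the read counts are simplifying assumptions for illustration and not the published MAZTER-seq analysis.

```python
def find_motif_sites(seq, motif="ACA"):
    """Return 0-based start positions of every (possibly overlapping) motif occurrence."""
    seq = seq.upper()
    width = len(motif)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == motif]


def cleavage_efficiency(cut_reads, spanning_reads):
    """Fraction of reads cut at a motif; returns None when there is no coverage.

    A low value relative to an unmethylated control would be consistent
    with protection of the site by m6A (assumed interpretation).
    """
    total = cut_reads + spanning_reads
    return cut_reads / total if total else None


# Made-up RNA sequence with three ACA motifs, two of them overlapping.
sites = find_motif_sites("GGACAUUACACA")

# Hypothetical read counts at one site in a control and a test sample.
control_eff = cleavage_efficiency(cut_reads=90, spanning_reads=10)
sample_eff = cleavage_efficiency(cut_reads=20, spanning_reads=80)
```

Because only adenosines inside an ACA context can be interrogated this way, the motif scan also makes explicit why the method covers only a subset of m 6 A sites.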

Conversion of m 6 A to hm 6 A by FTO protein can be exploited because hm 6 A reacts with dithiothreitol, which allows selective chemical labeling (m 6 A-SEAL) [90]. This approach showed greater specificity and sensitivity than antibody-based (e.g., meRIP, miCLIP) and enzymatic methods (e.g., deamination adjacent to RNA modification targets, DART-seq) [90].

The APOBEC (apolipoprotein B mRNA editing enzyme) family of cytidine deaminases has been successfully used for transcriptome-wide mapping of m 6 A and hm 5 C. hm 5 C converted by T4 bacteriophage β-glucosyltransferase (T4-BGT) is protected from deamination by APOBEC3A, allowing mapping of hm 5 C in RNA (APOBEC-coupled epigenetic sequencing, ACE-seq) [91]. Similarly to hm 5 C, m 6 A can be mapped by using an APOBEC1-YTH fusion protein in which the m 6 A reader YTH targets m 6 A sites, and APOBEC1 deaminates adjacent cytidines (DART-seq) [92]. DART-seq was shown to identify ~60% of m 6 A meRIP targets, suggesting higher specificity than meRIP [92]. However, the efficiency of purified APOBEC1-YTH is limited in vitro, and exogenous expression of APOBEC1-YTH in cells is necessary, which could introduce errors owing to altered expression levels of the YTH RNA-binding protein [64, 92, 93]. A concept similar to DART-seq underlies m 6 A mapping with TRIBE (targets of RNA-binding proteins identified by editing). In TRIBE, the YTH (or METTL3) RNA-binding domains are fused to the catalytic domain of ADARs. Binding of this fusion protein to m 6 A sites then results in A to I editing by the ADAR catalytic domain in the vicinity [94]. TRIBE and DART-seq identify a subset of targets and also require a complementary DNA sequencing dataset to identify RNA editing events.

The limitations in the sensitivity and accuracy of transcriptome-wide mRNA modification mapping techniques make validation by an orthogonal method essential. The most reliable technique to validate the presence and methylation level of m 6 A is site-specific cleavage and radioactive labeling of the modified nucleotide followed by ligation-assisted extraction and thin-layer chromatography (SCARLET). Mapping of m 6 A is accomplished by chimeric oligonucleotides consisting of ribo- and deoxyribonucleotides to direct site-specific cleavage of mRNA transcripts by RNase H [51, 88, 95], but deoxyribozymes are a valid alternative to RNase H-directed cleavage [96, 97]. If the candidate site is not known at nucleotide resolution, in vitro transcribed substrate RNAs can be methylated in nuclear extracts and the modifications can be mapped by molecular methods [64, 84, 86, 98].

Direct RNA Sequencing by Nanopore

Oxford Nanopore Technologies™ (ONT) is a long-read sequencing platform that generates reads of up to several kilobases from single mRNA molecules in real time (Box 3) [99, 100]. In contrast to other long-read sequencing technologies (e.g., PacBio), ONT allows direct RNA sequencing (DRS) without prior amplification or cDNA conversion. Therefore, ONT has potential in the direct detection of RNA modifications, opening a completely new approach to the study of epitranscriptomic modifications, but accurate detection faces significant challenges.

Nucleotide modifications in ONT sequencing reads were first demonstrated on CpG-methylated DNA based on deviations of ONT raw current signals from a model of canonical nucleotides (see Box 4 for the bioinformatics of ONT basecalling) [101, 102]. CpG-methylation calling software is now an integrated feature of the ONT analysis software Nanopolish. This methylation caller is a hidden Markov model that was trained on synthetically methylated DNA to distinguish between methylated and unmethylated cytosines using raw current signals [101], and has been applied to the study of DNA methylation patterns in the human genome and epigenetic imprinting in mice [103, 104].

In 2018, ONT introduced a direct RNA sequencing protocol and showed raw signal differences between modified m6A, m5C, and unmodified bases using a synthetically designed firefly luciferase (FLuc) transcript with embedded modifications [105]. Subsequently, ONT DRS was applied to Escherichia coli full-length 16S rRNA, showing that ONT signals are sensitive to m7G and Ψ via corresponding errors in basecalling [106]. The first study applying ONT DRS to the human transcriptome showed that m6A could be detected within the methylation consensus motif (GGACU) in the human cell line GM12878. As a proof of concept, the study showed a raw ONT signal difference at a known m6A site (chr19:3,976,327, GRCh38 human genome reference) in eukaryotic elongation factor 2 (EEF2) transcripts compared with an in vitro transcribed unmodified

Box 3. Nanopore Sequencing

Oxford Nanopore Technologies™ (ONT) sequencers are available as small hand-held devices (MinION) and larger high-throughput sequencers (PromethION). The underlying technology is the same between sequencers, and the largest difference is sequencing output. ONT is a rapidly developing technology, and frequent improvements are released for sequencing reagents, sequencing flow cells, and bioinformatics. These technological updates have resulted in the rapid improvement of ONT sequencing error rates (Box 4). Although bioinformatic updates can be applied to previous datasets (Box 4), reagent and flow-cell updates might make it difficult to compare data from older and newer studies. ONT also relies heavily on user-developed solutions. For direct RNA sequencing specifically, ONT only offers sequencing library kits for mRNA or polyA RNA, and studies investigating other types of RNA (e.g., in vitro transcribed RNA, rRNA, tRNA) have developed their own custom adaptations [126, 127].

For direct mRNA sequencing, polyA mRNA is enriched with oligo-dT beads and a cDNA strand is synthesized using an oligo-dT primer (see Figure 2A in main text). The cDNA strand is not sequenced, but prevents RNA from forming secondary structures and protects against degradation by most RNases, and thus increases throughput. To prepare nucleic acids for nanopore sequencing, an RNA sequencing adapter (RMX) is ligated to the 3′ end. This adapter carries a 'motor protein' at the 3′ end and a 'tether' protein at the 5′ end. The motor protein moves the nucleic acid strand (~400 bases/s for DNA and ~70 bases/s for RNA) through the pore while the tether protein prevents the cDNA from passing through the pore and being sequenced.

The current nanopore (R9.4 or R9.5) consists of an engineered protein derived from Escherichia coli Curlin sigma S-dependent growth (CsgG) pore [128] embedded in an electrically resistant membrane made from a synthetic polymer [129-131] (see Figure 2B in main text). MinKNOW™, the software made by ONT to control the MinION device, performs several core tasks such as assignment of run parameters, data acquisition, and feedback on how the experiment is progressing. The membrane in a MinION flow cell holds 2048 individually addressable pores controlled in groups of 512 channels. MinKNOW assigns four pores per channel through mux scan (a process to check pore activity), thus allowing simultaneous sequencing of 512 molecules [131]. Nanopores are immersed in an ionic solution, and when voltage is applied an ionic current passes through the nanopore that is individually recorded by a sensor [131-133]. The sensor measures the current several thousand times per second, and the collected data are sent to MinKNOW [131, 133].

When a nucleic acid strand traverses the nanopore from one chamber to the other, the current changes in a characteristic way such that the nucleic acid sequence can be identified. The raw signals (current measurements) plotted over time are known as 'squiggle' plots (see Figure 2C in main text). Unlike fluorescence-based sequencing, where individual nucleotides are recognized, the measured signal is affected by multiple nucleotides that reside in the pore, which is modeled using 6 nt subsequences (k-mers) for DNA and 5 nt for RNA. These 5 nt subsequences from RNA result in 4^5 (= 1024) different signal configurations [101, 134-136]. These signal configurations are modeled using Gaussian distributions where the mean represents the level of the signal and the standard deviation represents the variation from measurement and intrinsic noise (see Figure 2D in main text).
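The k-mer signal model described above can be made concrete with a minimal sketch. It enumerates all RNA 5-mers and scores an observed current level against a per-k-mer Gaussian. The three (mean, standard deviation) pairs are hypothetical values in arbitrary units, not entries from any real pore model, and the function names are introduced here only for illustration.

```python
import itertools
import math

BASES = "ACGU"
K = 5  # RNA k-mer length used in the signal model described above

# Every possible RNA 5-mer; there are 4**5 of them.
ALL_KMERS = ["".join(p) for p in itertools.product(BASES, repeat=K)]


def gaussian_log_pdf(x, mean, sd):
    """Log density of a normal distribution with the given mean and sd."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (x - mean) ** 2 / (2 * sd ** 2)


def most_likely_kmer(current, model):
    """Return the k-mer whose Gaussian assigns the observed current the highest density."""
    return max(model, key=lambda kmer: gaussian_log_pdf(current, *model[kmer]))


# Hypothetical (mean, sd) pairs in arbitrary units; illustration only.
toy_model = {
    "CACCC": (95.0, 3.0),
    "GGACU": (110.0, 2.5),
    "AAAAA": (80.0, 4.0),
}

print(len(ALL_KMERS))
print(most_likely_kmer(108.2, toy_model))
```

A full model would hold one Gaussian for each of the 1024 k-mers, and a basecaller would combine these per-event scores across neighbouring, overlapping k-mers rather than deciding on each measurement in isolation.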

transcript. The study then identified 86 genes that showed signal differences at the same GGACU site between different transcripts of the same gene [107]. The ONT software package Tombo can also identify modified bases in RNA based on signal deviations. This has been used to call m5C methylation patterns in the viral RNA of human coronavirus [108]. These initial studies provided the first evidence that ONT DRS is sensitive to RNA modifications and thus might be used for transcriptome-wide mapping of epitranscriptomic marks.

Although direct RNA sequencing holds great promise for the identification of RNA modifications at single-base resolution, the challenge lies in the interpretation of raw signals corresponding to modified and unmodified base sequence contexts. Currently available computational algorithms to map RNA modifications using ONT DRS datasets are either based on modification-induced basecalling errors (differential error calling) or machine-learning models that look at differences in raw signal levels. Both approaches generally require matching low- and high-modification datasets, such as lack-of-function mutants for METTL3 [109, 110], in vitro synthesized RNA [105, 109, 111], or reverse transcribed RNA (cDNA) [112]. In addition, most studies use antibody-based data, known modified sites, and/or filtering for known motifs (such as the DRACH motif for m6A) to refine and/or validate results [109, 111-113].
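Motif filtering of this kind can be sketched in a few lines. The example below locates DRACH motifs (IUPAC codes D = A/G/U, R = A/G, H = A/C/U) and keeps only candidate positions that coincide with the central adenosine. The function names and the test sequence are hypothetical; the cited studies each implement their own filtering.

```python
import re

# D = A/G/U, R = A/G, H = A/C/U. The lookahead lets overlapping
# motif instances be reported as separate matches.
DRACH = re.compile(r"(?=([AGU][AG]AC[ACU]))")


def drach_sites(rna):
    """Return 0-based positions of the central A of every DRACH match."""
    rna = rna.upper().replace("T", "U")  # accept cDNA-style sequences too
    return [m.start() + 2 for m in DRACH.finditer(rna)]


def filter_candidates(candidates, rna):
    """Keep only candidate positions that coincide with a DRACH central A."""
    allowed = set(drach_sites(rna))
    return [pos for pos in candidates if pos in allowed]


seq = "CCGGACUAAUGAACAUU"  # hypothetical transcript fragment
print(drach_sites(seq))
print(filter_candidates([4, 7, 12], seq))
```

Restricting calls to a motif lowers the false-positive rate but, by construction, cannot recover modified sites that lie outside the motif.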

The first category of algorithms to call RNA modifications from ONT DRS reads operates on the assumption that RNA modifications cause a deviation of ONT signal that results in bases being miscalled. These modification-induced calling errors are used as a surrogate to map modified nucleotides on transcripts [112] . For instance, ELIGOS (epitranscriptional landscape inferring from glitches of ONT signals) works on the percentage error of specific bases (%ESB) between native RNA sequencing data and cDNA data for the same sample (because modifications are removed from cDNA), and has been used to identify rRNA methylation sites in E. coli, yeast, and human cells [112] . However, this approach is subject to differences arising from the different nanopore motors, sequencing directions, and basecalling models between RNA and cDNA.
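The differential error idea can be illustrated with a minimal sketch that compares per-position basecalling errors in native RNA against a cDNA control. This is not the ELIGOS implementation: the odds-ratio statistic, the pseudocount, the threshold, and all counts below are assumptions made for illustration.

```python
def percent_error(errors, total):
    """Percentage of reads whose base call disagrees with the reference at one position."""
    if total == 0:
        raise ValueError("no reads cover this position")
    return 100.0 * errors / total


def error_odds_ratio(native, control, pseudocount=0.5):
    """Odds ratio of errors in native RNA versus a control.

    Each argument is an (errors, matches) pair; the pseudocount avoids
    division by zero when a count is 0.
    """
    (ne, nm), (ce, cm) = native, control
    return ((ne + pseudocount) * (cm + pseudocount)) / (
        (nm + pseudocount) * (ce + pseudocount)
    )


def candidate_sites(native_counts, control_counts, min_odds_ratio=3.0):
    """Return positions where errors are enriched in native RNA relative to the control."""
    hits = []
    for pos in sorted(native_counts.keys() & control_counts.keys()):
        if error_odds_ratio(native_counts[pos], control_counts[pos]) >= min_odds_ratio:
            hits.append(pos)
    return hits


# Hypothetical per-position counts: position -> (errors, matches)
native = {101: (40, 60), 102: (5, 95), 103: (12, 88)}
cdna = {101: (4, 96), 102: (5, 95), 103: (9, 91)}
print(percent_error(40, 100))
print(candidate_sites(native, cdna))
```

A real analysis would replace the fixed threshold with a statistical test and multiple-testing correction, and would still be exposed to the RNA versus cDNA differences noted above.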

An elegant approach to identify m6A-induced ONT sequencing errors used matched sets of synthetic RNA molecules from MALAT1 lncRNA with and without m6A at known locations. This approach was then extrapolated to use the differences in error calling between wild-type, METTL3-defective mutant, and METTL3 genetic-rescue Arabidopsis to identify m6A sites

Box 4. Bioinformatics of Nanopore Sequencing

To obtain the sequences of the input molecules, the electrical signal must be translated into individual bases, a process called basecalling. MinKNOW has a built-in basecaller for 'live' basecalling during the sequencing run, or the user can turn this function off and instead basecall the raw files later.

Early versions of the sequencers from ONT used a basecaller that ran on the Metrichor™ cloud-based platform. This basecaller used a hidden Markov model (HMM), a probabilistic model for decoding an unknown sequence of 'states' based on an observed sequence. For nanopore basecalling the unknown states were k-mers of the DNA or RNA molecule, and the observed sequence was a segmentation of the raw signal. This approach to basecalling has been superseded by neural network-based approaches (reviewed in [137] ), as comprehensively benchmarked by Wick and colleagues [134] .
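The HMM decoding step can be sketched with a toy Viterbi decoder. The sketch assumes a two-letter alphabet, 2-mers as hidden states, exactly one event per k-mer step (no stays or skips), a shared standard deviation, and hypothetical mean current levels; it is not the Metrichor basecaller. Because every allowed transition is given the same probability, the transition term is a constant and is left out of the scores.

```python
import math

# Toy model: 2-mers over a two-letter alphabet with hypothetical mean currents.
MEANS = {"AA": 80.0, "AC": 90.0, "CA": 100.0, "CC": 110.0}
SD = 3.0  # shared standard deviation, an assumption of this sketch


def log_emission(x, state):
    """Gaussian log density of event level x under the given k-mer state."""
    return -0.5 * math.log(2 * math.pi * SD ** 2) - (x - MEANS[state]) ** 2 / (2 * SD ** 2)


def viterbi(events):
    """Return the most probable k-mer path for a list of event levels."""
    states = list(MEANS)
    score = {s: log_emission(events[0], s) for s in states}  # uniform start
    back = []
    for x in events[1:]:
        new_score, pointers = {}, {}
        for t in states:
            # A k-mer can only follow one that overlaps it by k-1 bases.
            preds = [s for s in states if s[1:] == t[:-1]]
            best = max(preds, key=lambda s: score[s])
            new_score[t] = score[best] + log_emission(x, t)
            pointers[t] = best
        score = new_score
        back.append(pointers)
    state = max(score, key=score.get)
    path = [state]
    for pointers in reversed(back):  # trace the best path backwards
        state = pointers[state]
        path.append(state)
    return path[::-1]


def path_to_sequence(path):
    """Collapse overlapping k-mers into a base sequence."""
    return path[0] + "".join(kmer[-1] for kmer in path[1:])


events = [81.2, 89.1, 108.7, 101.5, 79.4]  # hypothetical segmented signal
print(path_to_sequence(viterbi(events)))
```

The overlap constraint is what lets the decoder use context: an event that is ambiguous on its own can be resolved because only some k-mers are allowed to follow the previous one.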

The latest ONT basecaller, called Guppy, uses graphics processing units (GPUs) to accelerate this crucial signal-to-sequence step to facilitate high-throughput sequencing experiments. Guppy has two different basecalling modes: a 'fast' basecaller that aims to keep up with live data generation, and a 'high-accuracy' model that lowers the sequencing error rate at the expense of longer processing times. Guppy (V3.4.5) has a basecalling error rate of 4-6% for DNA and 7-12% for RNA in high-accuracy mode [134]. Guppy's latest version can also directly call modified bases in DNA (m5C and m6A), although only in limited contexts (CpG and CCWGG for m5C, and GATC for m6A). Both HMM and recurrent neural network (RNN) models depend on training signals, and the performance of a basecaller therefore depends on having a good training data set. With each update to the basecaller and further reduction in error rates, it is possible to re-basecall old datasets as long as the raw signal files are retained.

transcriptome-wide [109]. In the same system, m6A sites were further analyzed by miCLIP to show that 66% of called nanopore DRS differential error sites fall within miCLIP peaks [109]. A similar approach was taken to identify base-resolution and isoform-specific m6A sites in adenovirus RNA by comparing DRS errors in m6A wild-type and METTL3 knockout datasets [113].
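A comparison of this kind, the share of called sites that fall inside antibody-derived peaks, reduces to an interval lookup. The sketch below assumes non-overlapping half-open peaks on a single transcript or chromosome; the function name and all coordinates are hypothetical.

```python
import bisect


def fraction_in_peaks(sites, peaks):
    """Fraction of site positions that fall inside any [start, end) peak.

    Assumes all coordinates share one reference sequence and that peaks
    do not overlap each other.
    """
    if not sites:
        raise ValueError("no sites supplied")
    peaks = sorted(peaks)
    starts = [start for start, _ in peaks]
    inside = 0
    for pos in sites:
        i = bisect.bisect_right(starts, pos) - 1  # last peak starting at or before pos
        if i >= 0 and pos < peaks[i][1]:
            inside += 1
    return inside / len(sites)


# Hypothetical coordinates
peaks = [(100, 150), (400, 460), (900, 930)]
sites = [120, 149, 150, 410, 700, 929]
print(fraction_in_peaks(sites, peaks))
```

The resulting fraction depends strongly on peak width, so values obtained with different peak-calling settings are not directly comparable.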

Likewise, the algorithm EpiNano was trained on 100% methylated or unmethylated synthetic RNA sequences to predict m6A modifications based on sequencing errors [111]. EpiNano predicted m6A sites with 90% accuracy in synthetic reads and 87% accuracy in yeast in vivo, but prediction accuracy depends on the extent of methylation. The prediction accuracy of EpiNano was validated on DRS reads from a yeast METTL3 knockout strain lacking m6A modifications.

The second category of methods to identify RNA modifications is based on training machine-learning models using modifications identified by antibody-based methods. These maps are then used to develop modification-specific basecallers for direct RNA-sequencing data to predict modifications [110]. This approach was used to develop MINES (m6A identification using Nanopore sequencing), a random forest classifier trained on miCLIP m6A sites within DRACH motifs in HEK293T cells. MINES was able to predict m6A miCLIP modification sites with 80% accuracy and identified a significant set of previously unannotated sites. The authors validated MINES by showing the sensitivity of identified sites to METTL3 knockdown and ALKBH5 overexpression. Importantly, sites identified by miCLIP, but not predicted by MINES, did not show sensitivity to METTL3 or ALKBH5 manipulations, as would be expected from false-positive miCLIP sites [110].

It is worth noting that MINES extracts raw DRS signals using Tombo, which does not support spliced mapping and only allows assessment of sites in the 3′-UTRs of transcripts. Comparing the performance of Tombo with the error-based algorithm EpiNano revealed that ~50% of reads were discarded by Tombo and that EpiNano achieved higher accuracy than Tombo (87% vs 59%), but reduced sensitivity (32% vs 59%) [111].
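Accuracy and sensitivity can diverge in this way whenever unmodified positions greatly outnumber modified ones. A minimal sketch with hypothetical confusion-matrix counts (not the counts behind the figures quoted above) shows the effect:

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts: 100 truly modified sites among 1000 tested positions.
metrics = classification_metrics(tp=30, fp=20, tn=880, fn=70)
print(metrics)
```

Here most of the accuracy comes from correctly rejecting unmodified positions, so a caller can score well on accuracy while missing the majority of true sites.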

Nanocompore, another machine-learning algorithm, uses Nanopolish to align raw DRS signals to the reference genome, which allows spliced alignment of RNA reads [114]. Nanocompore runs an automated pipeline for data preprocessing including basecalling, alignment using Minimap2, and signal alignment using Nanopolish. It operates by comparing DRS signals between high- and low-modification samples to detect transcriptome-wide m6A modification events. The authors propose that, given an appropriate low-modification control, Nanocompore can be used to detect a large variety of different RNA modifications [114].
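The principle of comparing signals between two conditions can be sketched with a generic two-sample statistic. The example uses Welch's t statistic on per-read mean currents at each position; this is a stand-in chosen for brevity and should not be read as the statistical model that Nanocompore applies. The threshold and all values are hypothetical.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    mean_a, mean_b = statistics.fmean(sample_a), statistics.fmean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    se = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / se


def shifted_positions(high_mod, low_mod, min_abs_t=4.0):
    """Positions whose per-read mean currents differ between the two conditions."""
    return [
        pos
        for pos in sorted(high_mod.keys() & low_mod.keys())
        if abs(welch_t(high_mod[pos], low_mod[pos])) >= min_abs_t
    ]


# Hypothetical per-read mean currents (arbitrary units) at two transcript positions.
wild_type = {
    57: [101.2, 103.5, 102.8, 104.1, 102.0, 103.3],
    58: [88.9, 90.4, 89.7, 91.0, 90.2, 89.5],
}
knockout = {
    57: [97.1, 98.4, 96.8, 98.9, 97.5, 98.0],
    58: [89.3, 90.1, 90.6, 89.0, 90.8, 89.9],
}
print(shifted_positions(wild_type, knockout))
```

Because the comparison is agnostic to the identity of the modification, the same scheme applies to any mark for which a low-modification control exists, which is the property the authors highlight.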

The methods mentioned here demonstrate that ONT sequencing harbors the potential to revolutionize the transcriptome-wide prediction of RNA modifications. However, each of the available solutions currently has limitations. Differential error algorithms require matched low- and high-modification samples with identical genetic backgrounds because differences in DNA sequence (e.g., SNPs) will result in errors that are unrelated to RNA modifications. This limits the current use of these algorithms to systems where a low-modification control can be obtained on the exact genetic background of the sample of interest.

The prediction of RNA modifications in ONT sequencing is further complicated by the fact that each sequencing signal originates from a group (k-mer) of 5 nt rather than from a single nucleotide. Moreover, the signal for each k-mer is variable, leading to a distribution of possible signals (Figure 2D). RNA modifications can result in a shift in the signal distribution for a given k-mer, and this shift can be used to predict the presence of modifications. However, these shifts can be relatively small, and the distribution of signals generated by modified k-mers largely overlaps with the distribution of unmodified k-mers (Figure 2D).

Figure 2. Nanopore Sequencing. (A) Schematic of the library preparation procedure for Nanopore direct RNA sequencing. PolyA RNA is enriched using oligo-dT primers and a reverse transcription (RT) adaptor is ligated. After second-strand synthesis, the sequencing adapter RMX, which is preloaded with motor protein and tether protein, is then ligated. (B) Schematic of Nanopore direct RNA sequencing. The motor protein feeds the RNA molecule through the nanopore in the 3′-5′ direction. The five bases passing through a nanopore cause a characteristic disruption in the current which is stored as raw signal. (C) A current trace (squiggle plot) showing the raw signal generated by nanopore sequencing of a single mRNA molecule. Leader and adapter sequences are shaded yellow and pink, the polyA tail is shaded green, and the mRNA body is shaded orange. The inset (top right) illustrates how the nucleotide sequence is inferred from the raw current trace originating from a sliding window of five nucleotides (k-mer). Machine-learning algorithms are then used to calculate the probability that a signal corresponds to a given k-mer, thus inferring the nucleotide sequence from the calculated probabilities. (D) The two features recorded by Oxford Nanopore Technologies (ONT) sequencers are the current signal (in arbitrary units, AU) and the time that a given k-mer takes to traverse the pore (signal length, retention time or 'dwell'). The scatter plot depicts the distribution of mean current and signal length for 100 reads each in a different sequence context of the unmodified k-mer CACCC (blue) and the modified k-mer CAm5CCC (orange, identified by parallel bisulfite sequencing, where m5C is 5-methylcytosine). Note that, despite an identical k-mer, the signal varies as a result of different measurements and intrinsic noise in different reads, and possibly also by the different surrounding sequence of a given k-mer. This variability can be represented as a signal density plot for each k-mer, depicted in the top-right inset (density distribution for raw current signal). RNA modifications can affect raw current reads as well as signal length, resulting in a shift in signal distributions (e.g., divergence between blue and orange). However, these signal shifts can be modest, as shown by the largely overlapping density plots for CACCC and CAm5CCC, making accurate prediction of modified bases a computational challenge. Plots were generated with Sequoia, an interactive visual analytics platform for interpretation and feature extraction from ONT sequencing datasets [138].
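The consequence of largely overlapping signal distributions (Figure 2D) can be quantified with a minimal sketch. It models the current of a modified and an unmodified k-mer as two Gaussians with hypothetical parameters, measures their overlapping area, and derives the best accuracy achievable when a single read is classified from its current alone, assuming both classes are equally frequent.

```python
import math
from statistics import NormalDist


def best_single_read_accuracy(a, b):
    """Best achievable accuracy when labelling one read from its current alone.

    Assumes both classes are equally frequent; the Bayes error is then
    half of the overlapping area of the two densities.
    """
    return 1.0 - a.overlap(b) / 2.0


def mean_of_reads(dist, n):
    """Distribution of the mean of n independent reads drawn from dist."""
    return NormalDist(dist.mean, dist.stdev / math.sqrt(n))


# Hypothetical current distributions (arbitrary units) for one k-mer.
unmodified = NormalDist(mu=100.0, sigma=4.0)
modified = NormalDist(mu=103.0, sigma=4.0)

single_overlap = unmodified.overlap(modified)
pooled_overlap = mean_of_reads(unmodified, 25).overlap(mean_of_reads(modified, 25))

print(round(single_overlap, 3), round(best_single_read_accuracy(unmodified, modified), 3))
print(round(pooled_overlap, 3))
```

With these assumed values a single read is classified only modestly better than chance, whereas averaging 25 reads separates the two conditions far more clearly. The pooled comparison presumes that all reads at a site share the same modification state, which does not hold for partially modified sites.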

Modification prediction platforms that employ machine-learning models are limited by the quality of their training datasets (miCLIP or similar). Currently >160 RNA modifications have been reported in the literature [115]. Generating training models that encompass these diverse modifications will require very large and complex training datasets.

The emerging field of epitranscriptomics has brought global insights into the topography of nucleotide modifications in mRNA and their impact on gene expression. The prevalent links of RNA modifications to lifestyle diseases such as obesity, as well as to many neurological diseases and cancer, have made RNA modification enzymes prime targets for the pharmaceutical industry in employing their classic repertoire to develop inhibitors. To better understand the biological roles of RNA modifications, and how to pharmacologically interfere with them, accurate epitranscriptomic maps are required.

Recent technical advances at many levels have increased the catalog of modifications in mRNA and promise to overcome technical challenges to detect mRNA modifications with high confidence to address key questions about their role in the regulation of gene expression. However, it is important to consider the inherent limitations of antibody- and chemical conversion-based methods to detect modifications that are generally not abundant. The reliability of results can be improved by larger sample sizes (most studies currently only report 2-3 replicates), increased sequencing depth [70], and by considering variability due to different reaction conditions, antibodies, and enzymes [116]. A major improvement in obtaining higher accuracy in mapping modifications is to include a knockout of the writer and verification of specific targets by orthogonal methods.

One big open question concerns whether the variability observed in global levels of mRNA modifications reflects limitations in current methods, cell-to-cell variability, or cell type-specific features (see Outstanding Questions). In addition, current methods can only assess a single type of modification at a time, but it is likely that a combination of modifications is present on a given transcript. Thus, to truly understand how mRNA modifications regulate gene expression, it will be important to develop methodologies to determine the combinations of modifications that are present in individual transcripts and from small amounts of input such as individual cells.

Single-molecule sequencing by ONT now promises the technological advances to eventually measure low and dynamic changes in mRNA modifications. Moreover, ONT will permit combinations of modifications to be assessed on single transcripts in the near future. To address key questions about the function of modifications, single-cell RNA sequencing will inevitably be required. Because direct RNA sequencing by ONT excludes cDNA amplification steps, sequencing mRNAs from single cells presents a particular challenge, but solutions to this conundrum could include using smaller devices and more replicates.

For ONT to become the leading technology, further improvements at multiple levels are required, with a particular focus on the major challenge in computational interpretation of sequencing signals. Several machine-learning algorithms have recently been developed to predict modifications from ONT data. These algorithms are generally random forest models that predict modified sites based on a combination of RIP-seq or ONT training data, but they also incorporate sequence motifs and other sequence features that have been developed from RIP-seq data [117-120]. The main limitation here is that the ability of computational algorithms to predict modifications relies on the quality of the training data. High false-positive rates in RIP-seq data and high rates of sequencing errors in ONT data thus limit the accuracy of such algorithms.

Given rapid technological progress, we anticipate soon being able to more comprehensively understand how the modification of nucleotides is dynamically instructed to impact on mRNA processing, and how its misregulation results in human disease.

Outstanding Questions

Which nucleotides are modified in mRNA?

What fraction of mRNAs carry modified nucleotides?

Why do only a fraction of mRNAs carry a specific nucleotide modification?

Are mRNA modifications dynamic?

Do mRNAs carry multiple different modifications, and is there a modification code?

How do mRNA modifications contribute to development and cause disease?

Distribution and frequencies of post-transcriptional modifications in tRNAs

Tuning the ribosome: the influence of rRNA modification on eukaryotic ribosome biogenesis and function

RNA modifications in gene expression control

mRNA cap regulation in mammalian cell function and fate

Quantifying the RNA cap epitranscriptome reveals novel caps in cellular and viral RNA

Identification of NAD + capped mRNAs in Saccharomyces cerevisiae

Cap-specific, terminal N6-methylation by a mammalian m6Am methyltransferase

A novel synthesis and detection method for cap-associated adenosine modifications in mouse mRNA

Cap-specific terminal N6-methylation of RNA by an RNA polymerase II-associated methyltransferase

PCIF1 catalyzes m6Am mRNA methylation to regulate gene expression

Identification of the m6Am methyltransferase PCIF1 reveals the location and functions of m6Am in the transcriptome

Mass spectrometry of mRNA cap 4 from trypanosomatids reveals two novel nucleosides

Reversible methylation of m6Am in the 5′ cap controls mRNA stability

tRNA modification profiles of the fast-proliferating cancer cells

Widespread occurrence of 5-methylcytosine in human coding and non-coding RNA

Pseudouridine profiling reveals regulated mRNA pseudouridylation in yeast and human cells

Chemical pulldown reveals dynamic pseudouridylation of the mammalian transcriptome

Transcriptome-wide mapping of pseudouridines: pseudouridine synthases modify specific mRNAs in S. cerevisiae

Transcriptome-wide mapping reveals widespread dynamic-regulated pseudouridylation of ncRNA and mRNA

Transcriptome-wide mapping reveals reversible and dynamic N1-methyladenosine methylome

The dynamic N1-methyladenosine methylome in eukaryotic messenger RNA

Acetylation of cytidine in mRNA promotes translation efficiency

Transcriptome-wide distribution and function of RNA hydroxymethylcytosine

Three distinct 3-methylcytidine (m3C) methyltransferases modify tRNA and mRNA in mice and humans

The APOBEC protein family: united by structure, divergent in function

Transcriptome-wide mapping of internal N7-methylguanosine methylome in mammalian mRNA

Existence of internal N7-methylguanosine modification in mRNA determined by differential enzyme treatment coupled with mass spectrometry analysis

Nm-seq maps 2'-O-methylation sites in human mRNA with base precision

Transcriptional mutagenesis mediated by 8-oxoG induces translational errors in mammalian cells

m6A in mRNA: an ancient mechanism for fine-tuning gene expression

The m6A writer: rise of a machine for growing tasks

Nucleotide modifications in messenger RNA and their role in development and disease

RNA modifications modulate gene expression during development

Where, when, and how: context-dependent functions of RNA methylation writers, readers, and erasers

Reading, writing and erasing mRNA methylation

Inosine induces context-dependent recoding and translational stalling

The m1A landscape on cytosolic and mitochondrial mRNA at single-base resolution

The m6A methylase complex and mRNA export

ALKBH1-mediated tRNA demethylation regulates translation

Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq

The Role of m6A/m-RNA methylation in stress response regulation

Dynamic methylome of internal mRNA N7-methylguanosine and its regulatory role in translation

APOBEC3A cytidine deaminase induces RNA editing in monocytes and macrophages

The RNA-editing enzyme ADAR1 controls innate immune responses to RNA

The RNA modification N6-methyladenosine as a novel regulator of the immune system

A conserved histidine in the RNA sensor RIG-I controls immune tolerance to N1-2'O-methylated self RNA

Structural basis for m7G recognition and 2'-O-methyl discrimination in capped RNAs by the innate immune receptor RIG-I

Structure of human IFIT1 with capped RNA reveals adaptable mRNA binding and mechanisms for sensing N1 and N2 ribose 2'-O methylations

Human IFIT3 modulates IFIT1 RNA binding specificity and protein stability

Human IFIT1 inhibits mRNA translation of Rubulaviruses but not other members of the Paramyxoviridae family

Probing N6-methyladenosine RNA modification status at single nucleotide resolution in mRNA and long noncoding RNA

m6A-LAIC-seq reveals the census and complexity of the m6A epitranscriptome

Liquid chromatography-mass spectrometry for analysis of RNA adenosine methylation

Detection of ribonucleoside modifications by liquid chromatography coupled with mass spectrometry

CAP-MAP: cap analysis protocol with minimal analyte processing, a rapid and sensitive approach to analysing mRNA cap structures

Statistically robust methylation calling for whole-transcriptome bisulfite sequencing reveals distinct methylation patterns for mouse RNAs

Oligonucleotide sequence mapping of large therapeutic mRNAs via parallel ribonuclease digestions and LC-MS/MS

tRNA modification profiles and codon-decoding strategies in Methanocaldococcus jannaschii

METTL1 promotes let-7 microRNA processing via m7G methylation

Detection of internal N7-methylguanosine (m7G) RNA modifications by mutational profiling sequencing

High-resolution N6-methyladenosine (m6A) map using photo-crosslinking-assisted m6A sequencing

Single-nucleotide-resolution mapping of m6A and m6Am throughout the transcriptome

Base-resolution mapping reveals distinct m1A methylome in nuclear- and mitochondrial-encoded transcripts

m6A potentiates Sxl alternative pre-mRNA splicing for robust Drosophila sex determination

Comprehensive analysis of mRNA methylation reveals enrichment in 3' UTRs and near stop codons

Limits in the detection of m6A changes using MeRIP/m6A-seq

High-resolution mapping reveals a conserved, widespread, dynamic mRNA methylation program in yeast meiosis

A metabolic labeling method detects m6A transcriptome-wide at single base resolution

m1A within cytoplasmic mRNAs at single nucleotide resolution: a reconciled transcriptome-wide map

Antibody cross-reactivity accounts for widespread appearance of m1A in 5'UTRs

AlkAniline-seq: profiling of m7G and m3C RNA modifications at single nucleotide resolution

Dynamic landscape and regulation of RNA editing in mammals

5-methylcytosine promotes mRNA export - NSUN2 as the methyltransferase and ALYREF as an m5C reader

Distinct 5-methylcytosine profiles in poly(A) RNA from mouse embryonic stem cells and brain

Characterizing 5-methylcytosine in the mammalian epitranscriptome

Identification of direct targets and modified bases of RNA cytosine methyltransferases

NSun2-mediated cytosine-5 methylation of vault noncoding RNA determines its processing into regulatory small RNAs

High-throughput and site-specific identification of 2'-O-methylation sites using ribose oxidation sequencing (RibOxi-seq)

Illumina-based RiboMethSeq approach for mapping of 2'-O-Me residues in RNA

High-throughput single-base resolution mapping of RNA 2'-O-methylated residues

TRUB1 is the predominant pseudouridine synthase acting on mammalian mRNA via a predictable and conserved code

Mobilities of modified ribonucleotides on two-dimensional cellulose thin-layer chromatography

RNA nucleotide methylation

Mapping of N6-methyladenosine residues in bovine prolactin mRNA

Precise localization of m6A in Rous sarcoma virus RNA reveals clustering of methylation sites: implications for RNA processing

Sequence specificity of the human mRNA N6-adenosine methylase in vitro

Detection of N6-methyladenosine based on the methyl-sensitivity of MazF RNA endonuclease

Deciphering the 'm6A code' via antibody-independent quantitative profiling

Single-base mapping of m6A by an antibody-independent method

Antibody-free enzyme-assisted chemical approach for detection of N6-methyladenosine

Nondestructive, base-resolution sequencing of 5-hydroxymethylcytosine using a DNA deaminase

DART-seq: an antibody-free method for global m6A detection

Concentration and localization of coexpressed ELAV/Hu proteins control specificity of mRNA processing

Identification of methylated transcripts using the TRIBE approach

Probing N6-methyladenosine (m6A) RNA modification in total RNA with SCARLET

N6-methyladenosine-sensitive RNA-cleaving deoxyribozymes

Deoxyribozyme-based method for absolute quantification of N6-methyladenosine fractions at specific sites of RNA

An in vitro system for accurate methylation of internal adenosine residues in messenger RNA

Molecular bases of cyclodextrin adapter interactions with engineered protein nanopores

Multiple base-recognition sites in a biological nanopore: two heads are better than one

Detecting DNA cytosine methylation using nanopore sequencing

Mapping DNA methylation with high-throughput nanopore sequencing

Nanopore sequencing and assembly of a human genome with ultra-long reads

Using long-read sequencing to detect imprinted DNA methylation

Highly parallel direct RNA sequencing on an array of nanopores

Reading canonical and modified nucleobases in 16S ribosomal RNA using nanopore native RNA sequencing

Nanopore native RNA sequencing of a human poly(A) transcriptome

Direct RNA nanopore sequencing of full-length coronavirus genomes provides novel insights into structural variants and enables modification analysis

Nanopore direct RNA sequencing maps the complexity of Arabidopsis mRNA processing and m6A modification

Direct RNA sequencing enables m6A detection in endogenous transcript isoforms at base specific resolution

Accurate detection of m6A RNA modifications in native RNA sequences

Decoding the epitranscriptional landscape from native RNA sequences

Direct RNA sequencing reveals m6A modifications on adenovirus RNA are necessary for efficient splicing

RNA modifications detection by comparative Nanopore direct RNA sequencing

MODOMICS: a database of RNA modification pathways. 2017 update

Machine learning of reverse transcription signatures of variegated polymerases allows mapping and discrimination of methylated purines in limited transcriptomes

SRAMP: prediction of mammalian N6-methyladenosine (m6A) sites based on sequence-derived features

PACES: prediction of N4-acetylcytidine (ac4C) modification sites in mRNA

RNAm5Cfinder: a web-server for predicting RNA 5-methylcytosine (m5C) sites based on random forest

NmSEER V2.0: a prediction tool for 2'-O-methylation sites based on random forest and multi-encoding combination

Ribonucleic acids from yeast which contain a fifth nucleotide

RNA 2'-O-methylation (Nm) modification in human diseases

mRNA structure determines modification by pseudouridine synthase 1

N6-methyladenosine in nuclear RNA is a major substrate of the obesity-associated FTO

FTO controls reversible m6Am RNA methylation during snRNA biogenesis

Direct RNA sequencing of the coding complete influenza A virus genome

Targeted nanopore sequencing with Cas9-guided adapter ligation

Structural and mechanistic insights into the bacterial amyloid secretion channel CsgG

Engineered transmembrane pores

Studies of RNA sequence and structure using nanopores

MinION analysis and reference consortium: phase 1 data release and analysis

The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community

Three decades of nanopore sequencing

Performance of neural network basecalling tools for Oxford Nanopore sequencing

Nanopore sequencing: from imagination to reality

The power spectrum of ionic nanopore currents: the role of ion correlations

From squiggle to basepair: computational approaches for improving nanopore sequencing read accuracy

Sequoia: an interactive visual analytics platform for interpretation and feature extraction from nanopore sequencing datasets

Identification of YTH domain-containing proteins as the readers for N1-methyladenosine in RNA

Mutations in cytosine-5 tRNA methyltransferases impact mobile element expression and genome stability at specific DNA repeats

Tet-mediated formation of 5-hydroxymethylcytosine in RNA

RNA 5-methylcytosine facilitates the maternal-to-zygotic transition by preventing maternal mRNA decay

5-methylcytosine promotes pathogenesis of bladder cancer through stabilizing mRNAs

Drosophila YBX1 homolog YPS promotes ovarian germ line stem cell development by preferentially recognizing 5-methylcytosine RNAs

m5C methylation guides systemic transport of messenger RNA over graft junctions in plants

AlkB homologue 1 demethylates N3-methylcytidine in mRNA of mammals

FTSJ3 is an RNA 2'-O-methyltransferase recruited by HIV to avoid innate immune sensing

A-to-I editing of coding and non-coding RNAs by ADARs

Transcriptome-wide mapping of N6-methyladenosine by m6A-seq based on immunocapturing and massively parallel sequencing

Transcriptome-wide mapping of 5-methylcytidine RNA modifications in bacteria, archaea, and yeast reveals m5C within archaeal mRNAs

We apologize to colleagues whose work was not cited owing to space limitations. We thank I. Haussmann and R. Arnold for comments on the manuscript. We thank R. Koonchanok 
