Streamlining differential exon and 3' UTR usage with diffUTR


Streamlining differential exon and 3’ UTR usage with diffUTR

Stefan Gerber1,2, Gerhard Schratt2 & Pierre-Luc Germain1,3,4,*

1Group of Computational Neurogenomics, D-HEST Institute for Neurosciences, ETH Zürich

2Lab of Systems Neuroscience, D-HEST Institute for Neurosciences, ETH Zürich

3Lab of Statistical Bioinformatics, DMLS, University of Zürich

4SIB Swiss Institute of Bioinformatics

*Correspondence to Pierre-Luc Germain (pierre-luc.germain@hest.ethz.ch)

Abstract

Background: Despite the importance of alternative poly-adenylation and 3’ UTR length for a1

variety of biological phenomena, there are limited means of detecting UTR changes from standard2

transcriptomic data.3

Results: We present the diffUTR Bioconductor package which streamlines and improves upon4

differential exon usage (DEU) analyses, and leverages existing DEU tools and alternative poly-5

adenylation site databases to enable differential 3’ UTR usage analysis. We demonstrate the6

diffUTR features and show that it is more flexible and more accurate than state-of-the-art alter-7

natives, both in simulations and in real data.8

Conclusions: diffUTR enables differential 3’ UTR analysis and more generally facilitates DEU9

and the exploration of their results.10

.CC-BY-ND 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 

The copyright holder for this preprintthis version posted February 13, 2021. ; https://doi.org/10.1101/2021.02.12.430963doi: bioRxiv preprint 

https://doi.org/10.1101/2021.02.12.430963
http://creativecommons.org/licenses/by-nd/4.0/


Background11

Coding sequences in eukaryotic mRNAs are generally flanked by transcribed but untranslated12

regions (UTRs) which can impact RNA stability, translation, and localization [1]. In particular, the13

length of 3’ UTRs often varies even within a given gene due to the use of different poly-adenylation14

(polyA) sites [2], leading especially to the inclusion or not of regulatory elements such as binding15

sites for microRNAs (miRNAs) or RNA-binding proteins [3]. Alternative poly-adenylation (APA)16

is highly prevalent in mammals [4] and has been shown to be important to a variety of biological17

phenomena [5,6,7,8].18

A number of methods for 3’ end sequencing have been developed with the goal to map APA19

sites [9,10,11,12,13,4,14], leading to the development of atlases such as PolyASite [15] or PolyA DB20

[16]. As such methods are only marginally used, however, it would be beneficial to leverage21

the widespread availability of traditional RNA-seq for the purpose of identifying changes in 3’22

UTR usage. A chief difficulty here is that most UTR variants are not catalogued in standard23

transcript annotations, limiting the utility of standard transcript-level quantification based on24

reference transcripts, such as salmon [17]. Nevertheless, a number of methods have been developed25

to this purpose. Methods like DaPars [18] and APAtrap [19] try to infer new polyA sites from read26

coverage changes from RNA-seq experiments, however the depletion of RNAseq coverage at the 3’27

end of transcripts makes the precise inference of polyA sites challenging [20]. Other tools like QAPA28

[8] and APAlyzer [21] use already available polyA site databases but only compare the usage of the29

most proximal polyA sites to distal ones in a pairwise fashion and fail to grasp the full complexity30

of dynamic APA when there are three or more polyA sites, which is the case for approximately half31

of mammalian transcripts [4]. Furthermore they do not make use of the already proven statistical32

frameworks to analyse different exon usage (DEU) from count data [22,23,24,25]. These tools take33

into account the inherent properties of read count distributions and are arguably more appropriate34

to analyse differences in relative polyA site usage, which is conceptually highly similar to DEU. We35

therefore developed diffUTR, which streamlines and improves upon well established DEU tools,36

and leverages them, along with polyA site databases, to infer alternative 3’ UTR usage across37

conditions.38

1

.CC-BY-ND 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 

The copyright holder for this preprintthis version posted February 13, 2021. ; https://doi.org/10.1101/2021.02.12.430963doi: bioRxiv preprint 

https://doi.org/10.1101/2021.02.12.430963
http://creativecommons.org/licenses/by-nd/4.0/


Results39

Streamlining differential bin/exon usage analysis40

Popular bin-based DEU methods are provided by the limma [25,24], edgeR [23] and DEXSeq [22]41

packages. However, their usage is not straightforward for non-experienced users, and their results42

often difficult to interpret. We therefore developed a simple workflow (Figure 1A), usable with any43

of the three methods but standardizing inputs and outputs. In particular, bin annotation and quan-44

tification, as well as different usage results, are all stored in a RangedSummarizedExperiment45

[26], which facilitates data storage and exploration, and enables advanced plotting functions irre-46

spective of the underlying method. diffUTR is flexible in its application, and supports the use of47

strand information if available.48

Transcript
annotation
GRanges /

EnsDb /
.gtf

polyA sites
GRanges /

.bed

prepareBins

countFeatures D
E

U
w

ra
p

p
e

rs

DEXseq
diffSplice2

edgeR

Bins
(GRanges)

Ranged
Summarized
Experiment

bam files

Plotting
functions

(Rsubread)

UTRCDS

Transcript
annotation

+
polyA
sites

Bins

A

B

Figure 1: Overview. A: diffUTR workflow. Bins are prepared from various types of gene anno-
tations as well as, optionally, additional APA-driven segmentation and extension, then read counts
within bins as well as bin information are stored in a standardized RangedSummarizedExperiment,
which can then be used as an input for any of the three DEU methods, producing again a stan-
dardized output that can be used with the package’s plotting functions. B: Schematic of bin
preparation. APA sites are used to further segment and extend disjoined gene bins.

Improvement to diffSplice49

diffUTR also implements an improved version of limma’s diffSplice method which does not50

assume constant residual variance across bins of the same gene (see diffSplice2). To test the effect51

2

.CC-BY-ND 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 

The copyright holder for this preprintthis version posted February 13, 2021. ; https://doi.org/10.1101/2021.02.12.430963doi: bioRxiv preprint 

https://doi.org/10.1101/2021.02.12.430963
http://creativecommons.org/licenses/by-nd/4.0/


of these modifications in a standard DEU setting, we ran both versions (as well as the other two52

DEU methods) on simulated data from a previous DEU benchmark [27]. The precision and recall53

results (Figure 2A) confirmed the previously observed superiority of DEXSeq and, more generally,54

the imperfect false discovery rate (FDR) control. Importantly, it also confirmed that our improved55

diffSplice2 method outperforms the original, at no additional computing cost.56

0.00

0.25

0.50

0.75

1.00

0.00 0.25 0.50 0.75

FDR

T
P

R

Differential exon usage
A

0.00

0.25

0.50

0.75

1.00

0.00 0.25 0.50 0.75 1.00

FDR

T
P

R

method

APAlyzer

APAlyzer2

DaPars

DEXSeq

diffSplice

diffSplice2

edgeR

QAPA.dPau

QAPA.pval

Differential UTR usage
B

d
iff
U
T
R

Figure 2: FDR and recall (TPR) on simulated data. A: In the classical DEU context. B: In
the differential UTR usage context. The dashed line indicates a real False Discovery Rate (FDR)
of 5%, and the dots indicate nominal FDRs of 10, 5 and 1%. diffUTR methods far outperform
QAPA and DaPars. In both contexts, our modifications to diffSplice significantly improve its
performance.

Application to differential UTR usage and benchmark on a simulation57

We next sought to evaluate the methods when applied for differential UTR analysis. For this58

purpose, APA sites are used to further segment and extend UTR bins, as illustrated in Figure 1B59

(see methods for the details). Given the absence of RNAseq data with a differential UTR usage60

ground truth, we simulated reads with known UTR differences from real data (see Simulated61

Data). We then ran the different diffUTR methods (as well as the unmodified diffSplice62

variant), and compared them to alternative methods. While DaPars and APAlyzer provide gene-63

level significance testing, QAPA does not, and our attempts to use its equivalence classes with64

standard transcript usage methods (see methods) gave very poor results. Therefore, for the65

purpose of comparison we tried two alternatives: simply ranked genes according to QAPA’s main66

output, i.e. the absolute difference in polyA site usage between conditions (|∆PAU|), labeled in67

2B as QAPA.dPau, or running t -tests on the log-transformed PAU values, labeled as QAPA.qval.68

3

.CC-BY-ND 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 

The copyright holder for this preprintthis version posted February 13, 2021. ; https://doi.org/10.1101/2021.02.12.430963doi: bioRxiv preprint 

https://doi.org/10.1101/2021.02.12.430963
http://creativecommons.org/licenses/by-nd/4.0/


Since APAlyzer produces different analyses for genes’ 3’ end and intronic APA usage, we used69

both the 3’ end results and a combination of the two (the latter shown as APAlyzer2 ). As Figure70

2B shows, all diffUTR methods outperformed alternatives by far. On this test, our improved71

diffSplice2 had comparable performance to DEXSeq, at a fraction of the computing costs.72

Differential UTR usage in real data73

We next sought to test diffUTR in real data. First, since 3’ UTRs are known to generally lengthen74

during neuronal differentiation [28,8], we expected to observe a skew towards positive fold changes75

of 3’ UTR bins when comparing RNAseq experiments from embryonic stem cells (ESC) and ESC-76

derived neurons. We therefore re-analyzed data from [29] and observed clearly the expected skew77

among statistically-significant genes, especially for bins with a higher expression (Figure 3A).78

We next found both 3’ sequencing and standard RNAseq data from samples of mouse hip-79

pocampal slices undergoing Forskolin-induced long-term potentiation [30], which enabled us to use80

the 3’ sequencing data as a truth for analysis performed on the standard RNAseq data (Figure81

3B and Supplementary Figure 1). In this case we represent the results through Receiver-operator82

characteristic (ROC) curves since the Precision-recall curves make the differences less visible due83

to the lower general power. Although power to detect UTR changes is necessarily low with respect84

to 3’ sequencing, we again observed that diffUTR methods clearly outperformed all alternative85

methods.86

Exploring differential exon/UTR usage results87

diffUTR provides three main plot types to explore differential bin usage analyses, each with a88

number of variations. Figure 4 showcases them in the context of long-term potentiation of mouse89

hippocampal neurons [30]. plotTopGenes (Figure 4A) provides gene-level statistic plots (similar90

to a ‘volcano’ plot), which come in two variations. For standard DEU analysis, absolute bin-level91

coefficients are weighted by significance and averaged to produce gene-level estimates of effect92

sizes. For differential 3’ UTR usage, where bins are expected to have consistent directions (i.e.93

lengthening or shortening of the UTR) and where their size is expected to have a strong impact on94

biological function, the signed bin-level coefficients are weighted both by size and significance to95

produce gene-level estimates of effect sizes. By default, the size of the points reflects the relative96

expression of the genes, and the color the relative expression of the significant bins with respect97

to the gene.98

4

.CC-BY-ND 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 

The copyright holder for this preprintthis version posted February 13, 2021. ; https://doi.org/10.1101/2021.02.12.430963doi: bioRxiv preprint 

https://doi.org/10.1101/2021.02.12.430963
http://creativecommons.org/licenses/by-nd/4.0/


A

0.25

0.50

0.75

1.00

0.25 0.50 0.75 1.00

FPR

T
P

R

method

APAlyzer

DaPars

DEXSeq

diffSplice2

QAPA.pval

B

−5 0 5

0
1

2
3

4
5

6

Bin log2(foldchange)

B
in

 m
e
a
n
 l
o
g
(C

P
M

)

d
iff
U
T
R

3' UTR lengthening

Figure 3: Differential UTR analysis on real data. A:. 3’ UTR lengthening during neuronal
differentiation. Plotted are the UTR bins found statistically significant (bin- and gene-level FDR
both ¡ 0.1) by diffUTR (diffSplice2) when comparing in vitro differentiated neurons to mouse
embryonic stem cells. The color indicates the point density. The clear skew towards a positive bin-
level foldchange (indicative, in most cases, of a UTR lengthening), especially for bins with a higher
mean count (CPM=counts per million reads sequenced). B: Receiver-operator characteristic
(ROC) curves of differential UTR usage analysis on the LTP dataset, using 3’ sequencing to
establish the ground truth. The axes are square-root-transformed to improve visibility, and only a
subset of method variations are shown (see Supplementary Figure 1 for all variants).

deuBinPlot (Figure 4B) provides bin-level statistic plots for a given gene, similar to those99

produced by DEXSeq and limma, but offering more flexibility. They can be plotted as overall100

bin statistics, per condition, or per sample, and can display various types of values. Importantly,101

since all data and annotation are contained in the object, these can easily be included in the plots.102

Figure 4B shows a lengthening of the Jund 3’ UTR in the LTP group.103

Finally, geneBinHeatmap (Figure 4C) provides a compact, bin-per-sample heatmap represen-104

tation of a gene, allowing the simultaneous visualization of various information. We found these105

representations particularly useful to prioritize candidates from differential bin usage analyses. For106

example, many genes show differential usage of bins which are generally not included in most107

transcripts of that gene (low count density), and are therefore less likely to be relevant.108

Further variations tested109

During implementation, we tested other changes to the method which were ultimately discarded110

as they did not improve performance, but which we here briefly report.111

First, differential UTR analysis differs from typical differential exon usage analysis in that the112

vast majority of UTR bins are consecutively transcribed, meaning that changes in the usage of a113

5

.CC-BY-ND 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 

The copyright holder for this preprintthis version posted February 13, 2021. ; https://doi.org/10.1101/2021.02.12.430963doi: bioRxiv preprint 

https://doi.org/10.1101/2021.02.12.430963
http://creativecommons.org/licenses/by-nd/4.0/


Smg6

Ntrk2

Homer1

Nr4a2

Slc2a1

Txndc11Stmn4Rheb

FosbRnf217 Dio2

Nr4a3

Syt4Scg2Hmgcs1

Plk2Rbbp7

Nfkb1 Clta

Frmd6
Arid5a

Eprs

Lmna

Slc1a3
Grem2

0

10

20

30

40

0.5 1.0 1.5 2.0

Weighted absolute coefficient

−
lo

g
1

0
(q

.v
a

lu
e

)

geneMeanDensity

0.00

0.25

0.50

0.75

−0.2

0.0

0.2

density.ratio

A

0

1

2

3

sqrt−scaled genomic location

type

UTR

CDS

condition

CTRL

LTP

JundB

Scaled

S
m

g
6

 b
in

s

ty
p

e

lo
g

W
id

th

m
e

a
n

L
o

g
D

e
n

s
it
y

lo
g

1
0

P
V

a
lu

e

logcpm

condition

type

CDS/UTR
CDS
UTR/3UTR
UTR
3UTR
non−coding

log10PValue

0

10

20

30

40

50

condition

CTRL
LTP

scaled
logcpm

−2

−1

0

1

2

logNormDensity

0

0.05

0.1

0.15

0.2

0.25

C

logCPM

lo
g
(C
P
M
)

Figure 4: Plotting functions. A: plotTopGenes provides significance and effect size statistics
aggregated at the gene level. B: deuBinPlot provides a more flexible version of the bin-level gene
plots generated by common DEU packages. Shown here is the upregulation of Jund 3’ UTR upon
LTP. C: geneBinHeatmap provides a compact, bin-per-sample heatmap representation of a gene.

6

.CC-BY-ND 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 

The copyright holder for this preprintthis version posted February 13, 2021. ; https://doi.org/10.1101/2021.02.12.430963doi: bioRxiv preprint 

https://doi.org/10.1101/2021.02.12.430963
http://creativecommons.org/licenses/by-nd/4.0/


bin should also be visible in downstream bins. We therefore reasoned that it would be beneficial to114

use this property to improve statistical analysis. We reasoned that connected bins with significant115

fold changes in the same direction could be unified and their p-values aggregated, and tested a116

rudimentary implementation using Fisher’s aggregation. However, this decreased accuracy and led117

to a worse FDR control (Supplementary Figure 2).118

Second, most methods compare bin-level foldchanges to gene-level ones to identify bins be-119

having differently from the others, and we reasoned that, especially for genes with more UTR bins120

than CDS bins, including counts of 3’ UTR when calculating overall gene expression could under-121

estimate the gene expression and possibly mistake the UTR foldchange for the gene foldchange.122

We therefore tried a modification of diffSplice to only calculate the gene foldchange from coding123

sequence (CDS) bins and then compare it to the individual bins. Again, this approach proved124

unsuccessful (Supplementary Figure 3).125

Discussion126

diffUTR streamlines DEU analysis and outperforms alternative methods in inferring UTR changes,127

which demonstrates the utility of harnessing powerful, well-established frameworks for new ends.128

It must be noted that the way in which the simulation was performed, i.e. elongating transcripts129

to the next polyA site(s), is similar to the way diffUTR disjoins the annotation into bins, which130

could cause a bias towards this method (as well as QAPA and APAlyzer, which also makes use of131

alternative polyA sites). However, this is unlikely to be the reason for the observed superiority of132

diffUTR -based methods given the considerable extent by which they outperformed alternatives,133

and the observation of similar results in real data.134

Similar to DEU tools [27], diffUTR fails to control the FDR correctly, and our attempts so far135

to improve this remained unsuccessful. We therefore recommend prudence with results close to136

the significance threshold. In addition, and in contrast to DEU where exons are subject to splicing137

in a potentially independent fashion, 3’ UTRs typically do not undergo splicing and therefore only138

differ in length between conditions. This means that the behavior of a UTR bin is dependent on139

that of upstream bins, a property which could be exploited to improve accuracy at the gene-level.140

However, our simple attempt to do so by combining p-values of consecutive bins did not have the141

desired outcome, pointing to the need of more research in this direction.142

Further, the bin-based approach has the drawback of not pinpointing the exact UTR locations:143

7

.CC-BY-ND 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 

The copyright holder for this preprintthis version posted February 13, 2021. ; https://doi.org/10.1101/2021.02.12.430963doi: bioRxiv preprint 

https://doi.org/10.1101/2021.02.12.430963
http://creativecommons.org/licenses/by-nd/4.0/


it is limited to the bin resolution, and the bins themselves are limited by incomplete transcript144

and APA annotations. Additionally, because there is a significant drop off in read coverage at the145

end of transcripts, we have observed that it is often bins upstream of the actual UTR lengthen-146

ing/shortening event which give a statistically-significant signal rather than the one truly affected.147

This is why we have provided tools to enable the further inspection of events in a given gene.148

Finally, the results of bin-based analyses are limited by the overlaps of transcripts from different149

genes, an issue on which differential transcript usage analysis approaches appear superior (e.g.150

[31]). However, transcript usage analysis tools are dependent on the completeness of the transcript151

annotation, while bin-based approaches are more open to the discovery of unannotated transcript152

variants, which is especially relevant for differential UTR usage. Here, we made the choice of153

including ambiguous bins, but flagging them as such, enabling users to interpret them with caution.154

While DEXSeq remains the tool of predilection for relative bin usage analyses, it scales very badly155

to larger sample sizes, and alternatives might be needed in some contexts. Our changes to156

limma’s original diffSplice method consistently result in more accurate predictions, making157

this new method the best compromise for bin-based approaches when DEXSeq is not applicable.158

More generally, it also shows that even with well-established approaches, there is still room for159

incremental, but non-negligible improvement.160

Methods161

0.1 Data and code availability162

The data objects and code used to produce the figures are available through the https://163

github.com/plger/diffUTR_paper repository. The diffUTR source code is available at https:164

//github.com/ETHZ-INS/diffUTR.165

0.2 RNAseq data processing166

For the evaluation of diffSplice2 in a standard DEU case, we used bin count data obtained167

from the authors of the original DEU benchmark [27]. For other datasets, reads were downloaded168

from the SRA, aligned to the GRCm38.p6 genome using STAR 2.7.3a with default parameters169

and the GENCODE M25 annotation as guide. The same gene annotation was used as input for170

bin creation.171

8

.CC-BY-ND 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 

The copyright holder for this preprintthis version posted February 13, 2021. ; https://doi.org/10.1101/2021.02.12.430963doi: bioRxiv preprint 

https://github.com/plger/diffUTR_paper
https://github.com/plger/diffUTR_paper
https://github.com/plger/diffUTR_paper
https://github.com/ETHZ-INS/diffUTR
https://github.com/ETHZ-INS/diffUTR
https://github.com/ETHZ-INS/diffUTR
https://doi.org/10.1101/2021.02.12.430963
http://creativecommons.org/licenses/by-nd/4.0/


0.3 diffUTR172

diffUTR is implemented as a Bioconductor package making use of the extensive libraries avail-173

able, especially the GenomicRanges package [32] and the different DEU methods (see Differential174

analysis).175

0.3.1 Preparing bins176

Exons are extracted from the genome annotation and flattened into non-overlapping bins (Figure177

1B). In other words, the exon annotation is fragmented into the widest ranges where the set of178

overlapping features is the same. Bins that do not overlap with coding sequences (CDS) and179

belong to a protein coding transcript are labeled as UTR and the rest as CDS. When APA sites180

are also provided as input (for the purpose of this article, polyAsite v2.0 sites were used), bins are181

further segmented and/or extended. For this the closest upstream CDS or UTR is found for every182

poly(A) site and the UTR is defined from this boundary to the polyA site and assigned to the183

corresponding gene and transcript (Figure 1B). If the newly defined UTRs exceeds a predefined184

length specified by maxUTRbinSize (default is 15000bp), it is ignored as unlikely to be a real185

UTR. Moreover, if the start of a gene is the closest upstream sequence before any UTR or CDS186

the newly defined UTR is ignored to avoid assignment problems. In order to later differentiate187

between regions that are 3’ or 5’ UTRs, regions that are downstream of the last CDS of a given188

transcript were labeled as 3’ UTR. The label ‘non-coding’ is assigned to all bins that have no189

protein coding transcript overlapping it.190

If a bin originates from regions belonging to different genes, the bin is duplicated and as-191

signed once to each gene, so that each gene contains the same fragment once. Alternatively, the192

genewise argument can be used so that only exons belonging to the same gene are considered193

when flattening.194

0.3.2 Quantification195

For quantification, countFeatures() uses the featureCounts() function from the Rsubread196

package [33] to count previously mapped reads overlapping each bin. By default every read is197

assigned once to every bin it overlaps with and can therefore be counted multiple times, which is198

needed because many bins are shorter than the read length. Alternative counting methods, such as199

summarizeOverlaps() from the GenomicAlignments package [32] performed considerably worse200

9

.CC-BY-ND 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 

The copyright holder for this preprintthis version posted February 13, 2021. ; https://doi.org/10.1101/2021.02.12.430963doi: bioRxiv preprint 

https://doi.org/10.1101/2021.02.12.430963
http://creativecommons.org/licenses/by-nd/4.0/


in the simulation. The function returns a RangedSummarizedExperiment object [26], containing201

the read counts as well as the bin annotation.202

0.3.3 Differential analysis203

Three wrappers implement corresponding DEU methods on the204

RangedSummarizedExperiment object previously generated, returning results as further stan-205

dardized annotation within the object. For differential UTR analysis, gene-level results are ob-206

tained by filtering the bin-level results for those assigned to the type UTR and/or 3’ UTR, and207

setting all other p-values to 1 before aggregation.208

diffSpliceDGE.wrapper() This is a wrapper around edgeR ’s DEU method based on fitting a209

negative binomial generalized linear model [23]. In a first step the bins are filtered to decide which210

have a large enough read count to be kept for the statistical analysis (filterByExpr()), the library211

sizes are normalized (calcNormFactors()) and the dispersion is estimated (estimateDisp()).212

After this the model is fitted (glmFit()). If the option QLF = TRUE (default), an extended model213

is fitted, using quasi-likelihood methods to account for gene specific variability (glmQLFit()).214

In the last step bin fold changes are tested to be different from overall gene fold changes,215

using a likelihood ratio test or a quasi-likelihood F-Test depending on the QLF option chosen216

(diffSpliceDGE()). The gene level p-values are obtained by the Simes’ method [34].217

DEXseq.wrapper() In this method the standard DEXseq differential exon usage pipeline [22] is218

implemented. It is similarly to edgeR based on fitting a negative binomial model but instead of219

comparing fold change differences between bins and genes, DEXseq compares a full model con-220

taining a term corresponding to the change in exon usage between conditions to a reduced model221

without this term. The two fits are compared using a χ2 likelihood-ratio test. The libraries are nor-222

malized (estimateSizeFactor()), the dispersion is estimated (estimateDispersion() and the223

models are fitted (testForDEU()). In a last step the fold changes between the bins are estimated224

( estimateExonFoldChanges()). To obtain gene level results the function perGeneQValue()225

is used, which is based on the Šidák method [35].226

diffSplice.wrapper() and diffSplice2 This method implements the differential exon usage pipeline227

of limma for RNA-seq data [25]. The pre-processing is identical to diffSpliceDGE.wrapper(),228

then the precision weights are estimated with (limma::voom()) and the linear models are fitted229

10

.CC-BY-ND 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 

The copyright holder for this preprintthis version posted February 13, 2021. ; https://doi.org/10.1101/2021.02.12.430963doi: bioRxiv preprint 

https://doi.org/10.1101/2021.02.12.430963
http://creativecommons.org/licenses/by-nd/4.0/


(limma::lmFit()). In the last step, bin fold changes are tested to be different from overall230

gene fold changes, using a moderated t-test (diffSplice() or, by default, diffSplice2() – see231

below). The gene level p-values are obtained by the Simes’ method [34].232

The diffUTR::diffSplice2 function provides an improved version of limma’s original233

diffSplice method. diffSplice works on the bin-wise coefficient of the linear model which234

corresponds to the log2 fold changes between conditions. It compares the log2(fold change) β̂k,g235

of a bin k belonging to gene g, to a weighted average of log2(fold change) of all the other bins236

of the same gene combined B̂k,g (the subscript g will be henceforth omitted for ease of reading).237

The weighted average of all the other bins in the same gene is calculated by238

B̂k =

∑N
i,i6=k wiβ̂i∑N
i,i6=k wi

(1)

where wi =
1
u2i

and ui refers to the diagonal elements of the unscaled covariance matrix (X
T V X)−1.239

X is the design matrix and V corresponds to the weight matrix estimated by voom. The difference240

of log2 fold changes, which is also the coefficient returned by diffSplice() is then calculated241

by Ĉk = β̂k − B̂k. Instead of calculating the t-statistic with Ĉk, this value is scaled again in the242

original code:243

D̂k = Ĉk

√
1 −

wk∑N
i wi

(2)

and the t -statistic is calculated as:244

tk =
D̂k
uksg

(3)

s2g refers to the posterior residual variance of gene g, which is calculated by averaging the245

sample values of the residual variances of all the bins in the gene, and then squeezing these residual246

variances of all genes using empirical Bayes method. This assumes that the residual variance is247

constant across all bins of the same gene.248

In diffSplice2(), we applied three changes to the above method. First, the residual249

variances are not assumed to be constant across all bins of the same gene. This results in the250

sample values of the residual variances of every bin now being squeezed using empirical Bayes251

method, resulting in posterior variances s2i for every individual bin i. Second, the weights wi, used252

to calculate B̂k, now incorporate the individual variances by wi =
1

s2i u
2
i

. Third, the Ĉk value is253

11

.CC-BY-ND 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 

The copyright holder for this preprintthis version posted February 13, 2021. ; https://doi.org/10.1101/2021.02.12.430963doi: bioRxiv preprint 

https://doi.org/10.1101/2021.02.12.430963
http://creativecommons.org/licenses/by-nd/4.0/


directly used to calculate the t -statistic, which after all these changes now corresponds to254

tk =
Ĉk
uksi

. (4)

0.4 Simulated Data255

The simulation was done using the Polyester R package [36] using parameters obtained from the256

control samples of mouse hippocampus RNAseq [30]. Using salmon [17] with a decoy-aware tran-257

scriptome index for the mm10 genome from [37], the abundances for each transcript were first esti-258

mated to learn parameters for the simulation. 1000 transcripts from different genes were randomly259

chosen. The last exon of all these transcripts was lengthened to the next, second next or third next260

downstream APA site annotated in the polyAsite database [15]. Duplicates of these transcripts were261

generated, which had less or no lengthening of their last exon, generating pairs of transcripts with262

different UTR lengths. For each transcript pair, one transcript was up and the other one down reg-263

ulated by the same sampled fold change between 1.3 and 5. To make it more realistic, fold changes264

were also assigned to 300 genes from the set with differential UTR, and 300 genes that did not have265

differences in UTR usage. Reads were then generated for two conditions with three replicates each266

using the simulate experiment() function with the options paired = FALSE, error model =267

"illumina5", bias = "cdnaf" and strand specific = TRUE. The simulated reads are avail-268

able on figshare at https://dx.doi.org/10.6084/m9.figshare.13726143.269

0.5 3’-seq analysis270

To establish a set of true relative differences in UTR usage from the 3’ sequencing data [30], we271

downloaded the authors’ counts per cluster from the Gene Expression Omnibus (file272

GSE84643 3READS count table.txt.gz). We used the 3h treatment because we observed it273

to have the strongest signal, and excluded one sample (A6) that appeared like a strong outlier274

based on PCA and MDS plots. We kept only clusters with at least 50 reads in at least 2 samples,275

and used DEXSeq to fit a negative binomial on each gene and estimate the significance of the276

cluster:condition term. We considered as true positives genes with a gene-level and bin-level277

q-value ≤ 0.1, and true negatives genes with a gene-level q-value ≥ 0.8. Genes for which all278

tested methods produced a p-value of 1 or NA (i.e. genes filtered out as too lowly expressed in279

the standard RNAseq) were excluded for the benchmark.280

12

.CC-BY-ND 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 

The copyright holder for this preprintthis version posted February 13, 2021. ; https://doi.org/10.1101/2021.02.12.430963doi: bioRxiv preprint 

https://dx.doi.org/10.6084/m9.figshare.13726143
https://doi.org/10.1101/2021.02.12.430963
http://creativecommons.org/licenses/by-nd/4.0/


0.6 Comparisons with alternatives281

For the comparison of methods, all functions were used with their default parameters and run282

according to their manual. As QAPA and DaPars do not provide means to aggregate the results283

to the gene level this was implemented separately. For DaPars the p-values were aggregated to284

the gene level by using Simes’ method [34] for comparability with diffUTR. Aggregation by taking285

the minimum p-value of all the transcripts in a gene produced extremely similar results. For QAPA286

|∆PAU| was calculated and aggregated to a gene level by taking the maximum from all transcripts287

of a gene and the genes were ranked by this value. Alternatively, we also tested applying a t -test288

on the log-transformed PAU values (log-transforming had a negligible effect), followed by Simes’289

gene-level aggregation. Attempts to complement QAPA with p-values estimated from established290

statistical tests working with its equivalence classes, such as BANDITS [31], did not improve the291

results and were therefore discarded so as not to distort the original method. Finally, for APAlyzer2292

we combined the 3’ UTR and intronic APA analyses by using the minimum of the two p-values.293

See the https://github.com/plger/diffUTR_paper repository for details.294

We used the following software versions for comparisons: Polyester 1.24.0, DEXSeq 1.34.0,295

edgeR 3.30.0, limma 3.44.0, DaPars 0.9.1, APAlyzer 1.5.5. For QAPA, we used salmon 1.3.0296

with validateMappings.297

Competing interests298

The authors declare no competing interests beside being the developers of the described package.299

Author’s contributions300

SG developed the bin preparation and the diffSplice modification, and ran most of the analyses.301

PLG and SG wrote the package and paper. PLG and GS supervised the project.302

Acknowledgements303

SG performed this research as part of his bachelor thesis in the Interdisciplinary Sciences program at304

ETH. PLG’s position is co-funded by Prof. Mark Robinson (Institute of Molecular Life Sciences,305

University of Zurich) and Professors Gerhard Schratt, Johannes Bohacek and Isabelle Mansuy306

(Institute of Neuroscience, ETH Zurich). GS is supported by grants from the SNF (SNF 179651,307

13

.CC-BY-ND 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 

The copyright holder for this preprintthis version posted February 13, 2021. ; https://doi.org/10.1101/2021.02.12.430963doi: bioRxiv preprint 

https://github.com/plger/diffUTR_paper
https://doi.org/10.1101/2021.02.12.430963
http://creativecommons.org/licenses/by-nd/4.0/


SNF 189486) and the ETH (ETH-24 18-2 (NeuroSno)). We thank the Robinson group (UZH) for308

feedback.309

References310

1. Lewis, J. D., Gunderson, S. I. & Mattaj, I. W. The influence of 5′ and 3′ end structures on pre-mRNA311

metabolism. Journal of Cell Science. issn: 00219533 (1995).312

2. Tian, B. & Manley, J. L. Alternative polyadenylation of mRNA precursors. Nature Reviews Molecular Cell313

Biology. issn: 14710080 (2016).314

3. Fabian, M. R., Sonenberg, N. & Filipowicz, W. Regulation of mRNA translation and stability by microRNAs.315

Annual Review of Biochemistry. issn: 00664154 (2010).316

4. Derti, A. et al. A quantitative atlas of polyadenylation in five mammals. Genome Research. issn: 10889051317

(2012).318

5. Sandberg, R., Neilson, J. R., Sarma, A., Sharp, P. A. & Burge, C. B. Proliferating cells express mRNAs with319

shortened 3′ untranslated regions and fewer microRNA target sites. Science. issn: 00368075 (2008).320

6. Mayr, C. & Bartel, D. P. Widespread Shortening of 3UTRs by Alternative Cleavage and Polyadenylation321

Activates Oncogenes in Cancer Cells. Cell. issn: 00928674 (2009).322

7. Miura, P., Shenker, S., Andreu-Agullo, C., Westholm, J. O. & Lai, E. C. Widespread and extensive lengthening323

of 39 UTRs in the mammalian brain. Genome Research. issn: 10889051 (2013).324

8. Ha, K. C., Blencowe, B. J. & Morris, Q. QAPA: A new method for the systematic analysis of alternative325

polyadenylation from RNA-seq data. Genome Biology. issn: 1474760X (2018).326

9. Fox-Walsh, K., Davis-Turak, J., Zhou, Y., Li, H. & Fu, X. D. A multiplex RNA-seq strategy to profile poly(A327

+) RNA: Application to analysis of transcription response and 3′ end formation. Genomics. issn: 08887543328

(2011).329

10. Fu, Y. et al. Differential genome-wide profiling of tandem 3 UTRs among human breast cancer and normal330

cells by high-throughput sequencing. Genome Research. issn: 10889051 (2011).331

11. Zheng, D., Liu, X. & Tian, B. 3READS+, a sensitive and accurate method for 3 end sequencing of polyadeny-332

lated RNA. RNA. issn: 14699001 (2016).333

12. Jan, C. H., Friedman, R. C., Ruby, J. G. & Bartel, D. P. Formation, regulation and evolution of Caenorhabditis334

elegans 3′UTRs. Nature. issn: 00280836 (2011).335

13. Shepard, P. J. et al. Complex and dynamic landscape of RNA polyadenylation revealed by PAS-Seq. RNA.336

issn: 13558382 (2011).337

14. Hwang, H. W. et al. cTag-PAPERCLIP Reveals Alternative Polyadenylation Promotes Cell-Type Specific338

Protein Diversity and Shifts Araf Isoforms with Microglia Activation. Neuron. issn: 10974199 (2017).339

15. Herrmann, C. J. et al. PolyASite 2.0: A consolidated atlas of polyadenylation sites from 3 end sequencing.340

Nucleic Acids Research. issn: 13624962 (2020).341

14

.CC-BY-ND 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 

The copyright holder for this preprintthis version posted February 13, 2021. ; https://doi.org/10.1101/2021.02.12.430963doi: bioRxiv preprint 

https://doi.org/10.1101/2021.02.12.430963
http://creativecommons.org/licenses/by-nd/4.0/


16. Wang, R., Nambiar, R., Zheng, D. & Tian, B. PolyA DB 3 catalogs cleavage and polyadenylation sites342

identified by deep sequencing in multiple genomes. Nucleic Acids Research 46, D315–D319. issn: 0305-1048.343

https://doi.org/10.1093/nar/gkx1000 (2021) (Jan. 2018).344

17. Patro, R., Duggal, G., Love, M. I., Irizarry, R. A. & Kingsford, C. Salmon provides fast and bias-aware345

quantification of transcript expression. en. Nature Methods 14. Number: 4 Publisher: Nature Publishing346

Group, 417–419. issn: 1548-7105. https://www.nature.com/articles/nmeth.4197 (2021) (Apr. 2017).347

18. Xia, Z. et al. Dynamic analyses of alternative polyadenylation from RNA-seq reveal a 3′-UTR landscape across348

seven tumour types. Nature Communications. issn: 20411723 (2014).349

19. Ye, C., Long, Y., Ji, G., Li, Q. Q. & Wu, X. APAtrap: Identification and quantification of alternative polyadeny-350

lation sites from RNA-seq data. Bioinformatics. issn: 14602059 (2018).351

20. Wang, Z., Gerstein, M. & Snyder, M. RNA-Seq: A revolutionary tool for transcriptomics. Nature Reviews352

Genetics. issn: 14710056 (2009).353

21. Wang, R. & Tian, B. APAlyzer: a bioinformatics package for analysis of alternative polyadenylation isoforms.354

Bioinformatics (Oxford, England). issn: 13674811 (2020).355

22. Anders, S., Reyes, A. & Huber, W. Detecting differential usage of exons from RNA-seq data. Genome Research.356

issn: 10889051 (2012).357

23. Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: A Bioconductor package for differential expression358

analysis of digital gene expression data. Bioinformatics. issn: 14602059 (2009).359

24. Law, C. W., Chen, Y., Shi, W. & Smyth, G. K. Voom: Precision weights unlock linear model analysis tools360

for RNA-seq read counts. Genome Biology. issn: 1474760X (2014).361

25. Ritchie, M. E. et al. Limma powers differential expression analyses for RNA-sequencing and microarray studies.362

Nucleic Acids Research. issn: 13624962 (2015).363

26. Morgan, M., Obenchain, V., Hester, J. & Pagès, H. SummarizedExperiment: SummarizedExperiment con-364

tainer. R package version 1.12.0 (2018).365

27. Soneson, C., Matthes, K. L., Nowicka, M., Law, C. W. & Robinson, M. D. Isoform prefiltering improves perfor-366

mance of count-based methods for analysis of differential transcript usage. Genome Biology. issn: 1474760X367

(2016).368

28. Blair, J. D., Hockemeyer, D., Doudna, J. A., Bateup, H. S. & Floor, S. N. Widespread Translational Remodeling369

during Human Neuronal Differentiation. Cell Reports. issn: 22111247 (2017).370

29. Whipple, A. J. et al. Imprinted Maternally Expressed microRNAs Antagonize Paternally Driven Gene Programs371

in Neurons. English. Molecular Cell 78. Publisher: Elsevier, 85–95.e8. issn: 1097-2765. https://www.cell.372

com/molecular-cell/abstract/S1097-2765(20)30041-1 (2021) (Apr. 2020).373

30. Fontes, M. M. et al. Activity-Dependent Regulation of Alternative Cleavage and Polyadenylation during Hip-374

pocampal Long-Term Potentiation. Scientific Reports. issn: 20452322 (2017).375

31. Tiberi, S. & Robinson, M. D. BANDITS: Bayesian differential splicing accounting for sample-to-sample vari-376

ability and mapping uncertainty. Genome Biology. issn: 1474760X (2020).377

15

.CC-BY-ND 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 

The copyright holder for this preprintthis version posted February 13, 2021. ; https://doi.org/10.1101/2021.02.12.430963doi: bioRxiv preprint 

https://doi.org/10.1093/nar/gkx1000
https://www.nature.com/articles/nmeth.4197
https://www.cell.com/molecular-cell/abstract/S1097-2765(20)30041-1
https://www.cell.com/molecular-cell/abstract/S1097-2765(20)30041-1
https://www.cell.com/molecular-cell/abstract/S1097-2765(20)30041-1
https://doi.org/10.1101/2021.02.12.430963
http://creativecommons.org/licenses/by-nd/4.0/


32. Lawrence, M. et al. Software for Computing and Annotating Genomic Ranges. PLoS Computational Biology.378

issn: 1553734X (2013).379

33. Liao, Y., Smyth, G. K. & Shi, W. FeatureCounts: An efficient general purpose program for assigning sequence380

reads to genomic features. Bioinformatics. issn: 14602059 (2014).381

34. Simes, R. J. An improved bonferroni procedure for multiple tests of significance. Biometrika. issn: 00063444382

(1986).383

35. Šidák, Z. Rectangular Confidence Regions for the Means of Multivariate Normal Distributions. Journal of the384

American Statistical Association. issn: 1537274X (1967).385

36. Frazee, A. C., Jaffe, A. E., Langmead, B. & Leek, J. T. Polyester: Simulating RNA-seq datasets with differential386

transcript expression. Bioinformatics. issn: 14602059 (2015).387

37. Stolarczyk, M., Reuter, V. P., Smith, J. P., Magee, N. E. & Sheffield, N. C. Refgenie: a reference genome388

resource manager. GigaScience. issn: 2047217X (2020).389

16

.CC-BY-ND 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 

The copyright holder for this preprintthis version posted February 13, 2021. ; https://doi.org/10.1101/2021.02.12.430963doi: bioRxiv preprint 

https://doi.org/10.1101/2021.02.12.430963
http://creativecommons.org/licenses/by-nd/4.0/

	Data and code availability
	RNAseq data processing
	diffUTR
	Preparing bins
	Quantification
	Differential analysis

	Simulated Data
	3'-seq analysis
	Comparisons with alternatives