dynUGENE: an R package for uncertainty-aware gene regulatory network inference, simulation, and visualization


dynUGENE: an R package for
uncertainty-aware gene regulatory network

inference, simulation, and visualization
Tianyu Lu1,2� and Anjali Silva2,3,4

1 Department of Computer Science, University of Toronto, Toronto, Canada
2 Department of Cell and Systems Biology, University of Toronto, Toronto, Canada
3 Princess Margaret Cancer Centre, University Health Network, Toronto, Canada

4 Vector Institute, Toronto, Canada

Methods for gene regulatory network inference focus on net-
work architecture identification but neglect model selection and
simulation. We implement an extension to the dynGENIE3 al-
gorithm that accounts for model uncertainty as an R package,
providing users with an easy to use interface for model selection
and gene expression profile simulation. Source code is avail-
able at https://github.com/tianyu-lu/dynUGENE with a detailed
user guide. A webserver with interactive controls is available at
https://tianyulu.shinyapps.io/dynUGENE/.

Gene regulatory network | network inference

Correspondence: tianyu.lu @mail.utoronto.ca

Introduction

Complex phenomena such as cell development and apopto-
sis emerge from coordinated dynamics of gene regulatory
networks (GRN). Inferring network structure from data can
be used for hypothesis generation, revealing mechanisms in
cell development and disease (Huang et al., 2009), and mod-
elling network evolution (Crombach and Hogeweg, 2008).
Accurate dynamical models allow us to predict the effects
of network perturbations on biological function, for example
to push cells out of a disease state (Karlebach and Shamir,
2010), or to design synthetic GRNs given the desired dynam-
ics of a network (Hiscock, 2019). The ideal model should be
flexible enough to capture highly nonlinear interactions while
not sacrificing model interpretability and computation time.

We present dynUGENE (dynamical Uncertainty-aware GEne
NEwork inference), an R package that extends the functional-
ity of dynGENIE3, a state-of-the-art method for GRN infer-
ence (Geurts et al., 2018). We build on dynGENIE3 because
it satisfies all three of our model desiderata. Existing exten-
sions include TIMEOR and BENIN which both incorporate
heterogeneous data to improve network inference accuracy
(Wonkap and Butler, 2020; Conard et al., 2020). Here, we
take a different approach and instead account for uncertainty
in dynGENIE3, allowing for stochastic gene expression sim-
ulations and parsimonious model selection. Our extension is
available as an easy to use R package and also as an interac-
tive web server.

Package Design
dynGENIE3 Background. dynGENIE3 poses GRN infer-
ence as a feature selection problem. It first trains random
forests to predict the change in concentration of each species
given the current concentrations of all species. Each interac-
tion from species xi to species xj is associated with an im-
portance score, calculated by the reduction in variance from
using xi to predict the change in xj. The importance score for
an interaction, when normalized, is interpreted as the proba-
bility of that interaction to exist. For a detailed treatment, see
the vignette and (Geurts et al., 2018).

Model Selection. The inferred network can be visualized as
a p×p matrix where the entry [xi,xj] is the importance score
of xi for inferring xj (Fig. 1). However, real GRNs are of-
ten not fully connected and the presence of an interaction is
binary (Mangan et al., 2016). To address this, dynUGENE
includes a function for model selection based on visualizing
the Pareto front (Mangan et al., 2016). However, we note
that the model at the sharp drop in the Pareto front is not al-
ways the best model (Supplementary Fig. S1). We include
an additional function on the web server where users can
choose which interactions to mask. The masked networks
can then be simulated, allowing for application-specific tun-
ing of model complexity.

Model Simulation. The inferred networks and masked net-
works can be used to simulate gene expression profiles by
numerically solving the system of ordinary differential equa-
tions learned by the random forests. In addition to determin-
istic simulations, we provide an option that accounts for the
uncertainty in the random forests predictions for stochastic
simulations. For stochastic simulations, instead of only tak-
ing the mean of a random forest’s predictions, we sample
from the Gaussian N(µ,σ2) where µ is the mean and σ2 is
the variance of the random forest’s predictions.

Provided Datasets. The dynUGENE package provides four
example time-series datasets: repressilator, stochastic re-
pressilator, Hodgkin-Huxley, and stochastic Hodgkin-Huxley
(Elowitz and Leibler, 2000; Hodgkin and Huxley, 1952).
These datasets were generated from systems of ordinary or

Lu et al. | bioRχiv | January 7, 2021 | 1–3

.CC-BY 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 

The copyright holder for this preprintthis version posted January 8, 2021. ; https://doi.org/10.1101/2021.01.07.425782doi: bioRxiv preprint 

https://github.com/tianyu-lu/dynUGENE
https://tianyulu.shinyapps.io/dynUGENE/
https://doi.org/10.1101/2021.01.07.425782
http://creativecommons.org/licenses/by/4.0/


Fig. 1: Bottom: inferred importance scores on the repressilator dataset for the
16th network in the step-wise column masks plot (Supplementary Fig. S2). Top:
Simulated trajectory using the inferred network.

stochastic differential equations. Details are provided in the
vignette. The package also includes one steady state dataset,
SynTReN300, taken from GRNdata (Bellot et al., 2020).
Users can provide their own data as input following the for-
mat specified in ?inferNetwork.

Discussion
A requirement for dynGENIE3 and dynUGENE is that all
species must be tracked through time. This requirement is
difficult to satisfy in practice as there are often unknown
species in a biological process of interest. Methods that can
identify or approximate latent structure in partially-observed
systems are more appropriate here (Hiscock, 2019). An
omics treatment such as RNA-seq can cover breadth but cur-
rent sequencing techniques require cells to be destroyed, thus
making time series data collection difficult. Non-destructive
sequencing techniques could address this issue.
The implementation of an inferred network as a gene circuit
will require more thought. Even for networks with sparse
interactions, the likelihood of finding a set of genes and pro-
teins that satisfy the interaction strengths and activation or
inhibitory effects is unknown. In fact, whether a species is
an activator or inhibitor is not explicitly given in the interac-
tion matrix. We can address this by posing dynUGENE as a
constrained optimization problem where it is limited to using
only a given set of parts (genes, promoters, ribosome bind-
ing sites, proteins, etc.) thus relating the importance scores
with biological interaction strengths. We leave this for future
work.

Data and code availability
Source code is available at https://github.com/tianyu-
lu/dynUGENE with a detailed user guide. A
webserver with interactive controls is available at
https://tianyulu.shinyapps.io/dynUGENE/.

ACKNOWLEDGEMENTS
The authors thank the authors of dynGENIE3 for their work and Alan Moses for
guidance.

FUNDING
This work was supported by a Postdoctoral Fellowship from Canadian Institutes of
Health Research.

Bibliography
Sui Huang, Ingemar Ernberg, and Stuart Kauffman. Cancer attractors: a systems view of tu-

mors from a gene network dynamics and developmental perspective. In Seminars in cell &
developmental biology, volume 20, pages 869–876. Elsevier, 2009.

Anton Crombach and Paulien Hogeweg. Evolution of evolvability in gene regulatory networks.
PLoS Computational Biology, 4(7):e1000112, 2008.

Guy Karlebach and Ron Shamir. Minimally perturbing a gene regulatory network to avoid a
disease phenotype: the glioma network as a test case. BMC Systems Biology, 4(1):15, 2010.

Tom W Hiscock. Adapting machine-learning algorithms to design gene circuits. BMC Bioinfor-
matics, 20(1):1–13, 2019.

Pierre Geurts et al. dyngenie3: dynamical genie3 for the inference of gene networks from time
series expression data. Scientific Reports, 8(1):1–12, 2018.

Stephanie Kamgnia Wonkap and Gregory Butler. Benin: Biologically enhanced network inference.
Journal of Bioinformatics and Computational Biology, 18(03):2040007, 2020.

Ashley Mae Conard, Nathaniel Goodman, Yanhui Hu, Norbert Perrimon, Ritambhara Singh,
Charles Lawrence, and Erica Larschan. Timeor: a web-based tool to uncover temporal regu-
latory mechanisms from multi-omics data. bioRxiv, 2020.

Niall M Mangan, Steven L Brunton, Joshua L Proctor, and J Nathan Kutz. Inferring biological
networks by sparse identification of nonlinear dynamics. IEEE Transactions on Molecular,
Biological and Multi-Scale Communications, 2(1):52–63, 2016.

Michael B Elowitz and Stanislas Leibler. A synthetic oscillatory network of transcriptional regula-
tors. Nature, 403(6767):335–338, 2000.

Alan L Hodgkin and Andrew F Huxley. A quantitative description of membrane current and its
application to conduction and excitation in nerve. The Journal of Physiology, 117(4):500,
1952.

Pau Bellot, Catharina Olsen, and Patrick E Meyer. grndata: Synthetic Expression Data for Gene
Regulatory Network Inference, 2020. R package version 1.20.0.

Carl Ganz. rintrojs: A wrapper for the intro. js library. Journal of Open Source Software, 1(6):63,
2016.

Gregory R. Warnes, Ben Bolker, Lodewijk Bonebakker, Robert Gentleman, Wolfgang Huber, Andy
Liaw, Thomas Lumley, Martin Maechler, Arni Magnusson, Steffen Moeller, Marc Schwartz,
and Bill Venables. gplots: Various R Programming Tools for Plotting Data, 2020. R package
version 3.1.0.

Hadley Wickham. ggplot2: elegant graphics for data analysis. springer, 2016.
Christopher Rackauckas and Qing Nie. Adaptive methods for stochastic differential equations via

natural embeddings and rejection sampling with memory. Discrete and Continuous Dynamical
Systems. Series B, 22(7):2731, 2017a.

Christopher Rackauckas and Qing Nie. Differentialequations. jl–a performant and feature-rich
ecosystem for solving differential equations in julia. Journal of Open Research Software, 5
(1), 2017b.

R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for
Statistical Computing, Vienna, Austria, 2020.

2 | bioRχiv Lu et al. | dynUGENE

.CC-BY 4.0 International licenseavailable under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 

The copyright holder for this preprintthis version posted January 8, 2021. ; https://doi.org/10.1101/2021.01.07.425782doi: bioRxiv preprint 

https://github.com/tianyu-lu/dynUGENE
https://github.com/tianyu-lu/dynUGENE
https://tianyulu.shinyapps.io/dynUGENE/
https://doi.org/10.1101/2021.01.07.425782
http://creativecommons.org/licenses/by/4.0/