Making simulation results reproducible—Survey, guidelines, and examples based on Gradle and Docker

Wilfried Elmenreich, Philipp Moll, Sebastian Theuermann and Mathias Lux
Universität Klagenfurt, Klagenfurt, Austria

ABSTRACT
This article addresses two research questions related to reproducibility within the context of research related to computer science. First, a survey on reproducibility addressed to researchers in the academic and private sectors is described and evaluated. The survey indicates a strong need for open and easily accessible results; in particular, reproducing an experiment should not require too much effort. The results of the survey are then used to formulate guidelines for making research results reproducible. In addition, this article explores four approaches based on software tools that could bring forward reproducibility in research results. After a general analysis of tools, three examples based on actual research projects are investigated further and used to evaluate the previously introduced tools. The results indicate that the evaluated tools contribute well to making simulation results reproducible, but due to conflicting requirements, none of the presented solutions fulfills all intended goals perfectly.

Subjects: Data Science, Scientific Computing and Simulation, Software Engineering
Keywords: In-silico research, Reproducibility, Simulation

INTRODUCTION
Reproducibility of experimental results is fundamental in all scientific disciplines. Reproducing the results of published experiments, however, is often a cumbersome and unrewarding task. Casadevall & Fang (2010) report that some fields, for example biology, are concerned with complex and chaotic systems which are difficult to reproduce. At the same time, we would expect digital software-based experiments to be easily reproducible, because digital data can be easily provided and the computer algorithms operating on these data are typically well described and deterministic. However, this is often not the case due to a lack of disclosure of the relevant software and data that would be necessary to reproduce results. Ongoing open science initiatives aim to have researchers provide access to data and software together with their publications in order to allow reviewers to make well-informed decisions and to provide other researchers with the information and necessary means to build upon and extend the original research (Ram, 2013).

This article addresses two research questions (RQ) related to reproducibility:

RQ1 "To what extent is reproducibility of results based on software artifacts important in the field of computer science and in related research areas?"

RQ2 "What tools can be used to support reproducibility?"
RQ1 addresses the aspects of the relevance of reproducibility to a researcher's field, willingness to contribute to making one's own work reproducible, and possible concerns. An online survey was designed to assess the current practice, subject awareness, and possible concerns. The focus of the survey was on the disciplines computer science, computer engineering, and electrical engineering, and it addressed researchers at different positions in universities, research institutions, and companies.

To answer RQ2, we present three examples in which three different types of software projects are packaged to provide an accurate and easy way of reproducing results in a controlled environment, and we analyze how these solutions address the requirements derived from the survey.

The responses to our online survey confirm our initial assumption that the reproducibility of research results is an important concern in computer science research. One of the researchers' main reasons for publishing software artifacts along with scientific publications is improved credibility and reliability of results. Based on the survey's results presented in the section "Survey", we infer requirements and general guidelines assisting researchers in making their research reproducible in the section "Requirements and General Guidelines". Finally, we discuss how different tools comply with the created requirements and guidelines. We find that due to conflicting requirements, none of the presented solutions fulfills all intended goals perfectly. One of the most pressing challenges is to achieve long-term availability of results while respecting licensing issues of required third-party dependencies. An in-depth discussion of open issues is elaborated in the section "One Tool to Reproduce them All?", and we conclude the article and highlight our major findings in the section "Conclusion".

RELATED WORK
Walters (2013) notes that it is often difficult to reproduce the work described in molecular modeling and chemoinformatics papers. For the most part this is due to the absence of a disclosure requirement in many scientific publication venues thus far. Morin et al. (2012) report that in 2010 only three of the 20 most cited journals had editorial policies requiring availability of source code after publication. Fortunately, this situation is changing for the better; for example, Science introduced a policy requiring authors to make data and code related to their publication available whenever possible (Witten & Tibshirani, 2013; Peng, 2011; Hanson, Sugden & Alberts, 2011). Commenting on this policy, Shrader-Frechette & Oreskes (2011) brought up the issue that although privately funded science may be of high quality, it is not subject to the same requirements for transparency as publicly funded science. Another obstacle is the use of closed-source tools and undisclosed software results in publicly funded research software development projects, as discussed by Morin et al. (2012).
Vitek & Kalibera (2011) address the topic of repeatability and reproducibility for systems research and identify a particular difficulty for embedded systems, since companies are reluctant to release code and implementations are often bound to specific hardware. Focusing on the current state of reproducibility, ACM SIGCOMM Computer Communication Review (CCR) conducted a survey on reproducibility with 77 responses from authors of papers published in CCR and the SIGCOMM-sponsored conferences (Bonaventure, 2017). The responses showed that there is a good awareness of the need for reproducibility, and a majority of authors either considered their paper self-contained or had released the software used to perform the experiments. However, there were only few releases of experimental data or of modifications of existing open source software. The open-question part of the survey indicated a need for encouragement for publishing reproducible work or for papers that attempt to reproduce or refute earlier results.

Flittner et al. (2018) conducted an analysis of papers from four different ACM conferences held in 2017. This study found that the type of artifacts can differ significantly between different communities. The analysis further indicates that even if researchers state that their work is reproducible, the majority of analyzed papers do not provide the complete toolset to reproduce all results. Most importantly, the study shows that published artifacts are indeed reused, which is why the authors encourage others to release artifacts.

A critical aspect when releasing artifacts is to decide on tools supporting researchers in the process of making research reproducible. Several papers report on case studies for data repositories in the context of reproducibility, including fields such as geographic information systems (Steiniger & Hunter, 2013), astrophysics (Allen & Schmidt, 2015), microbiome census (McMurdie & Holmes, 2013), and neuroimaging (Poline et al., 2012). These examples are promising, but it cannot be expected that the approaches will be used beyond the fields in which they have been introduced. Simflowny (Arbona et al., 2013) is a platform for formalizing and managing the main elements of a simulation flow, tied not to a field, but to a specified simulation architecture. The Whole Tale approach (Brinckman et al., 2018) aims at linking data, code, and digital scholarly objects to publications and integrating all parts of the research story. Other works focus on code and data management, such as Ram (2013), who suggests very general version control systems such as Git for transparent data management in order to enable reproducibility of scientific work. The CARE approach (Janin, Vincent & Duraffort, 2014) extends the archiving concept with an execution system for Linux, which also takes software installation and dependencies into account. Docker (Boettiger, 2015), which will be examined more closely in this article, provides an even more generic approach by utilizing virtualization to provide cross-platform portability. A tutorial for using Docker to improve reproducibility in software and web engineering research was published by Cito, Ferme & Gall (2016).
Reprozip by Chirigati, Shasha & Freire (2013) provides a packing and unpacking mechanism for Linux systems that allows creating a package from a computer experiment which can be unpacked on another target machine, including support for unpacking into a Docker image.

In contrast to the work presented above, our work focuses on the researchers' requirements regarding reproducibility, independent of the capabilities of individual tools. Based on survey responses, we infer requirements and guidelines for making research reproducible and further analyze how different tools for packaging software artifacts comply with the researchers' needs. We further identify limitations of current tools and raise researchers' awareness of the pros and cons of using different types of applications for making research reproducible.

SURVEY
In computer science, a large amount of research is backed up by prototypes, implementations of algorithms, benchmarking and evaluation tools, and data generated in the course of research. A critical factor for cutting-edge research is to be able to build upon the results of other researchers or groups by extending their ideas, applying them to new domains, or reflecting on them from a new angle. This is easily done with scientific publications, which are mostly available online. While the hypotheses, findings, models, processes, and equations are published, the data generated and the tools used for generating data and evaluating new approaches are sometimes only pointed out but have to be found elsewhere. Our hypothesis in that direction is that there is a gap between scientific publishing on one hand and the publication of software artifacts and data for making results reproducible for other researchers on the other. In that sense, we created a survey asking researchers in the computer science field for their approach and opinion.

Methodology
The survey design is driven by our research question RQ1 ("To what extent is reproducibility of results based on software artifacts important in the field of computer science and in related research areas?"). The survey consists of five parts. First, basic demographic information is collected, including the type of research, the area of research, the typical research contribution, and the type of organization the researchers are working for. Second, the common practice of the researchers for publishing software artifacts and data is surveyed. Third, we focus on the researchers' expectations regarding the procedure of reproducing scientific results. Fourth, we ask for opinions on integrating the question of reproducibility into the peer review process. Finally, we collect additional thoughts with open questions.

Five-point Likert scales are used to indicate the level of agreement in the survey. For questions where Likert scales are not applicable, single-choice or multiple-choice questions (e.g., "What are the typical results of your research work?") or numerical inputs without a predefined range or categories (e.g., "How much time (in hours) are you willing to invest to make the results of a paper reproducible?") are used. Generally, we did not offer an "I don't know" or similar option. For single-choice and multiple-choice questions, we discussed the nominal scales based on related work as well as the authors' experience.
Pilots with people neither involved with the questionnaire nor taking part in the final survey were conducted to reduce the chance of leaving out important options. For the sake of completeness, custom values are allowed in addition to the given options to allow researchers to report on their practice. Open-ended questions are only used where other types of questions might limit the spectrum of answers.

The survey was set up as an anonymous online survey with no partial answers allowed: all questions were mandatory, and only the final submission at the end of the survey would save the results. The survey was distributed via scientific mailing lists and via personal contacts with the request to distribute the survey among colleagues.¹ The full survey and all responses are included in the Supplemental Material.

¹ The online survey was distributed on the following channels: the Information-Centric Networking research group discussion list (https://www.irtf.org/mailman/listinfo/icnrg); the Google Group comp.simulation (https://groups.google.com/forum/#!forum/comp.simulation); the authors' Facebook and Twitter profiles; and via personal contacts.

Demographics
In total, we received 125 responses, mostly from academic researchers. The demographics of the survey participants are visualized in Fig. 1. Seventy-four of the 125 participants were working or studying at a university and 35 of 125 at research institutes. Thirteen participants noted that they were mainly working for a company, two were private researchers, and one was from a school. Regarding their position, 30% of the participants were PhD students, 28% were professors or group leaders, 17% worked as researchers within a project, 12% were principal investigators, and 9% were bachelor or master students at the time of the study. Three participants were heads of departments or organizations, and two participants indicated that they were postdoctoral researchers.

Computer science or computer engineering was the area of research for 72% of the participants. Furthermore, 7.2% of the participants came from electrical engineering, 4% from information systems, 3.3% from (applied) mathematics, and 1.6% from simulation. Singular mentions were applied informatics, social sciences, computational biology, computational biology/numerical simulations, computer networks, data analysis, economics, management, materials science, mathematical modeling, medical informatics, physics, scientific computing, and user experience. The population also includes researchers for whom publishing software is common practice; 28% of the participants indicated that they had not published any software artifact at the time of the study.

Survey result summary
Four aspects of the survey responses are analyzed. First, the relevance of reproducibility for the research community is analyzed. Second, we investigate what people are willing to do in order to achieve reproducible research. Third, we discuss the researchers' opinions on reproducibility in the peer review process.
Finally, we highlight concerns regarding sharing scientific software artifacts.

Figure 2 summarizes the responses to questions showing the relevance of reproducibility in research. As can be seen, the majority of people want to reproduce results from other researchers or groups: 103 of 125 indicated agreement. Even more (110 out of 125) considered reproducible results as added value for research papers. It can also be seen that the majority of researchers (96 out of 125) want to build their research on the work of others, which requires others to share scientific artifacts.

Figure 1: Demographics of the survey participants, including positions (A), area of research (B), and research environment (C). For presentation reasons, categories with fewer than three researchers were summarized as "Other" in (A) and (B).

It can be seen from the researchers' demographics in Fig. 3 that the relevance of reproducibility is independent of a researcher's position, research area, and research environment. The results of the question "I want to reproduce the results of other researchers or groups from their original work (software tools or libraries) to compare it to my work" were grouped by position, research area, and research environment. These distributions look very similar for all questions from Fig. 2. A full collection of graphical illustrations of these distributions is included in the Supplemental Material of the paper.

An open-ended question asking why software artifacts should be published yielded diverse answers. The most frequent answers were improvements in credibility and reliability of results, building trust, and improving understanding of the results of others. Other answers included the benefit of a practical approach by fostering task-based research, increasing the visibility of one's research by making tools available, and open communication to foster research in general.

After showing the researchers' interest in reproducibility, which is aligned with the results of other published surveys, we now evaluate what researchers are willing to do to make their results reproducible for others and how much effort they are willing to invest to reproduce the results of others. Focusing on Fig. 4, we see that about half of the researchers typically try to reproduce the results of others by running their tools (53 out of 125). This again shows the demand for publishing scientific software artifacts. The average amount of time participants would invest in making the software of others work in order to reproduce results was 23.12 hours, neglecting two outliers who would spend 10⁵ and 10³⁵ or more hours.²
Figure 2: Responses to the questions focusing on the general relevance of reproducibility ("Published software artifacts are added value to the published text in the research paper"; "I want to reproduce the results of other researchers or groups from their original work (software tools or libraries) to compare it to my work"; "I want to build my research tools by extending on the work (software, tools, or frameworks) of other researchers or groups").

Figure 3: Responses to the question "I want to reproduce the results of other researchers or groups from their original work (software tools or libraries) to compare it to my work", grouped by the researchers' positions (A), research area (B), and research environment (C).

The responses to the corresponding survey questions are visualized in Fig. 5A. Two participants noted that they would invest more than a month of work time (>160 h) to reproduce the results of others, 10 participants noted that they would invest between a week and a month (41–160 h), 42 participants would invest up to a week but more than a day (9–40 h), 47 would invest up to a day of work (1–8 h), and only three participants would not invest any time at all.

Most researchers agreed that they would like to publish their software to aid reproducibility. The question of whether researchers want to publish their software tools to allow others to reproduce their results was answered with agreement by the majority of researchers (103 out of 125), with only 16 disagreeing. When publishing software, 102 out of 125 researchers intend to provide detailed documentation on how to install and run the published software artifact. The question of how many hours researchers want to invest in making their results reproducible led to an average of 24.4 hours. We excluded three outliers with answers of 1,000, 10⁶, and 10²⁵ hours, as we agreed that an answer of 1,000 hours (in other words, 25 work weeks) or more is more likely to be a misunderstanding of the question and may include the original research work in addition to the extra work of making the results reproducible. The results can be seen in Fig. 5B. Summarizing the results in clusters, two participants would invest more than a month of work time (>160 h), seven participants would invest up to a month (41–160 h), 49 participants indicated they would invest up to a week of work time (9–40 h), 38 participants reported they would invest up to a day (1–8 h), and only four indicated that they would not invest any time. Interpreting these numbers, we see that researchers are willing to invest more time to make their own research reproducible than to reproduce the results of others.
Figure 4: Responses to the questions focusing on what researchers are willing to do to achieve reproducible results or to share artifacts ("I typically try to reproduce the research results of other groups or researchers by installing and running their tools"; "When I publish software I intend to provide detailed documentation on how to install and run the software"; "I want to publish software tools and methods from my research to allow others to reproduce my results").

Figure 5: Responses on how many hours researchers would invest into reproducing results (A) and into making their own results reproducible (B), as well as for how many years results should remain reproducible (C). Note that the y-axis is logarithmic.

² Using the range of mean ± 3 times the standard deviation for outlier detection.

The results of a multiple-choice question asking for the typical composition of research results show that software implementations and datasets are already common artifacts of today's research, indicating the potential utility of making research reproducible. Besides results in written form (107 researchers mentioned published papers and 37 participants mentioned reports with detailed results), a software implementation is part of the research results for 87 participants, and 47 participants mentioned a dataset being part of their results.

Another important aspect of reproducible research is the long-term availability of results and artifacts. The effort of preparing and publishing software artifacts and results would ultimately be in vain if the artifacts later become inaccessible. Participants were asked for their opinion on how long results and the necessary software artifacts should be available after initial publication. The results can be seen in Fig. 5C. With the exception of five outliers (with seven answers of 0 for not supporting reproducibility at all, as well as 10⁶, 10⁹, and 10²⁵ years, which is too long a time for any currently known digital storage medium to survive), the participants stated that software for reproducing results should be available for an average of 9.1 years. Summarizing through clusters, 18 participants stated it should be 0.5 to 2 years, 67 indicated it should be 3–5 years, 26 stated 6–10 years, and nine think it should be available for more than 10 years.

Asked about how they share research artifacts or make results reproducible, 90 out of 125 participants stated that they had already published software at the time of their participation in the survey.
Means of making their results reproducible (multiple means could be specified) were detailed instructions (68), make scripts (54), installation scripts (34), virtualization software (29), and container frameworks (15). There were two mentions of hosting web front ends as a means of making results available and three mentions of public source code repositories as platforms for distribution.

Now that we are aware of current practices for making results reproducible, we focus on the role of reproducibility in the peer review process. Our assumption is that testing for reproducibility during the peer review process could enhance the credibility of published results and thereby increase the quality of a paper. This opinion is shared by the survey participants, as visualized in Fig. 6: a total of 87 out of 125 participants stated that checking for reproducibility should be part of the peer review process. Furthermore, 79 out of 125 participants would be willing to check results in addition to the traditional peer review process. Here, differences among positions and research areas can be found (see Fig. 7). When focusing on the researchers' position, nine out of 10 bachelor or master students showed agreement, with none indicating disagreement. Principal investigators indicated the lowest agreement. Differences can also be seen regarding research areas. Researchers from computer engineering showed the least agreement, whereas electrical engineers indicated the most agreement. Researchers from other research areas, including computer science, indicated a similar interest. No differences between research environments were identified.

Figure 6: Responses to the questions focusing on the role of reproducibility in the peer review process ("Checking for reproducibility of research results should be part of a peer review process"; "As a reviewer I would be willing to check research results in addition to the traditional peer review").

When analyzing the survey results on the researchers' concerns regarding publishing scientific software artifacts, we can see that the traditional payment models scientific publishers use for research papers are seen as critical for publishing software artifacts. Figure 8 shows that 104 out of 125 researchers indicate that their results can be reproduced with free and open source software. This goes hand in hand with the researchers' reluctance to pay for publishing or accessing software artifacts. Only 24 out of 125 researchers are willing to pay for making software tools, frameworks, and subsequently their results available to other researchers. Slightly more, but still only 28 out of 125 researchers, indicate agreement with paying for being able to reproduce the results of others. These responses indicate the importance of possibilities for sharing software artifacts free of charge, regardless of the platform. Moreover, even researchers willing to pay for software might face problems due to closed-source components or other limitations. Continuing in this vein, we asked why results cannot be reproduced using open source tools.
A total of 50 participants indicated the use of paid-for programming language environments, 35 the use of licensed operating systems, 19 the use of copyrighted materials, and 11 the use of commercial tools.

Computer security when installing programs from others is not a major concern for 69 out of 125 participants, which is alarming when reflecting on possible security issues. An explanation could be that the researchers' awareness is low because they themselves would not harm others and believe others to be benevolent as well. However, this mindset does not account for security issues that do not originate from other researchers but from the third-party libraries used. Therefore, software from unknown sources, or with unknown dependencies, should always be handled with care.

Figure 7: Responses to the question "As a reviewer I would be willing to check research results in addition to the traditional peer review", grouped by the researchers' position (A), research area (B), and research environment (C).

Figure 8: Responses to the questions focusing on additional concerns when publishing scientific artifacts ("Is it possible to reproduce your research results with free and open source software?"; "I am willing to pay for open access to my research software tools and frameworks to make them available to other researchers"; "I am willing to pay for easily accessible software tools and for being able to reproduce the results of other researchers"; "Local computer security and local data security is a major concern for me when installing and running software from other researchers").

We further see that security awareness depends on the researchers' position (see Fig. 9). Undergraduate and master students indicate the highest awareness of security risks, while professors and principal investigators indicate the lowest. A possible interpretation is that researchers in higher positions neglect security issues because of the high pressure to progress research. Students, in contrast, focus on smaller tasks and complete them more carefully. Regarding security awareness across research areas, computer engineers have the highest awareness, with 12 out of 19 researchers indicating agreement with the statement "Local computer security and local data security is a major concern for me when installing and running software from other researchers." For other fields, the awareness, or lack thereof, is almost equally low.

Besides economic and security concerns, we also asked researchers about additional reservations.
A multiple-choice question on major concerns when installing and running software from other researchers showed that the ease of installation is a prominent topic. This question allowed multiple choices as well as an "other" option, where participants could voice their concerns. Answers included:

- Ease of the installation (without major barriers) (104 mentions),
- Hardware requirements like computation power, memory, or specialized equipment (71 mentions),
- License issues (72 mentions),
- Size of the download and installation (27 mentions),
- Used hard disk space after installation (two mentions),
- I don't see additional concerns (eight mentions),
- Other (with the option of giving text here).

Five other answers were entered:

- "I am sure it does not run on the first try nine out of 10 times,"
- "External dependencies and their updateability/patchability in case of security fixes (should never depend on the initial publisher for third party libraries because they'd have to maintain their old packages for a long time); Also important: Downards (sic!) compatibility of 'new' versions with old data & tools,"
- "Analytical reporoducability (sic!) & mathematical clarity (or correctness) is my main concern,"
- "Conflicting versions of additional required software,"
- "complex build dependencies."

Figure 9: Responses to the question "Local computer security and local data security is a major concern for me when installing and running software from other researchers", grouped by the researchers' position (A), research area (B), and research environment (C).

A final open-ended question was about reservations toward publishing data and software: "I have reservations for publishing software artifacts and data in research because ..." For the analysis, following the approach of open coding, the answers were labeled manually by the authors, and the assigned labels were discussed until agreement was reached. The most common cluster of answers noted legal or privacy issues (14). Others pointed out the additional effort needed (8), commercial interests (8), missing reward or support for doing so (3), and that publishing artifacts is not part of the job, that is, not supported by the group or organization (2).³ Regarding the aforementioned legal issues, it would be an interesting hypothesis that researchers would be more willing to share if legal issues and efforts were reduced. This might be achieved by license constraints (only licenses others can build upon) or exceptions for publishing research (leaving license issues aside for research by general agreement).
Correlation analysis
Given the Likert scales for the answers, we investigated the correlation (Spearman's rank correlation) between answers to see if (i) intuitive and expected correlations exist and (ii) new and surprising correlations can be found. Table 1 shows all correlations with |ρ| > 0.4.⁴ The strongest correlation, with a coefficient of ρ = 0.734 and a p-value < 0.0001, was found between the questions "Checking for reproducibility of research results should be part of a peer review process" and "As a reviewer I would be willing to check research results in addition to the traditional peer review." Hence, people who stated that they were willing to do reproducibility checks were more likely to find the idea of a review process with mandatory reproducibility checks attractive. Another strong correlation (ρ = 0.723, p < 0.0001) was found between the questions "How much time (in hours) are you willing to invest into reproducing a result or get the software tools of others installed and running?" and "How much time (in hours) are you willing to invest to make the results of a paper reproducible?" From that correlation one can hypothesize that researchers with reproducibility in mind invest time both in reproducing results and in making their own results reproducible. A less strong but still rather interesting correlation (ρ = 0.55, p < 0.0001) was found between "I am willing to pay for open access to my research software tools and frameworks to make them available to other researchers." and "I am willing to pay for easily accessible software tools and for being able to reproduce results of other researchers." Given that most participants are not willing to pay for access or publishing in the context of reproducibility, as indicated in Fig. 8, it is likely that researchers either accept paying for both publishing and access, or for neither.

³ The raw data with all the responses is included in the Supplemental Material.
⁴ Correlations of |ρ| < 0.4 are generally considered poor or weak correlations and are hence not included in the table.

Threats to validity
While a minor bias is assumed to be caused by the study's title, as participants may have been attracted by the title if they could identify with the topic of reproducibility, it is still valid to discuss the implications of the findings.

One possible limitation of the survey is the missing geographical distribution of the participants. We did not include questions on where participants are located or work primarily, and we did not collect IP addresses. Hence, we cannot conclude whether the survey result indicates a global trend, or whether the preferences of researchers from different geographic regions differ. Similarly, a possible gender gap among the survey's participants cannot be evaluated.

For single-choice and multiple-choice questions with a pre-defined answer set, the set of answers can introduce a certain bias to the results. Therefore, it was decided to avoid such questions if the risk of bias was high. In that sense, we also avoided quantizing numeric input, for example, the hours people spend making their work reproducible. Where necessary, we always included an open-ended answer option. The pilot, the survey of related work, and critical reflection by the authors were used as tools to minimize this bias.
Table 1: Correlations in the survey answers with |ρ| > 0.4 using Spearman's rank correlation (all p-values < 0.0001).
  ρ = 0.734: "Checking for reproducibility of research results should be part of a peer review process." and "As a reviewer I would be willing to check research results in addition to the traditional peer review."
  ρ = 0.723: "How much time (in hours) are you willing to invest into reproducing a result or get the software tools of others installed and running?" and "How much time (in hours) are you willing to invest to make the results of a paper reproducible?"
  ρ = 0.550: "I am willing to pay for open access to my research software tools and frameworks to make them available to other researchers." and "I am willing to pay for easily accessible software tools and for being able to reproduce results of other researchers."
  ρ = 0.490: "I want to publish software tools and methods from my research to allow others to reproduce my results." and "When I publish software I intend to provide detailed documentation on how to install and run the software."
  ρ = 0.482: "I want to reproduce the results of other researchers or groups from their original work (software tools or libraries) to compare it to my work." and "I want to build my research tools by extending on the work (software, tools or frameworks) of other researchers or groups."
  ρ = 0.477: "I want to reproduce the results of other researchers or groups from their original work (software tools or libraries) to compare it to my work." and "I typically try to reproduce the research results of other groups or researchers by installing and running their tools."
  ρ = 0.412: "When I publish software I intend to provide detailed documentation on how to install and run the software." and "I want to build my research tools by extending on the work (software, tools or frameworks) of other researchers or groups."
  ρ = 0.410: "When I publish software I intend to provide detailed documentation on how to install and run the software." and "I typically try to reproduce the research results of other groups or researchers by installing and running their tools."
  ρ = 0.408: "Published software artifacts are added value to the published text in the research paper." and "I want to build my research tools by extending on the work (software, tools or frameworks) of other researchers or groups."

In one single case, that is, the question "Which of the following are additional concerns when installing and running software from other researchers?", the open-ended answer option showed that at least one pre-determined answer was missing. Several participants noted that complex build dependencies (also mentioned as conflicting versions of additional required libraries or external dependencies) are likely to be another major concern.

Participants could have had overlaps in the categorization of positions; for example, a person could be a PhD student and an employed researcher in a project at the same time. In this case, participants might have selected a category randomly or selected the category they identify with more. Despite this, as long as no intentional or unintentional mistakes are made in the answers, each category will contain samples that are members of this category. The survey also includes bachelor and master students, for whom it is not guaranteed that they are involved in research projects.
However, due to the dissemination channels we used for advertising the survey, we assume that the participants are invested in research and reproducibility. This is supported by the fact that nine out of the 11 bachelor or master students indicated that they had already published software artifacts.

English as the only language of the survey might be a further limitation. Nevertheless, English is the working language of the target audience, and consequently, we assume the influence of the survey language to be negligible.

REQUIREMENTS AND GENERAL GUIDELINES
The survey results indicate that a majority of researchers at all levels consider reproducibility to be very relevant. There is further a strong interest in doing the work to make one's own results reproducible, a strong interest in using the results of others for comparison with one's own work, and, to some extent, a motivation to try to reproduce work for review purposes. To achieve this, it is necessary to make all information that is necessary to reproduce the results available together with a publication. Additionally, the effort necessary to reproduce the results needs to match the value of doing the work. Work reproducing or refuting previous results is overall much less appreciated than original work, so the effort a researcher is willing to invest in order to reproduce previous results is much lower than the effort they are willing to put in to produce new work. On the other hand, when planning to build one's own research on top of other results, the investment can be higher. The most critical case is reviewing, when reproducibility is intended to be checked as part of the reviewing process. Reviewers have a strict timeline to perform their review, so there is a need for a straightforward, mostly automated process to reproduce results. Moreover, despite contributing to verifying the results of a paper, reviewers are not mentioned in connection with the work. As reviewers work voluntarily, they are probably the least motivated to reproduce results. Moreover, in our study researchers responded critically to commercial systems introducing payments, either on the side of the publishing researcher or on the side of the consumer. A majority of participants also name security as a concern in this context, which highlights an issue to be addressed both for researchers who are security-aware and for those who are less concerned about security.

In order to address these issues, the following guidelines are proposed:

- Code, data, and information on how to conduct an experiment should be gathered in a single place (a single container), which can be found in connection with the paper.
- The reproduction process should be highly automated, for example by an easy-to-handle build and execution script (a minimal sketch of such a script is given after this list).
- To address security issues, the published artifacts should be provided as source code, and scripts allowing the code to be run in a virtual environment should be provided.
- Commercial libraries and other components that require reviewers to pay for access should be avoided.
- Since research papers tend to create some interest even long after they have been published, it is necessary to ensure that the software and environment for the reproduction process remain available, either by packing all necessary components into a container or by referring to well-archived open source tools.
- The time and information necessary to reproduce results should be tested with an independent person. Unless the size of the project requires it, the reproduction process should take at most 2 days.
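As an illustration of the second guideline, the following is a minimal sketch of what such a build-and-execute script could look like for a hypothetical Java-based simulation project. The file names (reproduce.sh, experiment1.cfg), the command-line options of the simulator, and the use of the Gradle wrapper are assumptions that would have to be adapted to the actual project.

    #!/bin/sh
    # reproduce.sh -- rebuild the simulator and re-run the experiments of the paper.
    # Assumes only a JDK is installed; the Gradle wrapper fetches the build tool itself.
    set -e                              # abort on the first error

    ./gradlew build                     # compile the (hypothetical) simulator and run its tests
    mkdir -p results

    # Re-run both experiments with the seed used for the published figures
    # (the options --config, --seed, and --output are placeholders).
    java -jar build/libs/simulator.jar --config experiment1.cfg --seed 42 --output results/experiment1.csv
    java -jar build/libs/simulator.jar --config experiment2.cfg --seed 42 --output results/experiment2.csv

    echo "Done. Compare the files in results/ with the figures in the paper."

Keeping the random seed and all experiment parameters inside such a script also documents exactly which configuration produced the published numbers.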
EXISTING TOOLS
Most tools for sharing software artifacts are also used in the development of software artifacts. These can be tools for simple tasks, such as compiling software projects, but also more complex tools for tasks such as automated dependency installation and software packaging. To prevent unnecessarily complex configuration, it is wise to select tools based upon the complexity of the software artifact. Software artifacts which are complex to run require more sophisticated tools with high levels of abstraction, whereas simple artifacts do not require complex tools to run. In this section, we tackle our second research question by presenting four open source tools for sharing software artifacts, ranging from tools for compiling simple artifacts to sophisticated frameworks for sharing self-contained software environments. The tools have been selected despite their different scopes because of their potential to support reproducible research. It has to be noted that a complex project might even incorporate multiple tools, for example a build system within a virtual environment. We begin with a discussion of simple tools, such as CMake, which are used for build management, and continue by discussing tools utilizing a higher level of abstraction. For discussion purposes, well-known tools were selected, each representing a class of tools with similar functionality. The discussed pros and cons are valid not only for the discussed tool itself, but for the complete class represented by the tool. Finally, we summarize the features of the different tools and discuss the importance of their benefits, according to the survey results presented in the section "Survey".

CMake
CMake is a cross-platform build tool based on C++. It is designed to be used with native build environments such as make. Platform-independent build files are used to generate compiler-specific build instructions, which define the actual build process. The main features of CMake are tools for testing, building, and packaging software, as well as the support of hierarchical structures, automatic resolution of dependencies, and parallel builds. One drawback of CMake and similar build management systems is that required libraries or other dependencies of software artifacts must be available and installed in the required version on the host system in order to successfully build the project. This can lead to extensive preparations for a build, which is mandatory for executing the software artifacts. CMake has been chosen for discussion because it is one of the most used tools of this type. Tools with similar functionality are configure scripts, the GNU Build System, and the WAF build automation system.
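As a minimal sketch of how a CMake-based artifact is typically described, consider the following CMakeLists.txt for a hypothetical C++ simulator; the project name, source file, and the GNU Scientific Library dependency are assumptions, and, as noted above, the dependency must already be installed on the host system.

    # CMakeLists.txt -- minimal build description for a hypothetical simulator
    cmake_minimum_required(VERSION 3.13)
    project(epidemic_sim CXX)

    # CMake only locates an installed dependency; it does not install it.
    find_package(GSL REQUIRED)

    add_executable(epidemic_sim src/main.cpp)
    target_link_libraries(epidemic_sim GSL::gsl)

A user who has the dependency installed would then build and run the artifact with the usual commands, for example cmake -S . -B build followed by cmake --build build.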
Gradle
Gradle is a general-purpose build tool based on Java and Groovy. Gradle integrates the build tools Maven as well as Ant and can replace or extend both systems. The main features of Gradle are the support for Maven repositories for dependency resolution and installation and the out-of-the-box support for common tasks, that is, building, packaging, and reporting. Gradle supports multiple programming languages but has a strong focus on Java, especially as it is the recommended build system for Android development. An integrated wrapper system allows Gradle to be used for building software artifacts without installing Gradle. Dependency installations and versions are maintained automatically: if a build requires a new version of a library, it is downloaded and installed automatically. The automated dependency installation is a great benefit of Gradle, although there are still some challenges to overcome. One issue is that automated dependency installation only works if the required libraries are offered in an online repository. If a required dependency is removed from the online repository, building any software depending on this library becomes impossible. For other programming languages, tools with similar functionality are available, for example the Node Package Manager for JavaScript projects or pip for Python projects.
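To give an impression of how little configuration this requires, the following is a sketch of a build.gradle for a hypothetical Java simulation; the Apache Commons Math dependency and the main class name are assumptions, and the exact syntax may vary between Gradle versions.

    // build.gradle -- declarative build script for a hypothetical Java simulation
    plugins {
        id 'java'
        id 'application'
    }

    repositories {
        mavenCentral()    // dependencies are resolved from an online repository
    }

    dependencies {
        // downloaded and cached automatically at build time
        implementation 'org.apache.commons:commons-math3:3.6.1'
    }

    application {
        mainClass = 'sim.Main'    // hypothetical entry point of the simulation
    }

With the wrapper scripts committed alongside the sources, ./gradlew run builds the project, resolves the declared dependencies, and starts the simulation in one step; the reliance on the online availability of commons-math3 is exactly the kind of external dependency discussed above.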
Docker
The open source software Docker allows packaging software applications, including code, system tools, and system libraries, into a single Docker image. The resulting image can be published, downloaded, and executed on various systems without operating system restrictions in a virtualized environment. This way, an application embedded in a Docker image will execute in a predefined way, independent of the software environment installed on the host computer. The only requirement for the host system is an installed Docker engine.

A Docker image is a kind of lightweight virtual machine image. It could contain the runtime environment for a single application with or without a graphical user interface, but it could also contain a ready-to-deploy server application for web services or even environments for heavy calculations or simulations. When running a Docker image, a Docker container is launched. A Docker container can be seen as an isolated runtime environment which uses the kernel of the host operating system and thereby becomes more lightweight than a traditional virtual machine. A running Docker container can be accessed via a terminal or a graphical user interface, allowing for a broad range of applications.

Docker images can be shared in two different ways. The first way is to export a running container, including all files and executables, as an image and to share it as a single file. This file can be large in size but is fast to launch by others. The second way is to create a so-called Dockerfile. Dockerfiles contain the building instructions for Docker images. These instructions include commands for installing required dependencies and for installing the shared software artifact itself. When building a Docker image from a Dockerfile, all instructions from the Dockerfile are executed automatically. This leads to a small Dockerfile but a more complex import process. In addition, when using Dockerfiles, all dependencies need to be available either in online repositories or locally on the machine building the image.

The commercial Docker Hub platform (https://hub.docker.com/, last visited 2019-10-10) streamlines the process of sharing Docker images. Docker provides tools to share images on Docker Hub and to download images from Docker Hub via the command line. Docker Hub offers the possibility of sharing public Docker images without download restrictions for free, but also paid plans that allow creating private repositories for sharing images among small groups.

The major difference between Docker and the previously presented tools is that Docker is not usually used for the development of an artifact. In most cases, a Docker image is created for sharing a predefined environment in a team. This means that the image is created and the software artifact is deployed in the container afterward. An alternative to Docker is using Linux containers, which allow running multiple isolated Linux systems on a single host.

VirtualBox
VirtualBox is an open source software for the virtualization of an entire operating system. VirtualBox emulates a predefined hardware environment in which multiple operating systems, like Windows, Mac OS, and most Unix systems, can be installed. The installed operating system is stored as a persistent image, which allows the installation and configuration of software. Once the image is created, it can be shared and executed on multiple machines. As mentioned before, VirtualBox emulates the entire hardware of a computer, resulting in higher execution overhead as well as higher setup effort. Before the scientific software artifact can be installed in a VirtualBox container, an operating system and all dependencies have to be installed. A non-open source alternative to VirtualBox is VMWare, which has similar functionality.

Comparison of analyzed tools
After the presentation of selected tools in the last section, we now want to compare their features for sharing scientific software artifacts. As criteria for the comparison, we focus in this section on important aspects of software for researchers, according to the survey presented in the section "Survey". Table 2 briefly summarizes our findings; a description of each criterion is found throughout this section. The ratings in Table 3 are based on qualitative comparisons, as well as on our experience from using the tools for making three different research projects reproducible, as elaborated in the section "Examples".

Security
As indicated by the survey, local computer and data security is a major concern for many researchers. Some software artifacts require administrator access rights on the local machine in order to be executed. These access rights allow malicious behavior, which could lead to unwanted consequences on the local machine or in the local network. VirtualBox and Docker execute software artifacts in sandboxed environments and therefore allow the secure execution of software artifacts. Tools like CMake and Gradle do not offer this security mechanism. When executing a shared software artifact from untrusted sources, a sandboxed environment is recommended.

Supported platforms
CMake, Docker, and VirtualBox are compatible with most Linux platforms, recent versions of MacOS, and selected versions of Windows 10. Gradle works as long as a Java Virtual Machine is available. Besides this platform support, it has to be kept in mind that the software artifact itself could require a certain operating system. This problem can be mitigated through the virtualization offered by Docker and VirtualBox.
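As a sketch of the sandboxing argument made in the Security paragraph above: assuming an artifact is available as a Docker image (the image name untrusted/artifact is hypothetical), it can be run with its network access removed and with only a single host directory shared, which limits what a malicious or buggy artifact can reach on the local machine.

    # Run the artifact in an isolated container: no network access and
    # only the local ./results directory shared with the container.
    docker run --rm --network none -v "$PWD/results:/results" untrusted/artifact:latest

Build tools such as CMake or Gradle offer no comparable isolation, which is why Table 2 lists them as providing no security mechanisms.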
Required knowledge for sharing
If a build management tool is used in the development of a scientific software artifact, we assume that the researchers become familiar with the build management tool during the development phase. Therefore, no additional knowledge is required of the researcher who is sharing the artifact. VirtualBox also does not require much additional background knowledge: everybody who is able to install an operating system is able to share a software artifact embedded in a VirtualBox image. The terminology of Docker can be confusing at first glance, requiring some time to become familiar with Docker concepts.

Table 2 Comparison of tools for sharing scientific software artifacts.
Criterion | CMake | Gradle | Docker | VirtualBox
Security | No security mechanisms | No security mechanisms | Sandboxed environment | Sandboxed environment
Supported platforms | Linux, MacOS, Windows | Java VM | Linux, MacOS, Windows | Linux, MacOS, Windows
Required knowledge for sharing | Little | Little | Moderate | Little
Effort for sharing | Little | Little | Moderate | High
Required knowledge for installation and execution | Moderate | Moderate | Little | Little
Effort for installation and execution | Moderate/high | Little | Little | Little
Size of shared object | Small | Small | Up to multiple GBs | Up to multiple GBs
Limitations | Installation could be exhausting | Specific Gradle project structure recommended | GUI requires extra effort | Images always include the entire operating system

Effort for sharing
CMake, Gradle, and other build management systems are intended to define a standardized build process. If a build management system is used during the development of the scientific software artifact, no additional effort arises for sharing: the configuration file for the build management system can be shared along with the source code of the software artifact. Docker and VirtualBox are usually not directly involved in software development. In most cases, a Docker or VirtualBox image has to be created explicitly for sharing the software artifact. The structured process of building a Docker image allows easy reuse of existing images for other software artifacts. In the case of VirtualBox, the whole VirtualBox image has to be shared on a file server. Docker images can be shared on the free-to-use Docker Hub or on a file server. Alternatively, a Dockerfile, which contains the building instructions for a Docker image, can be created and shared as a text file. However, using a Dockerfile requires all dependencies to be available in repositories, adding complexity to the overall process.

Required knowledge for installation and execution
Researchers are often not familiar with the tools used for the creation of software artifacts. Reading the documentation of build management tools can be exhausting and time-consuming for a short test of an artifact. CMake and Gradle require some knowledge in order to build a software artifact, especially if errors appear. VirtualBox and Docker are easier to use. If a Docker image is hosted on Docker Hub, a single command is sufficient for downloading and running the image. If this command is provided, no additional knowledge is required. Due to its graphical user interface, running a VirtualBox image is even easier.
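The difference in required knowledge can be illustrated by two hypothetical invocations, one building an artifact from source with the Gradle wrapper and one fetching and running a prebuilt image from Docker Hub; project and image names are placeholders:

    # Build from source using the Gradle wrapper shipped with the repository
    ./gradlew build

    # Download and run a prebuilt image from Docker Hub with a single command
    docker run -it myuser/my-experiment:v1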
Effort for installation and execution
According to the survey results, ease of installation is a major consideration for most researchers (104 of 125 participants). Regarding the installation of the tool itself, Gradle has the lowest requirements: the Gradle wrapper allows installing dependencies and building artifacts without installing Gradle itself. For installing and executing the shared software artifact, the highest effort arises when using CMake, where required dependencies have to be installed manually. For building and executing software artifacts with Gradle, only a few commands are required. Docker and VirtualBox require the least effort; the shared image only needs to be executed.

Size of shared object
When using CMake or Gradle, the source code of the software artifact and the configuration file of the build management tool have to be shared, which usually leads to small shared objects. The shared image of Docker or VirtualBox has to contain the source code and all other tools required for execution, such as the operating system. This results in large shared objects; in some cases the size of a Docker image exceeds one gigabyte. Alternatively, Docker provides an option allowing for smaller shared objects: Dockerfiles. A Dockerfile contains only text instructions for building a Docker image. Therefore, the size of a Dockerfile is only a few kilobytes, but once executed, Docker automatically pulls the artifact’s source code from the provided repositories and builds the software artifact, resulting in a large Docker image on the local machine. Nevertheless, the size of the download is not a major concern for the majority of the survey participants.

Limitations
All analyzed tools have limitations. CMake is a lightweight tool for software development, but the effort for installing the dependencies of a software artifact could be extensive. Furthermore, it is only applicable to a handful of programming languages such as C or C++. When Gradle is chosen as the build system early in the development phase, it is perfectly suited for Java projects. Using Gradle for existing projects can be cumbersome because it requires additional configuration for projects that do not comply with Gradle’s default project structure. Especially for researchers who are not familiar with Gradle, the time spent on this additional configuration step should not be neglected. Docker is perfectly suited for command-line or web applications, which cover a large share of scientific software artifacts. Additional configuration is required to support the GUIs of desktop applications. FREVO (see the section “FREVO”), used in one of our examples, demonstrates GUI support for desktop applications with Docker. VirtualBox is applicable to all types of software artifacts, but the overhead of creating and sharing a VirtualBox image can be considerable: for sharing an artifact, independent of its size and complexity, a complete operating system has to be installed and shared.

EXAMPLES
After introducing background information in the previous sections, three examples are presented analyzing the applicability of various tools for sharing software artifacts. Three scientific artifacts from different computer science research areas allowed us to focus on various types of artifacts with different build systems and procedures for sharing.
The first example, Stochastic Adaptive Forwarding, is a simulation scenario that can be executed on the command line in order to conduct performance evaluations. The second, FREVO, is a simulation tool mainly controlled via a graphical user interface. The third example, LireSolr, is a server-based application used for image retrieval.

Stochastic Adaptive Forwarding
Stochastic Adaptive Forwarding (SAF) (Posch, Rainer & Hellwagner, 2017) is a forwarding strategy for the novel Internet architecture Named Data Networking (NDN) (Zhang et al., 2014). Forwarding strategies in NDN are responsible for forwarding packets to neighboring nodes and therefore select the paths of traffic flows in the network. The NDN Forwarding Daemon (NFD) implements the networking functionality of NDN. It is written in C++ and uses the WAF build automation system. The network simulator ns-3/ndnSIM (Mastorakis et al., 2016), which also uses the WAF build system, is used for testing purposes. Testing SAF in the simulation environment requires three steps: (i) installation of the NFD; (ii) installation of the network simulator ns-3/ndnSIM; and finally (iii) patching SAF into a compatible version of the NFD. The installation of SAF was tested and analyzed both in the standard way using WAF and with Docker.

SAF with WAF
The standard way of developing NDN forwarding strategies is by using the WAF build automation system. The functionality of the WAF build system is similar to that of CMake: WAF automatically checks for required dependencies, but the installation of dependencies must be performed manually. Although extensive installation instructions were published (https://github.com/danposch/SAF, last visited 2019-07-08), it is tricky to install the simulator and its dependencies. Furthermore, there are subtle, undocumented differences when installing the NDN framework on different Unix versions. Once the NDN framework is compiled in the correct version, it is easy to patch SAF. Nevertheless, it can take up to several hours to initially install and compile the NDN framework with SAF.

SAF with Docker
NDN and SAF are licensed under GPL V3, meaning that there are no legal concerns over packaging the software. Technically, Docker provides two options for creating and sharing images. The first is to check out a preconfigured image, such as Ubuntu Linux, from the Docker website and connect to it via a terminal. All required changes can be made in the terminal and finally persisted with a commit. The altered image can be shared via the Docker website or as a binary file. The second possibility is to create the image using a Dockerfile. These files contain simple creation instructions for images and can be shared easily due to their small size. To build an image, the Dockerfile can be executed on any host with the Docker framework installed. Both variants were tested for SAF. The resulting images, containing all dependencies and the compiled software artifacts, have a size of about 4.6 GB, while the Dockerfile itself is only about 2 KB. When using the precompiled image (https://hub.docker.com/r/phmoll/safprebuild/, last visited 2019-07-08), running the image takes only an instant. The execution of the Dockerfile takes, depending on the Internet connection and the computing power of the host system, between 15 and 60 min.
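To make the structure of such a build description concrete, the following is a minimal, hypothetical sketch of an Ubuntu-based Dockerfile for an ndnSIM/SAF-like environment; the package list and build steps are assumptions for illustration, and the Dockerfile published with SAF remains the authoritative version:

    # Hypothetical sketch only; the actual SAF Dockerfile differs in its
    # dependency list and build steps (see https://github.com/danposch/SAF).
    FROM ubuntu:16.04

    # Compilers and libraries commonly needed to build ns-3/ndnSIM (assumed list)
    RUN apt-get update && apt-get install -y \
        git build-essential python libsqlite3-dev libboost-all-dev libssl-dev

    # Fetch the forwarding strategy; patching it into ndnSIM and building with WAF
    # are omitted here for brevity
    RUN git clone https://github.com/danposch/SAF.git /saf
    WORKDIR /saf

    CMD ["/bin/bash"]

Building this description with docker build and publishing it with docker push would then yield a large, ready-to-run image of the kind described above.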
Once the image is running, the results of the paper can be reproduced or new experiments with SAF can be conducted using the provided network simulator.

FREVO
FREVO (Sobe, Fehérvári & Elmenreich, 2012) is an open source framework that helps engineers and scientists with evolutionary design or optimization tasks, such as creating a desired swarm behavior. The major feature of FREVO is the component-wise decomposition and separation of the key building blocks of an optimization task. This structure enables the components to be designed separately, allowing the user to easily swap and evaluate different configurations and methods or to connect an external simulation tool. FREVO is typically used for evolving swarm behavior for a given simulation (Fehervari & Elmenreich, 2010; Monacchi, Zhevzhyk & Elmenreich, 2014). FREVO is a mid-sized project with 50k lines of mostly Java code, having a graphical interface as well as a mode for pure command-line operation, for example, to be used on a simulation server. The component-based structure makes it easy to add and remove components (e.g., a simulation, a type of neural network, an optimization algorithm), which sometimes creates some effort when setting up FREVO anew. FREVO was tested and analyzed with the following three tools:

FREVO with build script
Until recently, FREVO was provided as a downloadable zip file (http://frevo.sourceforge.net/, last visited: 2019-07-08) that included the sources of the main program and additional components together with an Ant build file. However, there had been problems in the past with different Java language versions. A further problem can be dependencies on third-party tools or libraries, which are not automatically maintained by this type of build script.

FREVO with Gradle
An analysis showed that the current structure of FREVO, especially due to its component plug-in architecture, conflicts with the expected and possible project structure for Gradle.

FREVO with Docker
Since FREVO and its components are open source under GPL V3, there was neither a legal nor a technical problem in putting it into a Docker container. We used an Ubuntu Linux system that was provided by Docker. OpenJDK 8 was installed as the Java runtime environment. After installing FREVO in the Docker system, the image was pushed onto the free Docker Hub hosting platform (https://hub.docker.com/r/frodewin/frevo/, last visited: 2019-07-08). To reproduce a result made with FREVO, it is thus possible (given that Docker is installed) to download and execute the respective Docker container. In general, the result was easily usable, apart from some effort to get a graphical display working. The parallelization of simulations, which is a natural ability of FREVO, works fine inside a Docker container as well. The Docker image containing FREVO has a compressed size of 223 MB, which is mostly due to the files of Ubuntu Linux.

LireSolr
LireSolr (Lux & Macstravic, 2014) is an extension for the popular Apache Solr (http://lucene.apache.org/solr/, last visited 2019-07-08) text retrieval server that adds functionality for visual information retrieval.
It adds support for indexing and searching images based on image features and is, for instance, in use by the World Intellectual Property Organisation, a UN agency, within the Global Brand DB (http://www.wipo.int/branddb/en/, last visited 2019-07-08) for the retrieval of visually similar trademarks. LireSolr brings the functionality of the LIRE library (Lux & Marques, 2013) to the popular search server. While LIRE is a library for visual information retrieval based on inverted indexes, it is research-driven and intended to be integrated into local Java applications. Apache Solr is more popular than the underlying inverted index system, Lucene, as it modularizes retrieval functionality by providing a dedicated retrieval server with cloud functionality and multiple APIs to access it for practical use. LireSolr is intended for people who need out-of-the-box visual retrieval methods without having to integrate a library into their applications. It can be called from any mobile, server, or desktop platform and runs on systems with a Java 8 runtime. This flexibility is valued among researchers as well as practitioners. LireSolr is hosted on GitHub (https://github.com/dermotte/liresolr, last visited 2019-07-08). Gradle and Docker build files are part of the repository.

LireSolr with Gradle
The standard method of building LireSolr is by using Gradle. Current IDEs can import Gradle build files, so any task can be done from within the IDE. While Gradle makes sure that the right version of each library is downloaded and everything is ready to build, installing the new features into the Solr server has to be done manually. The supporting task in Gradle just exports the necessary JAR files. The user or developer has to install Solr, then create a Solr core, change two configuration files, copy the JARs, and restart the server to complete the installation. While these steps are extensively described in the documentation, they still present a major effort for new users without prior experience in retrieval in general or in using Apache Solr.

LireSolr with Docker
As LireSolr extends Solr with additional functionality, the intuitive way to create a Docker image is to extend the Solr Docker image. The Dockerfile defining the build of the Docker image is part of the LireSolr repository, and a specific Gradle task builds and prepares the relevant files for the creation of the image. This includes the aforementioned JARs and configuration files as well as a pre-prepared Solr core and a small web application as a client. The Docker container can easily be run and provides basic functionality for digital image search. Developers who just want to test LireSolr can get it running within seconds using Docker Hub (https://hub.docker.com/r/dermotte/liresolr/, last visited: 2019-07-08).

ONE TOOL TO REPRODUCE THEM ALL?
In the previous sections, we presented tools for sharing software artifacts and examples showing how the tools can be applied in order to share scientific software artifacts. In this section, we reflect on the advantages and shortcomings of the tools with respect to the results from our survey presented in the section “Survey”.
Each of the presented tools has its pros and cons. For instance, the additional effort for sharing an artifact when using a build management tool is very low, because in most cases a build management tool is already used during the creation of the artifact. In contrast, it can be challenging and time-consuming for other researchers to get the build management tool up and running, because required dependencies or the installation process may not be documented in detail. Software artifacts that are provided as virtualized containers are easy to run and provide a high degree of security, but they are inconvenient when a researcher wants to build upon previously published software artifacts. When weighing these advantages and shortcomings, we quickly see that the one tool to reproduce all our scientific results does not exist. Nevertheless, based on our findings from the survey, we now give recommendations for creating reproducible results and scientific software artifacts that can be easily used by other researchers.

The survey clearly showed that many researchers are interested in building their research on the work of others, which becomes much easier when published software artifacts can be reused. Furthermore, we saw that the average time researchers are willing to invest to get artifacts running is only about two days. Thus, we assume that it is very important for researchers to get an artifact running quickly; otherwise, they lose interest in using the artifact and start developing their own solution. When taking the demand for security into account, we see that virtualized containers appear to be a good choice. The provided software artifact can be executed without the overhead of installing it, by simply running the container. Furthermore, it is possible to become familiar with the artifact in the virtualized environment and check whether the artifact is suitable as a basis for one’s own work. Our findings mostly reflect the perspective of researchers working on original research questions. The role of a reviewer in a peer-review process has not been discussed in the same detail, but we assume that reviewers’ demands are similar to researchers’ demands, albeit with even tighter time constraints.

When researchers decide to build on an artifact, it may be cumbersome to continue using a virtualized container, because altering a software artifact is more convenient on a local system. This means that the researcher has to install the artifact locally, without a virtualized container. According to our study, researchers currently prefer providing detailed instructions and build tools. Relying solely on this information, installing the artifact can be challenging, as already discussed. Dockerfiles are one solution to overcome this issue. As already explained, a Dockerfile is a kind of construction guideline for Docker containers. It contains all command-line directives required to build a Docker image and can therefore be seen as an exact procedure for the local installation of an artifact. By following the commands listed in the Dockerfile, local installation of a software artifact is relatively easy. These commands ensure that all dependencies are installed correctly; otherwise it would not be possible to create the Docker container.
This means that by providing a Dockerfile, both options become possible: software artifacts can be executed in a secure container, but they can also be installed locally by following the instructions from the Dockerfile. Another finding of our survey is that the long-term availability of software artifacts is important for researchers and should be around 10 years. In addition, the ACM Artifact Review and Badging guideline (https://www.acm.org/publications/policies/artifact-review-badging, last visited 2019-07-08) emphasizes the importance of being able to reproduce results after a long time by providing a separate badge for artifacts that are archived in archival repositories. When looking at the presented tools, we see technical as well as legal issues on the way to achieving long-term availability. Although services such as Code Ocean (https://codeocean.com/, last visited 2019-07-08) or Dryad (https://datadryad.org/, last visited 2019-07-08) are available for archiving software artifacts, the following points should be kept in mind. Tools such as Gradle rely on online repositories for downloading required dependencies. If only one of the required dependencies becomes unavailable, the build fails. This means that all dependencies, as well as all required tools, have to be included when the artifact is archived. This leads to technical issues, because the number of tools required to reproduce a result could be tremendously high. For instance, if a required operating system or compiler is no longer available, the results cannot be reproduced, which means that even these tools must be archived. Besides this technical issue, packaging these tools could also lead to legal issues when tools with restrictive licenses are used. Furthermore, operators of platforms for archiving software could decide to discontinue their service, which would result in the loss of all artifacts archived with this provider.

CONCLUSION
This article reflects on the reproducibility of research results in computer science. We collected the opinions and requirements of 125 researchers via an online survey. Focusing on our research question RQ1 (“To what extent is reproducibility of results based on software artifacts important in the field of computer science and in related research areas?”), the analysis of the survey’s results confirmed our initial assumption that the reproducibility of research results is an important concern in computer science research. Furthermore, researchers not only want to reproduce results but also want to base their own work on the results of others. The main reasons for the importance of reproducibility are improved credibility and improved understanding of results. Using established commercial models, as they are common for scientific publications, was viewed critically. Moreover, the majority of survey participants showed a willingness to use open source tools for making their results accessible and reproducible. Based on the researchers’ opinions, we created guidelines that aid researchers in making their research reproducible. The applicability of various tools for publishing software artifacts was discussed while keeping our guidelines in mind.
Scientific artifacts from different research areas in computer science were used to test the applicability of the discussed tools for sharing reproducible research. Regarding research question RQ2 (“What tools can be used to support reproducibility?”), we identified a conflict between the comprehensibility and simplicity of using a tool and the security measures needed to avoid compromising one’s system when testing foreign code. Available tools provide a variety of possible solutions; however, we could not identify a single tool fulfilling all requirements. Finally, we discussed our findings and concerns regarding the process of publishing reproducible research. According to our study, the long-term availability of reproducible results is of great importance to many researchers, but we identified open issues in achieving availability over longer periods. Even if reproducibility of research is not common practice yet, we recognized a strong positive shift toward reproducible research, backed not only by individual researchers but also by renowned scientific journals and publishers.

With this work already leading to new insights regarding reproducibility, it also lays the groundwork for future research. Based on the survey and the discussions regarding its interpretation, we identified a researcher’s context as a potentially highly influential factor in the view on reproducibility. How do, for instance, not only the cultural, geographical, and project background of a researcher, but also the research area and the research communities influence the willingness to invest extra time in making results reproducible? Future work could also address the question of whether and to what extent project size influences the willingness to invest time into reproducing work.

ACKNOWLEDGEMENTS
We would like to thank all participants of the survey for their valuable input and all colleagues who helped us by sharing their practical experience and through discussions. We thank the anonymous reviewers for their constructive comments on a previous version of the article. We are grateful to Lizyy Dawes for proofreading an earlier version of this article.

ADDITIONAL INFORMATION AND DECLARATIONS

Funding
This work was supported by the Austrian Science Fund (FWF) under the CHIST-ERA project 496 CONCERT (project no. I1402). There was no additional external funding received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Grant Disclosures
The following grant information was disclosed by the authors:
Austrian Science Fund (FWF) under the CHIST-ERA project 496 CONCERT: I1402.

Competing Interests
The authors declare that they have no competing interests.

Author Contributions
• Wilfried Elmenreich conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, performed the computation work, authored or reviewed drafts of the paper, approved the final draft.
• Philipp Moll conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, prepared figures and/or tables, performed the computation work, authored or reviewed drafts of the paper, approved the final draft.
• Sebastian Theuermann conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, performed the computation work, authored or reviewed drafts of the paper, approved the final draft.
• Mathias Lux conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, prepared figures and/or tables, performed the computation work, authored or reviewed drafts of the paper, approved the final draft.

Data Availability
The following information was supplied regarding data availability:
Links to the tools used are available in the article. This includes data and code produced by us as well as code from others that we have built upon.
The following repositories contain parts deposited for the article:
Docker Images:
https://hub.docker.com/r/phmoll/saf-prebuild/
https://hub.docker.com/r/dermotte/liresolr/
https://hub.docker.com/r/frodewin/frevo/
We used a version of the FREVO software, SourceForge: http://frevo.sourceforge.net/.

Supplemental Information
Supplemental information for this article can be found online at http://dx.doi.org/10.7717/peerj-cs.240#supplemental-information.

REFERENCES
Allen A, Schmidt J. 2015. Looking before leaping: creating a software registry. Journal of Open Research Software 3(1):e15 DOI 10.5334/jors.bv.
Arbona A, Artigues A, Bona-Casas C, Massó J, Miñano B, Rigo A, Trias M, Bona C. 2013. Simflowny: a general-purpose platform for the management of physical models and simulation problems. Computer Physics Communications 184(10):2321–2331 DOI 10.1016/j.cpc.2013.04.012.
Boettiger C. 2015. An introduction to Docker for reproducible research. SIGOPS Operating Systems Review 49(1):71–79.
Bonaventure O. 2017. The January 2017 issue. SIGCOMM Computer Communication Review 47(1):1–3.
Brinckman A, Chard K, Gaffney N, Hategan M, Jones MB, Kowalik K, Kulasekaran S, Ludäscher B, Mecum BD, Nabrzyski J, Stodden V, Taylor IJ, Turk MJ, Turner K. 2018. Computing environments for reproducibility: capturing the “whole tale”. Future Generation Computer Systems 94:854–867 DOI 10.1016/j.future.2017.12.029.
Casadevall A, Fang FC. 2010. Reproducible science? Infection and Immunity 78(12):4972–4975.
Chirigati F, Shasha D, Freire J. 2013. ReproZip: using provenance to support computational reproducibility. In: Proceedings of the 5th USENIX Workshop on the Theory and Practice of Provenance, TaPP ’13. Berkeley: USENIX Association, 1:1–1:4.
Cito J, Ferme VC, Gall H. 2016. Using Docker containers to improve reproducibility in software and web engineering research. In: 2016 IEEE/ACM 38th International Conference on Software Engineering Companion (ICSE-C). New York: ACM, 609–612.
Fehervari I, Elmenreich W. 2010. Evolving neural network controllers for a team of self-organizing robots. Journal of Robotics 2010:10 DOI 10.1155/2010/841286.
Flittner M, Mahfoudi MN, Saucez D, Wählisch M, Iannone L, Bajpai V, Afanasyev A. 2018. A survey on artifacts from CoNEXT, ICN, IMC, and SIGCOMM conferences in 2017. SIGCOMM Computer Communication Review 48(1):75–80.
Hanson B, Sugden A, Alberts B. 2011. Making data maximally available. Science 331(6018):649 DOI 10.1126/science.1203354.
Janin Y, Vincent C, Duraffort R. 2014. CARE, the comprehensive archiver for reproducible execution. In: Proceedings of the 1st ACM SIGPLAN Workshop on Reproducible Research Methodologies and New Publication Models in Computer Engineering, TRUST ’14. New York: ACM, 1:1–1:7.
Lux M, Macstravic G. 2014. The LIRE request handler: a Solr plug-in for large scale content based image retrieval. In: Gurrin C, Hopfgartner F, Hurst W, Johansen H, Lee H, O’Connor N, eds. MultiMedia Modeling: 20th Anniversary International Conference, MMM 2014, Dublin, Ireland, January 6–10, 2014. Cham: Springer International Publishing, 374–377.
Lux M, Marques O. 2013. Visual information retrieval using Java and LIRE. Synthesis Lectures on Information Concepts, Retrieval, and Services 5(1):1–112.
Mastorakis S, Afanasyev A, Moiseenko I, Zhang L. 2016. ndnSIM 2: an updated NDN simulator for NS-3. Technical Report NDN-0028, Revision 2, NDN.
McMurdie PJ, Holmes S. 2013. phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data. PLOS ONE 8(4):1–11 DOI 10.1371/journal.pone.0061217.
Monacchi A, Zhevzhyk S, Elmenreich W. 2014. HEMS: a home energy market simulator. Computer Science – Research and Development 31(3):111–118.
Morin A, Urban J, Adams PD, Foster I, Sali A, Baker D, Sliz P. 2012. Shining light into black boxes. Science 336(6078):159–160 DOI 10.1126/science.1218263.
Peng RD. 2011. Reproducible research in computational science. Science 334(6060):1226–1227 DOI 10.1126/science.1213847.
Poline J-B, Breeze JL, Ghosh S, Gorgolewski K, Halchenko YO, Hanke M, Haselgrove C, Helmer KG, Keator DB, Marcus DS, Poldrack RA, Schwartz Y, Ashburner J, Kennedy DN. 2012. Data sharing in neuroimaging research. Frontiers in Neuroinformatics 6:9 DOI 10.3389/fninf.2012.00009.
Posch D, Rainer B, Hellwagner H. 2017. SAF: stochastic adaptive forwarding in named data networking. IEEE/ACM Transactions on Networking 25(2):14 DOI 10.1109/TNET.2016.2614710.
Ram K. 2013. Git can facilitate greater reproducibility and increased transparency in science. Source Code for Biology and Medicine 8(1):7 DOI 10.1186/1751-0473-8-7.
Shrader-Frechette K, Oreskes N. 2011. Symmetrical transparency in science. Science 332(6030):663–664 DOI 10.1126/science.332.6030.663.
Sobe A, Fehérvári I, Elmenreich W. 2012. FREVO: a tool for evolving and evaluating self-organizing systems. In: Proceedings of the 1st International Workshop on Evaluation for Self-Adaptive and Self-Organizing Systems, Lyon. 105–110.
Steiniger S, Hunter AJ. 2013. The 2012 free and open source GIS software map – a guide to facilitate research, development, and adoption. Computers, Environment and Urban Systems 39:136–150 DOI 10.1016/j.compenvurbsys.2012.10.003.
Vitek J, Kalibera T. 2011. Repeatability, reproducibility and rigor in systems research. In: Proceedings of the 11th International Conference on Embedded Software, EMSOFT 2011, Taipei. 33–38.
Walters WP. 2013. Modeling, informatics, and the quest for reproducibility. Journal of Chemical Information and Modeling 53(7):1529–1530 DOI 10.1021/ci400197w.
Witten DM, Tibshirani R. 2013. Scientific research in the age of omics: the good, the bad, and the sloppy. Journal of the American Medical Informatics Association 20(1):125–127 DOI 10.1136/amiajnl-2012-000972.
Zhang L, Afanasyev A, Burke J, Jacobson V, Claffy K, Crowley P, Papadopoulos C, Wang L, Zhang B. 2014. Named data networking. SIGCOMM Computer Communication Review 44(3):66–73.