Reproducible research - What should our goals be?
On Friday, I’ll be at the Reproducible Research workshop at the CRUK Cambridge Institute. I’ll be participating in a group discussing What should our goals be?, with James Brenton, Nicole Janz, Natasha Karp and Fiona Nielsen. I thought I would write down my thoughts beforehand. In particular, before even considering what one might be aiming for, I thought it would be useful to consider what one could be aiming for, what levels of reproducibility/replication we could consider.
Nomenclature
EDIT: The terms replicate/replication and reproduce/reproducibility have been confused and used to mean opposite things (read this post for a detailed review). I have edited the post below to follow the nomenclature recommended by Mark Liberman based on J. Claerbout, V. Stodden and R. Peng.
Update (2018-02-17): Here’s a review of Terminologies for Reproducible Research by Lorena A. Barba.
Update (2021-01-18): In this article Lorena Barba narrates “the history of the reproducibility terminology mixup, and the steps that led to the reversal of the ACM definitions”.
The following nomenclature is based on a talk by Carole Goble at the Software Sustainability Collaborative Workshop in 2014. Here is a list of things that researchers should consider being able to do:
- Repeat my experiment, i.e. obtain the same tables/graphs/results using the same setup (data, software, …) in the same lab or on the same computer. That’s basically re-running one of my analyses some time after I originally developed it.
- Reproduce an experiment (not mine), i.e. obtain the same tables/graphs/results in a different lab or on a different computer, using the same setup (the data would be downloaded from a public repository and the same software, but possibly a different version or a different OS, is used). I suppose we should differentiate reproduction using a fresh install from reproduction using a virtual machine or docker image that replicates the original setup.
- Replicate an experiment, i.e. obtain the same (or similar enough) tables/graphs/results in a different setup. The data could still be downloaded from the public repository, or possibly re-generated/re-simulated, and the analysis would be re-implemented based on the original description. This requires openness, and one would clearly not be allowed to use a black box approach (VM, docker image) or to just re-run a script.
- Finally, re-use the information/knowledge from one experiment to run a different experiment with the aim to confirm results from scratch.
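To make the first level (repeat) a bit more concrete, here is a minimal Python sketch of how one might check that re-running an analysis gives the same results. The `run_analysis` function is a hypothetical stand-in for a real analysis, and the seed value and the choice to fingerprint a JSON serialisation of the results are assumptions made purely for illustration.

```python
import hashlib
import json
import random


def run_analysis(seed):
    """Hypothetical stand-in for an analysis: summarise simulated data."""
    rng = random.Random(seed)  # local, seeded generator: no hidden global state
    values = [rng.gauss(0, 1) for _ in range(1000)]
    mean = sum(values) / len(values)
    return {"n": len(values), "mean": round(mean, 6)}


def fingerprint(results):
    """Hash a canonical JSON serialisation of the results."""
    payload = json.dumps(results, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Run the same analysis twice with the same setup and compare fingerprints.
first = fingerprint(run_analysis(seed=42))
second = fingerprint(run_analysis(seed=42))
print("repeatable:", first == second)
```

Storing such a fingerprint next to the results would let someone on a different computer compare their own run against it, which is one possible check for the second level (reproduce). Exact equality can be too strict across software versions or operating systems, so this is only a starting point.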
An important distinction between these different aspects is that,
paraphrasing C. Drummond (ref below),
replicability and re-use require changes, while repeatability
and reproducibility avoid them.
Repeat and reproduce are technical challenges that
are arguably easy, or rather easier, to reach. On the contrary,
replicate and re-use are scientific
challenges. Ideally, we would want to aim for the latter to identify
scientific truths that hold beyond the comfort of one’s setup. One
could even provocatively argue that the former are not very
interesting: what is the benefit of repeating something that is
potentially wrong? Even if replication and re-use
are the ultimate goal of Science, repeatability and
reproducibility are still essential. How much trust
can we have in the Science if the results vary from day to day, if
even the technological challenges are a genuine hurdle?
We all know that even reproducibility for medium-size
computational projects is difficult. Even for trained
computational scientists, armed with an arsenal of tools such as git,
GitHub, knitr, docker, …, assuring repeatability and
reproducibility requires substantial investment.
To conclude, I would argue that we, as individuals, should definitely
assure repeatability and certainly aim for
reproducibility, but not forget that what we, as a scientific
community, should really aim for is
replication and re-use.
References
Carole Goble. Results may vary. Reproducibility, Science, Software Collaborations Workshop, Oxford, 26 March 2014, slides.
Drummond C. Replicability is not Reproducibility: Nor is it Good Science, online
Peng RD. Reproducible research in computational science. Science. 2011 Dec 2;334(6060):1226-7. doi: 10.1126/science.1213847. PMID:22144613; PMCID:PMC3383002.
Edit: Another great reference contributed by Ben Marwick: Replicability vs. reproducibility - or is it the other way around?