Rmarkdown for reproducible research

Laurent Gatto

1 July 2019

Learning Objectives

References

The Reproducible research chapter in the Introduction to Bioinformatics course at http://bit.ly/WSBIM1207

Direct link: https://uclouvain-cbio.github.io/WSBIM1207/sec-rr.html

Reproducible research

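The table below summarises different levels of reproducible research focusing on data and code in computational projects.

|                | Same data | Different data |
|----------------|-----------|----------------|
| Same code      | Repeat    | Reproduce      |
| Different code | Reproduce | Replicate      |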

Reproducible research refers to research that can be reproduced under various conditions and by different people.

Repeat my experiment, i.e. obtain the same tables/graphs/results using the same setup (data, software, …) in the same lab or on the same computer.

Reproduce an experiment (not one's own), i.e. obtain the same tables/graphs/results in a different lab or on a different computer, using the same or similar setup.

Replicate an experiment, i.e. obtain the same (or similar enough) tables/graphs/results in a different setup.

Finally, re-use the information/knowledge from one experiment to run a different experiment, with the aim of confirming the results from scratch.

Five selfish reasons

There are many reasons to work reproducibly, and Markowetz (2015) nicely summarises five good ones. Importantly, he stresses that the first beneficiaries of reproducible work are the students/researchers who apply these principles:

  1. Reproducibility helps to avoid disaster.
  2. Reproducibility makes it easier to write papers.
  3. Reproducibility helps reviewers see it your way.
  4. Reproducibility enables continuity of your work.
  5. Reproducibility helps to build your reputation.

knitr and rmarkdown

The rmarkdown workflow (image from RStudio)

Rmd

An Rmarkdown (Rmd) document is composed of an optional YAML header, text with simple markdown formatting, and R code chunks.

RStudio also supports Notebook documents that execute individual code chunks independently and display their output directly in the source document.
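To make the parts of an Rmd document concrete, here is a minimal sketch that writes a small Rmd file from R and compiles it. It assumes that the rmarkdown package and pandoc (bundled with RStudio) are installed. The file name, title and chunk content are hypothetical, chosen only for illustration.

```r
# Assumes the rmarkdown package and pandoc are available.
library(rmarkdown)

# Three backticks delimit a code chunk in an Rmd document.
fence <- strrep("`", 3)

rmd_lines <- c(
  "---",                             # YAML header (optional)
  "title: \"A minimal example\"",    # hypothetical title
  "output: html_document",
  "---",
  "",
  "Some narrative text with *simple* formatting.",  # markdown text
  "",
  paste0(fence, "{r}"),              # start of an R code chunk
  "summary(cars)",                   # 'cars' is a dataset shipped with R
  fence                              # end of the code chunk
)

# Hypothetical file name, written to the session's temporary directory.
rmd_file <- file.path(tempdir(), "minimal.Rmd")
writeLines(rmd_lines, rmd_file)

# render() runs the code chunks with knitr, then converts the result
# with pandoc; it returns the path of the output file.
out_file <- render(rmd_file, quiet = TRUE)
file.exists(out_file)
```

In everyday use the Rmd file is written by hand in an editor and compiled with the Knit button in RStudio; calling `render()` directly is the scripted equivalent.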

Additional features

More

Using Rmarkdown, it is also possible to produce slides, websites, complete books, interactive documents and R package vignettes.