Introduction

The RforProteomics package distributes and extends the use-cases described in the Using R and Bioconductor for proteomics data analysis manuscript (pubmed and pre-print). The package illustrates how R and a selection of dedicated packages can be used to access mass-spectrometry proteomics data, manipulate and visualise it, process label-free and labelled quantitative data, and analyse the quantitation data.

The package will be updated beyond the content of the manuscript to keep up-to-date with progress in the area. The GitHub page can be used to edit the wiki and file new issues related to the package itself or to general needs for proteomics that should be addressed in R.

It would be great if this work could stimulate wider participation in using R and developing R packages for proteomics, and promote interaction between computational biologists working in the field of proteomics, in particular by facilitating interoperability between their software. The rbioc-sig-proteomics group has tentatively been set up to provide a forum for questions and discussion for interested parties. The official Bioconductor support site is the channel of choice to ask questions about specific Bioconductor packages. Do not hesitate to get in touch for questions, comments or further suggestions. Notes about plans, ideas and directions for R/Bioconductor and proteomics can be contributed to the RforProteomics wiki.

Data and vignette

The package uses the dataset PXD000001 from the ProteomeXchange repository in several examples. The data can be queried and downloaded from R with the rpx package. The RforProteomics vignette is a detailed document containing the exact code to reproduce all the analyses presented in the manuscript as well as other application examples. It can be accessed once the package is installed (see below) with the RforProteomics() function. Alternatively, the vignettes can be read online here and here.
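As a minimal sketch of the data access described above, the following lists the files of the PXD000001 dataset. It assumes that the rpx package is installed and that a network connection to ProteomeXchange is available. The variable names px and fls are illustrative only.

```r
## Sketch only: requires the rpx package and network access to ProteomeXchange.
library("rpx")

## Create a reference to the PXD000001 dataset
px <- PXDataset("PXD000001")

## Retrieve the names of the files available for this dataset
fls <- pxfiles(px)
fls
```

Individual files can then be downloaded with the pxget() function from the same package; see its manual page for the arguments supported by the installed version.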

A second vignette, RProtVis, focuses on the visualisation of mass spectrometry and proteomics data with R and Bioconductor. From R, it is currently only available with Bioconductor >= 3.0 using the RProtVis() function. It can also be consulted online on the ‘RforProteomics’ development version page.

Installation

The package is available on Bioconductor (version >= 2.13). To install the package and its documentation, start R (>= 3.0.0 required) and type:

source("http://bioconductor.org/biocLite.R")
biocLite("RforProteomics")

To install all dependencies (75+ packages, including RforProteomics) and fully reproduce the code in the vignettes, replace the last line in the code chunk above with:

biocLite("RforProteomics", dependencies = TRUE)
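Once installation has completed, a quick check such as the following can confirm that the package loads and open the main vignette. This is a sketch that assumes the installation above succeeded; it uses only the RforProteomics() function mentioned earlier and the base R packageVersion() function.

```r
## Sketch only: assumes RforProteomics has been installed as described above.
library("RforProteomics")

## Report the installed version of the package
packageVersion("RforProteomics")

## Open the main vignette
RforProteomics()
```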

Collaborative editing

The community and package authors are invited to contribute to the package. If you have or know of a package of interest, please fork the repository, add a new section to the vignette and send a pull request. If you update the vignette, please also add yourself as a contributor to the package.

There is also a wiki that any GitHub user can edit to gather specific R/Bioconductor proteomics needs and ideas.

Help

To obtain help or additional information about the RforProteomics package, please contact me. For help about the packages presented in the vignette or manuscript, please refer to the R mailing list, Bioconductor mailing list (if the package is in Bioconductor) and/or the respective package authors.

For general resources about R, see the corresponding section in the vignettes and the TeachingMaterial repository.