TeachingMaterial
This repository is an aggregator for various R, make and git/github
teaching material. Most of the courses are taught at the University of
Cambridge, UK, and some have been adapted and exported outside. We
would also like to acknowledge contributions from
Aleksandra Pawlik (Software Sustainability Institute),
Raphael Gottardo (Fred Hutchinson Cancer Research Center) and
Karl Broman (University of Wisconsin-Madison).
Each material subdirectory has its own repository; TeachingMaterial
aggregates a snapshot as a central entry point. Aggregation is done
using git-subtree (see the administration page for details). The local
copies linking to external repositories are prefixed with an underscore.
Unless otherwise stated, all material is licensed under a
Creative Commons Attribution-ShareAlike 3.0 License.
This means you are free to copy, distribute, transmit and adapt the
work, as long as you cite its origin and, if you redistribute it, do so
under the same license.
See also the TeachingMaterial wiki for meta-information about the
repository and general R installation material and links.
If you like this material and/or this initiative, do not hesitate to
let us know by starring the repo, tweeting about it and sharing it
with your colleagues.
Material
Mass spectrometry and proteomics using R/Bioconductor
- Description: In this course, we will use R/Bioconductor packages to
explore, process, visualise and understand mass spectrometry-based
proteomics data, starting with raw data, and proceeding with
identification and quantitation data, discussing some of their
peculiarities compared to sequencing data along the way. The
workflow is aimed at a beginner to intermediate level, such as, for
example, seasoned R users who want to get started with mass
spectrometry and proteomics, or proteomics practitioners who want to
familiarise themselves with R and Bioconductor infrastructure.
- Direct link: http://bit.ly/bioc-ms-prot (see also this 3-day workshop)
- Author: Laurent Gatto
- Original repository: https://github.com/lgatto/bioc-ms-prot
- More details: README
Visualising biomolecular data
- Description: This Visualisation of biomolecular data course is aimed
at people who are already familiar with the R language and syntax,
and who would like to get a hands-on introduction to visualisation,
with a focus on biomolecular data in general, and proteomics in
particular. This course is meant to be mostly hands-on, with an
intuitive understanding of the underlying techniques.
- Direct link: http://bit.ly/biomolvis
- Author: Laurent Gatto
- Original repository:
https://github.com/lgatto/VisualisingBiomolecularData
A gentle introduction to git and Github
- Description: The WSBIM1207 course is an introduction to
bioinformatics (and data science) for biology and biomedical
students. It introduces bioinformatics methodology and technologies
without relying on any prerequisites. The aim of this course is for
students to be in a position to understand important notions of
bioinformatics and tackle simple bioinformatics-related problems in
R, in particular to develop simple R analysis scripts and
reproducible analysis reports to interrogate, visualise and
understand data in a tidy tabular format.
- Direct link: http://bit.ly/WSBIM1207
- Author: Laurent Gatto
- Description: The WSBIM1322 course teaches the basics of statistical
data analysis applied to high throughput biology. It is aimed at
biology and biomedical students who are already familiar with the R
language (see the prerequisites section below). The students will
familiarise themselves with statistical learning concepts such as
unsupervised and supervised learning and hypothesis testing, and
extend their understanding and practice of R data structures and
programming and of the Bioconductor project.
- Direct link: http://bit.ly/WSBIM1322
- Author: Laurent Gatto
Advanced R programming
- Description: A two-day course taught on 3-4 April 2017, teaching
advanced techniques in writing reliable, robust code in R.
- Authors: Laurent Gatto and Robert Stojnić.
- Original repository:
https://github.com/lgatto/2017-04-03-adv-r-progr-EMBL
- Content: The material provides the opportunity to gain experience
and understanding of object-oriented programming, packaging your
code for distribution, advanced approaches for data visualisation,
unit testing, and debugging.
R debugging and robust programming
- Description: A 2-day workshop taught on 25-26 February 2016 at
the EMBL, Heidelberg. The course aims at teaching participants
debugging techniques and good practice in writing reliable, robust
code.
- Author: Laurent Gatto, based on
previous content by Laurent Gatto and Robert Stojnić, and
Advanced R, by Hadley Wickham.
- Original repository:
https://github.com/lgatto/2016-02-25-adv-programming-EMBL
- Content: Part I: Coding style(s), Interactive use and programming,
Environments, Tidy data, Computing on the language. Part II:
Functions, Robust programming with functions, Scoping, Closures,
High-level functions, Vectorisation. Part III: Defensive
programming, Debugging: techniques and tools, Condition handling:
try/tryCatch, Unit testing. Part IV: Benchmarking, Profiling,
Optimisation, Memory, Rcpp.
- More details: https://github.com/lgatto/2016-02-25-adv-programming-EMBL/blob/master/README.md
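As an illustration of the condition handling topic listed in Part III, here is a minimal sketch of `tryCatch` in base R. It is not taken from the course material; the helper name `safe_as_number` and its fall-back-to-`NA` behaviour are assumptions made for this example.

```r
# Hypothetical helper (not from the course): convert a single string to a
# number, returning NA instead of failing or warning.
safe_as_number <- function(x) {
  tryCatch(
    {
      # Defensive check: signal an error early on unexpected input.
      if (!is.character(x) || length(x) != 1L)
        stop("'x' must be a single character string")
      as.numeric(x)
    },
    warning = function(w) {
      # as.numeric("abc") signals a coercion warning; handle it here.
      NA_real_
    },
    error = function(e) {
      # Report the problem and return a well-defined value.
      message("Invalid input: ", conditionMessage(e))
      NA_real_
    }
  )
}

safe_as_number("3.14")  # valid input
safe_as_number("abc")   # coercion warning is caught
safe_as_number(42)      # error from the defensive check is caught
```

Handling the warning and the error separately keeps the two failure modes distinguishable, which is one possible design; a single `condition` handler would also work if they do not need to be told apart.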
rbc
- Description: Software Carpentry R bootcamp, Jan 7-8, 2014,
Cambridge, UK and 6-7 Nov 2014, Zurich, Switzerland.
- Authors: Stephen Eglen, Laurent Gatto, Robert Stojnić and
Aleksandra Pawlik
- Original repository: https://github.com/lgatto/rbc/
- Content: R programming, plotting, git/github (via software
carpentry), make, shell and knitr, profiling, testing, debugging.
spr
Biostat-578
github_tutorial
minimal_make
QuickPackage
R package development
Benchmarking, profiling and optimisation
- Description: Benchmarking, profiling and optimisation
- Author: Laurent Gatto
- Original repository: https://github.com/lgatto/R-bmark-prof-optim
- More details: https://github.com/lgatto/R-bmark-prof-optim#readme
- Read the material
RBasics
RIntro
basicr
R functional programming
R vectorisation
R debugging
R parallel
R object oriented programming
- Description: Covers S3, S4 and S4 Reference Classes OO programming using DNA/RNA sequence data manipulation as a working example.
- Author: Laurent Gatto and Robert Stojnić
- Original repository: https://github.com/lgatto/roo
- More details: README
- Download the pdf
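To give a flavour of S4 programming with sequence data, here is a minimal sketch using only the `methods` package. The class `DnaSeq` and the generic `transcribe` are hypothetical names chosen for this example; they are not claimed to match the classes or functions used in the course material.

```r
library(methods)

# Hypothetical S4 class for a DNA sequence (illustrative only).
setClass("DnaSeq",
         representation(id = "character", sequence = "character"),
         validity = function(object) {
           if (length(object@sequence) != 1L)
             return("'sequence' must be a single string")
           if (grepl("[^ACGT]", object@sequence))
             return("'sequence' may only contain A, C, G and T")
           TRUE
         })

# A generic function and a method that dispatches on the class.
setGeneric("transcribe", function(object) standardGeneric("transcribe"))

setMethod("transcribe", "DnaSeq",
          function(object) chartr("T", "U", object@sequence))

x <- new("DnaSeq", id = "example", sequence = "ATGCTT")
transcribe(x)
```

The validity function runs when the object is created with `new()`, so a sequence containing other letters is rejected before any method can be called on it.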
One day course on R OO programming and package development
- Description: A short 1-day course about R object-oriented
programming, package development and various other topics (C
interface, unit testing, debugging). See also other more recent and
detailed lessons about these topics on this page.
- Author: Laurent Gatto and Robert Stojnić
- Original repository: https://github.com/lgatto/advr1
- More details: README
- Download the pdf
Short S4 tutorial
R programming tutorial
- Description: A tutorial on R programming of intermediate level, focusing on some aspects of functional programming, profiling, testing, debugging and parallelisation. Used as a more advanced R programming lecture during the CSAMA workshop.
- Author: Laurent Gatto
- Original repository: https://github.com/lgatto/R-programming
- More details: README
- Download the pdf
R and C/C++
visualisation
sequences
- Description: Educational package used in R to illustrate OO programming and package development
- Author: Laurent Gatto and Robert Stojnić
- Original repository: https://github.com/lgatto/sequences
- More details: DESCRIPTION
- Installation from CRAN:
  install.packages("sequences")
- Installation from github (requires R and C/C++ building tools):
  library(devtools)
  install_github("lgatto/sequences")
- Description: A short course for a Bioinformatics minor at the
University of Cambridge. What is open science (data, source/code,
access), and how can we enable it? What is reproducible research,
and why do we need it and how can we implement it? The objective is
to familiarise students with concepts and tools of open science and
reproducible research.
- Author: Laurent Gatto
- Original repository:
https://github.com/lgatto/open-rr-bioinfo-best-practice
Beginner’s statistics in R
- Description: A 2.5-day introductory course focusing on R and
basic statistics for proteomics scientists. The R introduction
material is based on the Data Carpentry R analysis lesson
and leads to the introduction and application of basic univariate
statistics using proteomics data. The course was developed and
taught as part of the May Institute at Northeastern University,
Boston, MA, in May 2017.
- Authors: Laurent Gatto and Meena Choi, with material from the Data
Carpentry R lesson.
- Original repository:
https://github.com/lgatto/2017-05-03-RstatsIntro-NEU
Statistics primer
- Description: A short course for a Bioinformatics minor at the
University of Cambridge. Introducing basic concepts in statistics:
experimental design, randomisation, technical and biological
variation, power analysis, hypothesis testing, confidence interval,
what is a p-value, false discovery rate, multiple testing
adjustment, dangers of uninformed statistical practice. The objective
is to familiarise students with basic statistical concepts and
introduce them to statistical thinking. They should be able to critically
assess an experimental design and the reporting of a simple
statistical analysis.
- Author: Laurent Gatto
- Original repository: https://github.com/lgatto/statistics-primer
- Slides on experimental design and significance testing, and a
practical.
Inspection, visualisation and analysis of quantitative proteomics data
R and Bioconductor for Mass Spectrometry and Proteomics data analysis
An Introduction to Machine Learning with R
- Description: This introductory workshop on machine learning with R
is aimed at participants who are not experts in machine learning
(introductory material will be presented as part of the course), but
have some familiarity with scripting in general and R in
particular. The workshop will offer a hands-on overview of typical
machine learning applications in R, including unsupervised
(clustering, such as hierarchical and k-means clustering, and
dimensionality reduction, such as principal component analysis) and
supervised (classification and regression, such as K-nearest
neighbour and linear regression) methods. We will also address
questions such as model selection using cross-validation.
- Author: Laurent Gatto
- Original repository:
https://lgatto.github.io/IntroMachineLearningWithR/
- Direct access to the material: bookdown formatted
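The unsupervised methods named in the description can be tried with base R alone. The following is a minimal sketch, not taken from the workshop material: it assumes the built-in `iris` data set as example data and picks three clusters because three species are known.

```r
# Built-in iris data: four numeric measurements plus a species label.
# Scale the measurements so that each variable has unit variance.
x <- scale(iris[, 1:4])

# k-means clustering; set a seed because the starting centres are random.
set.seed(1)
km <- kmeans(x, centers = 3, nstart = 10)

# Compare the clusters with the known species (not used for clustering).
table(cluster = km$cluster, species = iris$Species)

# Principal component analysis on the same scaled data.
pca <- prcomp(x)

# Proportion of variance explained by the first two components.
summary(pca)$importance["Proportion of Variance", 1:2]
```

Supervised methods and cross-validation follow the same pattern but need a held-out subset of the data to estimate how well a model generalises.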
License
We try to only aggregate material that is openly available, generally
under a Creative Commons Attribution license, which gives you the right
to share and adapt the material as long as you credit the original
author(s). Please refer to the original repository for details.