Today, I’m going to introduce a recent feature from the
Spectra package,
namely annotated chromatograms. I’ll start by showing the result and
then explain how to produce it.
An annotated chromatogram
You will immediately recognise a chromatogram in the figure below,
showing the MS1 scan total ion current over the experiment’s retention
time. Each MS1 event is highlighted by a dot that is colour-coded
based on the number of MS2 offspring scans that led to
identifications. Grey dots are MS1 scans without any identified
precursor peaks, while coloured dots (from dark to light blue)
represent MS1 scans that have 1 to 10 (in one case, see details below)
identified precursor peaks.
You can also watch a video illustrating this post here.
The data
To generate the data for an annotated chromatogram, we need raw and
identification data, joined together. We will use the
TMT_Erwinia data available in the
msdata package. Below,
we get the raw and identification data files.
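The code chunk for this step is not shown here, so what follows is a minimal sketch of how the two files could be located, assuming the msdata package is installed. The variable names rawf and idf and the pattern strings are assumptions, chosen to match the TMT_Erwinia file names shipped with msdata.

```r
library(msdata)

## Raw data (mzML): the pattern is assumed to select the TMT_Erwinia
## acquisition whose file name ends in 20141210.
rawf <- msdata::proteomics(full.names = TRUE, pattern = "20141210")

## Identification data (mzid) from the same experiment.
idf <- msdata::ident(full.names = TRUE, pattern = "TMT_Erwinia")

basename(rawf)
basename(idf)
```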
Here, we load the raw data into R as a Spectra object:
Below, we load the identification data into R as a PSM object and
filter the PSMs:
We can now annotate the spectra with the identification data. For
details about these steps, see the documentation and examples of the
joinSpectraData()
function.
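The loading, filtering and joining steps described above could look like the sketch below. It assumes the Spectra, PSMatch, mzR and msdata packages are installed. The object names (rawf, idf, sp, id), the file patterns and the choice of spectrumId and spectrumID as join keys are assumptions, not necessarily what was used for the figure.

```r
library(Spectra)
library(PSMatch)

## Locate the files (patterns are assumptions, see above).
rawf <- msdata::proteomics(full.names = TRUE, pattern = "20141210")
idf <- msdata::ident(full.names = TRUE, pattern = "TMT_Erwinia")

## Raw data as a Spectra object (reading mzML relies on the mzR package).
sp <- Spectra(rawf)

## Identification data as a PSM object, filtered with the default
## filterPSMs() settings.
id <- filterPSMs(PSM(idf))

## Add the PSM columns to the spectra variables by matching the
## spectrum identifiers of both objects.
sp <- joinSpectraData(sp, id, by.x = "spectrumId", by.y = "spectrumID")

## The peptide sequence should now be available as a spectra variable.
"sequence" %in% spectraVariables(sp)
```

If a spectrum still matches several PSMs after filtering, joinSpectraData() is expected to warn about duplicated keys. Reducing the PSM table beforehand is one way to avoid that.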
The countIdentifications() function
The function that produces the data for the figure above is
countIdentifications().
The function tallies the number of identifications (i.e.
non-missing character values in the sequence spectra variable) for each
scan. In the case of MS2 scans, these will be either 1 or 0, depending
on the presence of a sequence. For MS1 scans, the function will count the
number of sequences for the descendant MS2 scans, i.e. those produced
from precursor ions from each MS1 scan.
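To make the counting rule concrete, here is a toy re-implementation in base R. It is not the package code, only an illustration of the logic described above. The data frame, its values and the column name count are invented for the example; acquisitionNum and precScanNum are assumed to link each MS2 scan to its parent MS1 scan.

```r
## Toy scan table: two MS1 scans, each followed by its MS2 offspring.
scans <- data.frame(
    msLevel        = c(1L, 2L, 2L, 2L, 1L, 2L, 2L),
    acquisitionNum = 1:7,
    precScanNum    = c(NA, 1L, 1L, 1L, NA, 5L, 5L),
    sequence       = c(NA, "PEPTIDEK", NA, "SEQWENCER", NA, NA, NA)
)

## MS2 scans: 1 if a sequence is present, 0 otherwise.
scans$count <- as.integer(!is.na(scans$sequence))

## MS1 scans: sum the counts of the MS2 scans that point back to them.
is_ms1 <- scans$msLevel == 1L
scans$count[is_ms1] <- vapply(
    scans$acquisitionNum[is_ms1],
    function(acq) {
        offspring <- scans$msLevel == 2L & scans$precScanNum %in% acq
        sum(scans$count[offspring])
    },
    integer(1)
)

scans$count
```

The first MS1 scan ends up with a count of 2, because two of its three MS2 offspring carry a sequence. The second MS1 scan gets 0.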
Below, we see on the second line (MS level 2) that 3457 MS2 scans led to no PSM,
while 2546 led to an identification. Among all MS1 scans, 833 led to
no MS2 scans with PSMs. 30 MS1 scans generated one MS2 scan that led
to a PSM, 45 led to two PSMs, …
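A table like the one discussed here can be obtained by cross-tabulating the MS level against the new counts. The sketch below wraps the two calls in a hypothetical helper, tallyIdentifications(), so that the snippet stands on its own. It expects a Spectra object that already carries a sequence spectra variable, for instance after joinSpectraData().

```r
library(Spectra)

## Hypothetical helper, not part of any package.
tallyIdentifications <- function(sp) {
    ## Adds the countIdentifications spectra variable.
    sp <- countIdentifications(sp)
    ## Rows: MS level; columns: number of identifications per scan.
    table(msLevel(sp), sp$countIdentifications)
}
```

Calling tallyIdentifications(sp) on the annotated object should return one row per MS level, with the MS2 row only populated in the 0 and 1 columns.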
We can now use this new countIdentifications variable to generate
our annotated chromatogram. The code chunk below filters MS1-level
data and then extracts the spectra variables, in particular the
retention time rtime, the total ion current totIonCurrent and the
newly created countIdentifications, to produce a figure with
ggplot2.
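The plotting steps listed above could be sketched as follows. The helper name annotatedChromatogram(), the intermediate column ids and the styling choices (transparency, point size, labels) are assumptions. The function expects a Spectra object on which countIdentifications() has already been run.

```r
library(Spectra)
library(ggplot2)

## Hypothetical helper, not part of any package.
annotatedChromatogram <- function(sp) {
    ## Keep MS1 scans and extract the three variables needed for the plot.
    ms1 <- filterMsLevel(sp, 1L)
    df <- as.data.frame(
        spectraData(ms1, columns = c("rtime", "totIonCurrent",
                                     "countIdentifications")))
    ## Zero counts become NA so that these scans are drawn with the
    ## colour scale's missing-value colour (grey by default).
    df$ids <- ifelse(df$countIdentifications == 0,
                     NA, df$countIdentifications)
    ggplot(df, aes(x = rtime, y = totIonCurrent)) +
        geom_line(alpha = 0.25) +
        geom_point(aes(colour = ids), size = 0.75) +
        labs(x = "Retention time", y = "Total ion current",
             colour = "Number of ids")
}
```

With ggplot2's default continuous colour scale, counts run from dark to light blue and missing values are drawn in grey, which is consistent with the figure described at the top of the post.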