This function combines peptides into their proteins by normalising the intensity values to a reference run/sample for each protein.
Details
This function is not intended to be used directly (that is why it is not exported via NAMESPACE). Instead, the user should use combineFeatures.
The algorithm is described in Nikolovski et al. Briefly, it works as follows:
1. Find the reference run (column) for each protein (grouped rows). We use the run (column) with the lowest number of NA. If multiple candidates are available, we use the one with the highest intensity. This step is skipped if the user supplies their own reference vector.
2. For each protein (grouped rows) and each run (column):
   - Find the peptides (grouped rows) shared by the current run (column) and the reference run (column).
   - Sum the shared peptides (grouped rows) for the current run (column) and for the reference run (column).
   - The ratio of the summed shared peptides of the current run (column) to those of the reference run (column) is the new intensity of the current protein in the current run.
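The steps above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the MSnbase implementation: the helper name ntr_sketch and the toy matrix are hypothetical, "highest intensity" is read here as the highest summed intensity of the candidate run within the protein, and the ratio is taken as current run over reference run.

```r
# Hypothetical helper illustrating the description above; NOT the MSnbase code.
# x:     numeric matrix, rows = peptides, columns = runs
# group: protein identifier for each row of x
ntr_sketch <- function(x, group) {
  res <- lapply(split(seq_len(nrow(x)), group), function(i) {
    m <- x[i, , drop = FALSE]
    # Step 1: reference run = column with the fewest NA values;
    # ties are broken by the highest summed intensity (an assumption).
    nna <- colSums(is.na(m))
    cand <- which(nna == min(nna))
    ref <- cand[which.max(colSums(m[, cand, drop = FALSE], na.rm = TRUE))]
    # Step 2: for each run, sum the peptides observed in both the run and
    # the reference, then take the ratio of the two sums.
    vapply(seq_len(ncol(m)), function(j) {
      shared <- !is.na(m[, j]) & !is.na(m[, ref])
      # With no shared peptides this is 0 / 0, i.e. NaN in this sketch.
      sum(m[shared, j]) / sum(m[shared, ref])
    }, numeric(1))
  })
  out <- do.call(rbind, res)
  colnames(out) <- colnames(x)
  out
}

# Toy data: three peptides, three runs, two proteins.
x <- matrix(c(10, 20, NA,
              20, 40, 30,
               5, NA, 10),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("pep1", "pep2", "pep3"),
                            c("run1", "run2", "run3")))
group <- c("P1", "P1", "P2")
ntr_sketch(x, group)
```

For protein P1, run1 and run2 both have no NA, and run2 has the higher summed intensity, so it becomes the reference. The resulting values for P1 are 0.5, 1 and 0.75; the last one uses only pep2, the single peptide shared by run3 and the reference.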
References
Nikolovski N, Shliaha PV, Gatto L, Dupree P, Lilley KS. Label-free protein quantification for plant Golgi protein localization and abundance. Plant Physiol. 2014 Oct;166(2):1033-43. DOI: 10.1104/pp.114.245589. PubMed PMID: 25122472.
Author
Sebastian Gibb <mail@sebastiangibb.de>, Pavel Shliaha
Examples
library("MSnbase")
data(msnset)
# choose the reference run automatically
combineFeatures(msnset, groupBy=fData(msnset)$ProteinAccession)
#> Your data contains missing values. Please read the relevant section in
#> the combineFeatures manual page for details on the effects of missing
#> values on data aggregation.
#> MSnSet (storageMode: lockedEnvironment)
#> assayData: 40 features, 4 samples
#> element names: exprs
#> protocolData: none
#> phenoData
#> sampleNames: iTRAQ4.114 iTRAQ4.115 iTRAQ4.116 iTRAQ4.117
#> varLabels: mz reporters
#> varMetadata: labelDescription
#> featureData
#> featureNames: BSA ECA0172 ... ENO (40 total)
#> fvarLabels: spectrum ProteinAccession ... CV.iTRAQ4.117 (19 total)
#> fvarMetadata: labelDescription
#> experimentData: use 'experimentData(object)'
#> Annotation:
#> - - - Processing information - - -
#> Data loaded: Wed May 11 18:54:39 2011
#> iTRAQ4 quantification by trapezoidation: Wed Apr 1 21:41:53 2015
#> Combined 55 features into 40 using mean: Tue Oct 15 15:27:00 2024
#> MSnbase version: 2.31.2
# use a user-given reference
combineFeatures(msnset, groupBy=fData(msnset)$ProteinAccession,
                reference=rep(2, 55))
#> Your data contains missing values. Please read the relevant section in
#> the combineFeatures manual page for details on the effects of missing
#> values on data aggregation.
#> MSnSet (storageMode: lockedEnvironment)
#> assayData: 40 features, 4 samples
#> element names: exprs
#> protocolData: none
#> phenoData
#> sampleNames: iTRAQ4.114 iTRAQ4.115 iTRAQ4.116 iTRAQ4.117
#> varLabels: mz reporters
#> varMetadata: labelDescription
#> featureData
#> featureNames: BSA ECA0172 ... ENO (40 total)
#> fvarLabels: spectrum ProteinAccession ... CV.iTRAQ4.117 (19 total)
#> fvarMetadata: labelDescription
#> experimentData: use 'experimentData(object)'
#> Annotation:
#> - - - Processing information - - -
#> Data loaded: Wed May 11 18:54:39 2011
#> iTRAQ4 quantification by trapezoidation: Wed Apr 1 21:41:53 2015
#> Combined 55 features into 40 using mean: Tue Oct 15 15:27:00 2024
#> MSnbase version: 2.31.2