Storing multiple related MSnSets
MSnSetList-class.Rd
A class for storing lists of MSnSet instances.
Details
There are two ways to store different sets of measurements pertaining to an experimental unit, such as replicated measures of different conditions that were recorded over more than one MS acquisition. Without focusing on any proteomics technology in particular, these multiple assays can be recorded as:

- A single combined MSnSet (see the section Combining MSnSet instances in the MSnbase-demo vignette). In such cases, the different experimental (phenotypic) conditions are recorded as an AnnotatedDataFrame in the phenoData slot. Quantitative data for features that were missing in an assay are generally encoded as missing with NA values. Alternatively, only features observed in all assays could be selected. See the commonFeatureNames function to select only common features among two or more MSnSet instances.

- Each set of measurements is stored in its own MSnSet, and these are combined into one MSnSetList. The MSnSet elements can have identical or different samples and features. Unless the list was compiled manually by the user, one would expect at least one of these dimensions (features/rows or samples/columns) to be conserved (i.e. all feature or sample names are identical). See split/unsplit below.
Objects from the Class
Objects can be created and manipulated with:

MSnSetList(x, log, featureData)
    The class constructor that takes a list of valid MSnSet instances as input x, an optional logging list, and an optional feature metadata data.frame.

split(x, f)
    An MSnSetList can be created from an MSnSet instance. x is a single MSnSet and f is a factor or a character of length 1. In the latter case, f will be matched to the feature- and phenodata variable names (in that order). If a match is found, the respective variable is extracted, converted to a factor if it is not one already, and used to split x along the features/rows (f was a feature variable name) or samples/columns (f was a phenotypic variable name). If f is passed as a factor, its length will be matched to nrow(x) or ncol(x) (in that order) to determine whether x will be split along the features (rows) or samples (columns). Hence, the length of f must match exactly one of the two dimensions.

unsplit(value, f)
    The unsplit method reverses the effect of splitting the value MSnSet along the groups f.

as(x, "MSnSetList")
    Where x is an instance of class MzTab. See the class documentation for details.
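
The two ways of specifying f for split can be sketched as follows. This is a minimal illustration, assuming that MSnbase and pRolocdata are installed; it reuses the dunkley2006 data from the Examples section, and the object names by_name and by_factor are illustrative only.

```r
library("MSnbase")
library("pRolocdata")
data(dunkley2006)

## f as a character of length 1: matched against the feature variable
## names first, then against the phenodata variable names. "replicate"
## is a phenodata variable, so the split is along the samples/columns.
by_name <- split(dunkley2006, "replicate")

## f as a factor: its length decides the direction of the split.
## Here length(f) equals ncol(dunkley2006), so the samples are split.
f <- factor(pData(dunkley2006)$replicate)
stopifnot(length(f) == ncol(dunkley2006))
by_factor <- split(dunkley2006, f)

## Both calls are expected to produce the same groups
stopifnot(identical(names(by_name), names(by_factor)))

## unsplit() with the same grouping recovers the original MSnSet
stopifnot(compareMSnSets(dunkley2006, unsplit(by_factor, f)))
```

Passing the variable name is the shorter form; an explicit factor is useful when the grouping is not stored in the feature or phenodata.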
Slots
x:
    Object of class list containing valid MSnSet instances. Can be extracted with the msnsets() accessor.

log:
    Object of class list containing an object creation log, including, among other elements, the call that generated the object. Can be accessed with objlog().

featureData:
    Object of class DataFrame that stores metadata for each object in the x slot. The number of rows of this DataFrame must be equal to the number of items in the x slot and their respective (row)names must be identical.

.__classVersion__:
    The version of the instance. For development purposes only.
Methods
"[["
    Extracts a single MSnSet at the given position.

"["
    Extracts one or more MSnSets as an MSnSetList.

length
    Returns the number of MSnSets.

names
    Returns the names of the MSnSets, if available. The replacement method is also available.

show
    Displays the object by printing a short summary.

lapply(x, FUN, ...)
    Applies function FUN to each element of the input x. If the application of FUN returns an MSnSet, then the return value is an MSnSetList, otherwise a list.

sapply(x, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)
    An lapply wrapper that simplifies the output to a vector, matrix or array if possible. See ?base::sapply for details.

fData
    Returns the feature metadata stored in the featureData slot.

fData<-
    Replacement method for the feature metadata in the featureData slot.
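
The return-type rules of the extraction and iteration methods can be checked with a short sketch. It assumes MSnbase and pRolocdata are installed and uses the tan2009r1 and tan2009r2 data from the Examples section; the names dims, top10 and n are illustrative only.

```r
library("MSnbase")
library("pRolocdata")
data(tan2009r1)
data(tan2009r2)

msnl <- MSnSetList(list(A = tan2009r1, B = tan2009r2))

## "[[" returns a single MSnSet, "[" returns an MSnSetList
stopifnot(is(msnl[[1]], "MSnSet"))
stopifnot(is(msnl[1], "MSnSetList"))

## FUN returns an integer vector, so lapply returns a plain list
dims <- lapply(msnl, dim)
stopifnot(is.list(dims))

## FUN returns an MSnSet (here, the first 10 features of each
## element), so lapply returns an MSnSetList
top10 <- lapply(msnl, function(x) x[1:10, ])
stopifnot(is(top10, "MSnSetList"))

## sapply simplifies the per-element row counts to a vector
n <- sapply(msnl, nrow)
stopifnot(length(n) == length(msnl))
```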
See also
The commonFeatureNames function to select common
features among MSnSet instances.
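
As a minimal sketch of selecting common features, assuming MSnbase and pRolocdata are installed, commonFeatureNames can be applied to the two replicates used in the Examples section. The names cmn and fn are illustrative only.

```r
library("MSnbase")
library("pRolocdata")
data(tan2009r1)
data(tan2009r2)

## Keep only the features that are present in both replicates
cmn <- commonFeatureNames(tan2009r1, tan2009r2)

## After the selection, both elements hold the same set of features
fn <- lapply(cmn, featureNames)
stopifnot(setequal(fn[[1]], fn[[2]]))

## Number of features retained in each element
sapply(cmn, nrow)
```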
Examples
library("pRolocdata")
data(tan2009r1)
data(tan2009r2)
## The MSnSetList class
## for an unnamed list, names are set to indices
msnl <- MSnSetList(list(tan2009r1, tan2009r2))
names(msnl)
#> [1] "1" "2"
## a named example
msnl <- MSnSetList(list(A = tan2009r1, B = tan2009r2))
names(msnl)
#> [1] "A" "B"
msnsets(msnl)
#> $A
#> MSnSet (storageMode: lockedEnvironment)
#> assayData: 888 features, 4 samples
#> element names: exprs
#> protocolData: none
#> phenoData
#> sampleNames: X114 X115 X116 X117
#> varLabels: Fractions
#> varMetadata: labelDescription
#> featureData
#> featureNames: P20353 P53501 ... P07909 (888 total)
#> fvarLabels: FBgn Protein.ID ... markers.tl (16 total)
#> fvarMetadata: labelDescription
#> experimentData: use 'experimentData(object)'
#> pubMedIds: 19317464
#> Annotation:
#> - - - Processing information - - -
#> Added markers from 'mrk' marker vector. Thu Jul 16 22:53:44 2015
#> MSnbase version: 1.17.12
#>
#> $B
#> MSnSet (storageMode: lockedEnvironment)
#> assayData: 871 features, 4 samples
#> element names: exprs
#> protocolData: none
#> phenoData
#> sampleNames: X114 X115 X116 X117
#> varLabels: Fractions
#> varMetadata: labelDescription
#> featureData
#> featureNames: P20432 P20353 ... Q9VIW3 (871 total)
#> fvarLabels: FBgn Protein.ID ... markers (13 total)
#> fvarMetadata: labelDescription
#> experimentData: use 'experimentData(object)'
#> pubMedIds: 19317464
#> Annotation:
#> - - - Processing information - - -
#> Added markers from 'mrk' marker vector. Thu Jul 16 22:53:44 2015
#> MSnbase version: 1.17.12
#>
length(msnl)
#> [1] 2
objlog(msnl)
#> $call
#> MSnSetList(x = list(A = tan2009r1, B = tan2009r2))
#>
msnl[[1]] ## an MSnSet
#> MSnSet (storageMode: lockedEnvironment)
#> assayData: 888 features, 4 samples
#> element names: exprs
#> protocolData: none
#> phenoData
#> sampleNames: X114 X115 X116 X117
#> varLabels: Fractions
#> varMetadata: labelDescription
#> featureData
#> featureNames: P20353 P53501 ... P07909 (888 total)
#> fvarLabels: FBgn Protein.ID ... markers.tl (16 total)
#> fvarMetadata: labelDescription
#> experimentData: use 'experimentData(object)'
#> pubMedIds: 19317464
#> Annotation:
#> - - - Processing information - - -
#> Added markers from 'mrk' marker vector. Thu Jul 16 22:53:44 2015
#> MSnbase version: 1.17.12
msnl[1] ## an MSnSetList of length 1
#> Instance of class 'MSnSetList' containig 1 objects.
## Iterating over the elements
lapply(msnl, dim) ## a list
#> $A
#> [1] 888 4
#>
#> $B
#> [1] 871 4
#>
lapply(msnl, normalise, method = "quantiles") ## an MSnSetList
#> Instance of class 'MSnSetList' containig 2 objects.
fData(msnl)
#> DataFrame with 2 rows and 0 columns
fData(msnl)$X <- sapply(msnl, nrow)
fData(msnl)
#> DataFrame with 2 rows and 1 column
#>           X
#>   <integer>
#> A       888
#> B       871
## Splitting and unsplitting
## splitting along the columns/samples
data(dunkley2006)
head(pData(dunkley2006))
#>        membrane.prep fraction replicate
#> M1F1A              1        1         A
#> M1F4A              1        4         A
#> M1F7A              1        7         A
#> M1F11A             1       11         A
#> M1F2B              1        2         B
#> M1F5B              1        5         B
(splt <- split(dunkley2006, "replicate"))
#> Instance of class 'MSnSetList' containig 2 objects.
lapply(splt, dim) ## the number of rows and columns of the split elements
#> $A
#> [1] 689 8
#>
#> $B
#> [1] 689 8
#>
unsplt <- unsplit(splt, dunkley2006$replicate)
stopifnot(compareMSnSets(dunkley2006, unsplt))
## splitting along the rows/features
head(fData(dunkley2006))
#>           assigned  evidence method   new pd.2013 pd.markers markers.orig
#> AT1G09210       ER predicted  PLSDA known      ER   ER lumen           ER
#> AT1G21750       ER predicted  PLSDA known      ER   ER lumen           ER
#> AT1G51760       ER   unknown  PLSDA   new      ER   ER lumen      unknown
#> AT1G56340       ER predicted  PLSDA known      ER   ER lumen           ER
#> AT2G32920       ER predicted  PLSDA known      ER   ER lumen           ER
#> AT2G47470       ER predicted  PLSDA known      ER   ER lumen           ER
#>            markers
#> AT1G09210 ER lumen
#> AT1G21750 ER lumen
#> AT1G51760 ER lumen
#> AT1G56340 ER lumen
#> AT2G32920 ER lumen
#> AT2G47470 ER lumen
(splt <- split(dunkley2006, "markers"))
#> Instance of class 'MSnSetList' containig 10 objects.
unsplt <- unsplit(splt, factor(fData(dunkley2006)$markers))
simplify2array(lapply(splt, dim))
#>      ER lumen ER membrane Golgi Mitochondrion PM Plastid Ribosome TGN unknown
#> [1,]       14          45    28            55 46      20       19  13     428
#> [2,]       16          16    16            16 16      16       16  16      16
#>      vacuole
#> [1,]      21
#> [2,]      16
stopifnot(compareMSnSets(dunkley2006, unsplt))