R/Chromatograms.R
The Chromatograms class encapsulates chromatographic data and related
metadata. The chromatographic data is represented by a backend extending
the virtual ChromBackend class which provides the raw data to the
Chromatograms object. Different backends and their properties are
described in the ChromBackend class documentation.
# S4 method for class 'ChromBackendOrMissing'
Chromatograms(object = ChromBackendMemory(), processingQueue = list(), ...)
# S4 method for class 'Spectra'
Chromatograms(
  object,
  summarize.method = c("sum", "max"),
  chromData = data.frame(),
  factorize.by = c("msLevel", "dataOrigin"),
  spectraVariables = character(),
  ...
)
# S4 method for class 'Chromatograms,ChromBackend'
setBackend(
  object,
  backend,
  f = processingChunkFactor(object),
  BPPARAM = SerialParam(),
  ...
)
# S4 method for class 'Chromatograms'
x$name
# S4 method for class 'Chromatograms'
x$name <- value
# S4 method for class 'Chromatograms'
x[i, j, ..., drop = FALSE]
# S4 method for class 'Chromatograms'
x[[i, j, ...]]
# S4 method for class 'Chromatograms'
x[[i, j, ...]] <- value
# S4 method for class 'Chromatograms'
factorize(object, factorize.by = c("msLevel", "dataOrigin"), ...)
# S4 method for class 'Chromatograms'
chromExtract(object, peak.table, by, ...)
object: A Chromatograms object.
processingQueue: list. A list of processing steps (i.e. functions) to
be applied to the chromatographic data. The processing steps are
applied in the order they are listed in the processingQueue.
...: Additional arguments.
summarize.method: For Chromatograms created with a Spectra object:
A character vector with the name of the function to be used to
summarize the spectra data intensity. The available methods are "sum"
and "max". The default is "sum".
chromData: For Chromatograms() built from a Spectra object backend,
a data.frame with the chromatographic data. If not provided
(or if empty), a default data.frame with the core chromatographic
variables will be created.
factorize.by: A character vector with the names of the variables in
the Spectra object and the chromData slot that should be used
to factorize the Spectra object data to generate the
chromatographic data.
spectraVariables: A character vector specifying which variables
from the Spectra object should be added to the chromData. These
will be mapped using the chromSpectraIndex variable.
backend: A ChromBackend object providing the raw data for the
Chromatograms object.
f: A factor defining the grouping to split the Chromatograms object.
BPPARAM: Parallel setup configuration. See BiocParallel::bpparam()
for more information.
x: A Chromatograms object.
name: A character string specifying the name of the variable to
access.
value: The value to replace the variable with.
i: For [: integer, logical or character to subset the object.
j: For [ and [[: ignored.
drop: For [: logical(1), defaults to FALSE.
peak.table: For chromExtract(): a data frame containing at least the
following columns:
- rtMin: Minimum retention time for each peak. Cannot be NA.
- rtMax: Maximum retention time for each peak. Cannot be NA.
- mzMin: Minimum m/z value for each peak.
- mzMax: Maximum m/z value for each peak.
Additionally, the peak.table must include columns that uniquely
identify chromatograms in the object. Common choices are
"msLevel" and/or "dataOrigin". These columns must also be present
in the chromData of the object. Any extra columns in
peak.table will be added to the chromData of the newly created
object.
by: A character vector naming one or more columns that uniquely
identify chromatograms in both peak.table and
chromData(object). The combination of these columns must be unique
within chromData(object). Typically includes "dataOrigin",
"msLevel", or both.
Refer to the individual function description for information on the return value.
This needs to be discussed: to be able to set, for example, the
backend to ChromBackendMzR, backendInitialize() needs a better
implementation, i.e. it should support peaksData and chromData as arguments
and provide a way to write .mzML files (which we do not have for chromatographic data).
Chromatograms objects can be created using the Chromatograms()
constructor function, either by providing a ChromBackend object or by
providing a Spectra object. The Spectra object will be used to generate
a Chromatograms object with a backend of class ChromBackendSpectra.
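The roles of summarize.method and factorize.by can be illustrated with a small base R sketch. This is a conceptual illustration only, not the code used by ChromBackendSpectra: the toy spectra table, its values and the helper toy_chromatograms() are hypothetical, and it assumes that one chromatogram is produced per combination of the factorize.by variables, with each spectrum contributing one intensity value.

```r
## Toy stand-in for spectra: one row per spectrum, with the intensities of
## each spectrum stored in a list column. All values are made up.
spectra <- data.frame(
  rtime      = c(1.0, 2.0, 3.0, 1.5, 2.5),
  msLevel    = c(1L, 1L, 1L, 2L, 2L),
  dataOrigin = "fileA.mzML"
)
spectra$intensity <- list(c(10, 5), c(20, 30), c(1, 2, 3), c(7, 8), c(4, 9))

## Hypothetical helper (not part of any package): group the spectra by the
## `factorize.by` columns and collapse the intensities of each spectrum
## into a single value using `summarize.method`.
toy_chromatograms <- function(spectra,
                              summarize.method = c("sum", "max"),
                              factorize.by = c("msLevel", "dataOrigin")) {
  summarize.method <- match.arg(summarize.method)
  fun <- switch(summarize.method, sum = sum, max = max)
  groups <- interaction(spectra[, factorize.by, drop = FALSE], drop = TRUE)
  lapply(split(spectra, groups), function(z) {
    data.frame(
      rtime = z$rtime,
      intensity = vapply(z$intensity, fun, numeric(1))
    )
  })
}

## One toy chromatogram per msLevel/dataOrigin combination.
by_sum <- toy_chromatograms(spectra, "sum")
by_max <- toy_chromatograms(spectra, "max")
```

With "sum", every point of a chromatogram is the total intensity of one spectrum; with "max", it is the largest intensity in that spectrum.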
Chromatograms object
The Chromatograms object is a container for chromatographic data, which
includes peaks data (retention time and related intensity values, also
referred to as peaks data variables in the context of Chromatograms) and
metadata of individual chromatograms (so-called chromatograms variables).
While a core set of chromatograms variables (the
coreChromatogramsVariables()) and peaks data variables (the
corePeaksVariables()) are guaranteed to be provided by a Chromatograms,
it is possible to add arbitrary variables to a Chromatograms object.
The Chromatograms object is designed to contain chromatographic data of a
(large) set of chromatograms. The data is organized linearly and can be
thought of as a list of chromatograms, i.e. each element in the Chromatograms
is one chromatogram.
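As a mental model of this layout, the two kinds of variables can be pictured as two parallel structures. The base R sketch below is only an analogy and makes no claim about how any backend stores its data; the object names, the column names rtime and intensity, and all values are illustrative assumptions.

```r
## One row per chromatogram: the "chromatograms variables".
chrom_data <- data.frame(
  msLevel    = c(1L, 2L),
  dataOrigin = c("fileA.mzML", "fileA.mzML")
)

## One element per chromatogram: the "peaks data variables".
peaks_data <- list(
  data.frame(rtime = c(1, 2, 3), intensity = c(15, 50, 6)),
  data.frame(rtime = c(1.5, 2.5), intensity = c(15, 13))
)

## The two structures are parallel: element i of peaks_data belongs to
## row i of chrom_data.
stopifnot(nrow(chrom_data) == length(peaks_data))

## An arbitrary extra chromatogram variable is simply another column.
chrom_data$sample_group <- c("control", "control")

## Subsetting "by chromatogram" has to keep both structures aligned.
keep <- chrom_data$msLevel == 1L
chrom_sub <- chrom_data[keep, , drop = FALSE]
peaks_sub <- peaks_data[keep]
```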
The chromatograms variables in the Chromatograms object can
be accessed using the chromData() function. Specific chromatograms
variables can be accessed either by specifying the "columns" parameter in
chromData() or by using $. chromData can be accessed, replaced, and
also filtered/subsetted. Refer to the chromData documentation for more
details.
The peaks data variables in the Chromatograms object can be
accessed using the peaksData() function. Specific peaks variables can be
accessed either by specifying the "columns" parameter in peaksData() or
by using $. peaksData can be accessed, replaced, and also
filtered/subsetted. Refer to the peaksData documentation for more details.
Chromatograms objects
Functions that process the chromatographic data can be applied to the object
either directly or through the processingQueue mechanism. The
processingQueue is a list of processing steps that are stored within the
object and only applied when needed. This design allows all queued steps to be
applied to the data in a single pass, which is particularly useful for larger
datasets. In addition, the functions in the processing queue can be applied to
the data in a chunk-wise manner, which allows for parallel processing of the
data and reduces the memory demand. To read more about the
processingQueue, and how to parallelize your processes, see the
processingQueue documentation.
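The idea of a lazily applied, chunk-wise processing queue can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's implementation: the toy peaks data, the two processing steps and the helper apply_queue() are hypothetical.

```r
## Toy peaks data: one data.frame per chromatogram (made-up values).
peaks_data <- list(
  data.frame(rtime = c(1, 2, 3), intensity = c(4, 50, 10)),
  data.frame(rtime = c(1.5, 2.5), intensity = c(20, 2)),
  data.frame(rtime = c(1, 2), intensity = c(8, 16))
)

## The queue is an ordered list of functions. Defining it runs nothing.
queue <- list(
  ## step 1: drop data points with an intensity of 5 or less
  function(pk) pk[pk$intensity > 5, , drop = FALSE],
  ## step 2: scale intensities to the maximum of the chromatogram
  function(pk) {
    pk$intensity <- pk$intensity / max(pk$intensity)
    pk
  }
)

## Hypothetical helper: apply all queued steps, in order, to one chromatogram.
apply_queue <- function(pk, queue) {
  Reduce(function(acc, step) step(acc), queue, pk)
}

## Chunk-wise evaluation: only one chunk is processed at a time. The chunks
## are contiguous here, so the original order is preserved on recombination.
chunk_factor <- factor(c(1, 1, 2))
chunks <- split(peaks_data, chunk_factor)
res <- lapply(chunks, function(chunk) lapply(chunk, apply_queue, queue = queue))
processed <- unlist(res, recursive = FALSE, use.names = FALSE)
```

Because each chunk is processed independently, the outer lapply() is the natural place for a parallel equivalent such as BiocParallel::bplapply().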
The Chromatograms class supports subsetting by chromatogram (i.e. rows) using
the [ operator. The [ operator does not support subsetting by columns.
Specific chromatograms or peaks variables can be accessed using the [[
operator or the $ operator. The [[ operator can also be used to
replace specific chromatograms or peaks variables.
The setBackend() function can be used to change the backend of a
Chromatograms object. This can be useful to switch to a backend that
better suits the needs of the user, for example switching to a memory-based
backend for smaller datasets or to a file-based backend for larger datasets.
The setBackend() function supports parallelization of the backend
conversion using the BPPARAM parameter.
The chromExtract() function allows users to extract specific regions of
interest from a Chromatograms object based on a user-provided peak table.
Each row in the peak.table defines a region to extract, using minimum and
maximum retention time (and m/z in the case of ChromBackendSpectra)
boundaries, and identifiers that uniquely match chromatograms in the object.
The resulting new Chromatograms object contains only chromatograms overlapping
the specified regions, with updated metadata reflecting the extracted
boundaries.
This function is most commonly used to subset chromatographic data around detected peaks or predefined time/mass ranges, for example to reprocess, visualize, or quantify extracted chromatograms corresponding to known features.
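A peak.table with the required columns could be assembled as in the sketch below. The retention time and m/z values, the file name and the feature_id column are invented for illustration, and the checks are only sanity checks derived from the argument description, not validation performed by the package. The rtMin <= rtMax check is an extra assumption.

```r
## Two regions of interest in the MS1 chromatogram of one file (made-up values).
peak_table <- data.frame(
  rtMin      = c(10.5, 42.0),
  rtMax      = c(15.0, 48.5),
  mzMin      = c(100.00, 250.10),
  mzMax      = c(100.05, 250.15),
  msLevel    = c(1L, 1L),
  dataOrigin = c("fileA.mzML", "fileA.mzML"),
  feature_id = c("FT001", "FT002") # extra column, would be added to chromData
)
by <- c("msLevel", "dataOrigin")

## Sanity checks based on the documented requirements.
required <- c("rtMin", "rtMax", "mzMin", "mzMax")
stopifnot(
  all(required %in% colnames(peak_table)),
  all(by %in% colnames(peak_table)),
  !anyNA(peak_table$rtMin),
  !anyNA(peak_table$rtMax),
  all(peak_table$rtMin <= peak_table$rtMax) # assumption, not documented
)

## The `by` columns must identify chromatograms uniquely in chromData.
## `chrom_data` is a toy stand-in for chromData(object).
chrom_data <- data.frame(msLevel = c(1L, 2L), dataOrigin = "fileA.mzML")
stopifnot(anyDuplicated(chrom_data[, by, drop = FALSE]) == 0L)

## With a Chromatograms object `chr` whose chromData contains the `by` columns:
## roi <- chromExtract(chr, peak.table = peak_table, by = by)
```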
chromData for a general description of the chromatographic metadata available in the object, as well as how to access, replace and subset them.
peaksData for a general description of the chromatographic peaks data available in the object, as well as how to access, replace and subset them.
processingQueue for more information on the queuing of processing steps and parallelization for larger datasets.
library(Chromatograms)
library(MsBackendMetaboLights)
library(Spectra)
## Create a Chromatograms object from a Spectra object.
be <- backendInitialize(MsBackendMetaboLights(),
    mtblsId = "MTBLS39",
    filePattern = c("63B.cdf")
)
#> Used data files from the assay's column "Raw Spectral Data File" since none were available in column "Derived Spectral Data File".
s <- Spectra(be)
s <- setBackend(s, MsBackendMemory())
be <- backendInitialize(new("ChromBackendSpectra"), s)
chr <- Chromatograms(be)
## Subset
chr[1:2]
#> Chromatographic data (Chromatograms) with 2 chromatograms in a ChromBackendSpectra backend:
#> chromIndex msLevel mz
#> 1 NA 1 Inf
#> 2 NA 1 Inf
#> ... 7 more chromatogram variables/columns
#> ... 2 peaksData variables
#>
#> The Spectra object contains 1651 spectra
## Access a specific variable
chr[["msLevel"]]
#> [1] 1 1 1
chr$msLevel
#> [1] 1 1 1
## Replace data of a specific variable
chr$msLevel <- c(2L, 2L, 2L)
## The data can be re-factorized
chr <- factorize(chr)
## The backend can also be changed to an in-memory backend
chr <- setBackend(chr, ChromBackendMemory())
chr
#> Chromatographic data (Chromatograms) with 3 chromatograms in a ChromBackendMemory backend:
#> chromIndex msLevel mz
#> 1 NA 2 Inf
#> 2 NA 2 Inf
#> 3 NA 2 Inf
#> ... 12 more chromatogram variables/columns
#> ... 2 peaksData variables
#> Processing:
#> Switch backend from ChromBackendSpectra to ChromBackendMemory [Wed Oct 22 08:40:22 2025]