The Chromatograms class to manage and access chromatographic data

The Chromatograms class encapsulates chromatographic data and related metadata. The chromatographic data is represented by a backend extending the virtual ChromBackend class, which provides the raw data to the Chromatograms object. The different backends and their properties are described in the ChromBackend class documentation.

# S4 method for class 'ChromBackendOrMissing'
Chromatograms(object = ChromBackendMemory(), processingQueue = list(), ...)

# S4 method for class 'Spectra'
Chromatograms(
  object,
  summarize.method = c("sum", "max"),
  chromData = data.frame(),
  factorize.by = c("msLevel", "dataOrigin"),
  spectraVariables = character(),
  ...
)

# S4 method for class 'Chromatograms,ChromBackend'
setBackend(
  object,
  backend,
  f = processingChunkFactor(object),
  BPPARAM = SerialParam(),
  ...
)

# S4 method for class 'Chromatograms'
x$name

# S4 method for class 'Chromatograms'
x$name <- value

# S4 method for class 'Chromatograms'
x[i, j, ..., drop = FALSE]

# S4 method for class 'Chromatograms'
x[[i, j, ...]]

# S4 method for class 'Chromatograms'
x[[i, j, ...]] <- value

# S4 method for class 'Chromatograms'
factorize(object, factorize.by = c("msLevel", "dataOrigin"), ...)

# S4 method for class 'Chromatograms'
chromExtract(object, peak.table, by, ...)

Arguments

object: A Chromatograms object.
processingQueue: A list of processing steps (i.e. functions) to be applied to the chromatographic data. The processing steps are applied in the order in which they are listed in the processingQueue.
...: Additional arguments.
summarize.method: For Chromatograms created from a Spectra object: a character vector with the name of the function used to summarize the intensity values of the spectra data. The available methods are "sum" and "max". The default is "sum".
chromData: For Chromatograms() built from a Spectra object: a data.frame with the chromatographic data. If not provided (or if empty), a default data.frame with the core chromatographic variables will be created.
factorize.by: A character vector with the names of the variables in the Spectra object and the chromData slot that should be used to factorize the Spectra object data to generate the chromatographic data.
spectraVariables: A character vector specifying which variables from the Spectra object should be added to the chromData. These will be mapped using the chromSpectraIndex variable.
backend: ChromBackend object providing the raw data for the Chromatograms object.
f: factor defining the grouping to split the Chromatograms object.
BPPARAM: Parallel setup configuration. See BiocParallel::bpparam() for more information.
x: A Chromatograms object.
name: A character string specifying the name of the variable to access.
value: The value to replace the variable with.
i: For [: integer, logical or character to subset the object.
j: For [ and [[: ignored.
drop: For [: logical(1), defaults to FALSE.
peak.table: For chromExtract(): a data.frame containing at least the following columns:
  - rtMin: Minimum retention time for each peak. Cannot be NA.
  - rtMax: Maximum retention time for each peak. Cannot be NA.
  - mzMin: Minimum m/z value for each peak.
  - mzMax: Maximum m/z value for each peak.
  Additionally, the peak.table must include columns that uniquely identify chromatograms in the object. Common choices are "msLevel" and/or "dataOrigin". These columns must also be present in the chromData of the object. Any extra columns in peak.table will be added to the chromData of the newly created object.
by: A character vector naming one or more columns that uniquely identify chromatograms in both peak.table and chromData(object). The combination of these columns must be unique within chromData(object). Typically includes "dataOrigin", "msLevel", or both.

Value

Refer to the individual function description for information on the return value.

Note

This needs to be discussed: to be able to set the backend to, for example, ChromBackendMzR, backendInitialize() needs to be implemented better. It would have to support peaksData and chromData as arguments and there would have to be a way to write .mzML files (which is not currently available for chromatographic data).

Creation of objects

Chromatograms objects can be created using the Chromatograms() constructor function, either by providing a ChromBackend object or by providing a Spectra object. In the latter case, the Spectra object is used to generate a Chromatograms object with a backend of class ChromBackendSpectra.
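As a minimal sketch of the first route, the code below builds a Chromatograms object from an in-memory backend. It assumes that backendInitialize() for ChromBackendMemory accepts chromData and peaksData arguments, as described in the ChromBackend documentation. All values are invented for illustration.

```r
library(Chromatograms)

## Chromatograms variables: one row per chromatogram (illustrative values).
cdata <- data.frame(
    msLevel = c(1L, 1L, 1L),
    mz = c(112.2, 123.3, 134.4),
    chromIndex = c(1L, 2L, 3L)
)

## Peaks data: one data.frame per chromatogram, with retention times and
## the matching intensity values.
pdata <- list(
    data.frame(
        rtime = c(12.4, 12.8, 13.2, 14.6),
        intensity = c(123.3, 153.6, 2354.3, 243.4)
    ),
    data.frame(
        rtime = c(45.1, 46.2),
        intensity = c(100, 80.1)
    ),
    data.frame(
        rtime = c(12.4, 12.8, 13.2, 14.6),
        intensity = c(123.3, 153.6, 2354.3, 243.4)
    )
)

## Initialize the backend with the data and wrap it in a Chromatograms object.
be <- backendInitialize(ChromBackendMemory(),
    chromData = cdata, peaksData = pdata
)
chr <- Chromatograms(be)
chr
```

Passing a Spectra object to Chromatograms() instead follows the second signature shown in the usage section above.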

Data stored in a `Chromatograms` object

The Chromatograms object is a container for chromatographic data, which includes peaks data (retention time and related intensity values, also referred to as peaks data variables in the context of Chromatograms) and metadata of the individual chromatograms (so-called chromatograms variables). While a core set of chromatograms variables (the coreChromatogramsVariables()) and peaks data variables (the corePeaksVariables()) are guaranteed to be provided by a Chromatograms object, it is possible to add arbitrary variables to it.

The Chromatograms object is designed to contain chromatographic data of a (large) set of chromatograms. The data is organized linearly and can be thought of as a list of chromatograms, i.e. each element in the Chromatograms object is one chromatogram.

The chromatograms variables in the Chromatograms object can be accessed using the chromData() function. Specific chromatograms variables can be accessed either by specifying the "columns" parameter in chromData() or by using $. chromData can be accessed and replaced, but also filtered/subsetted. Refer to the chromData documentation for more details.

The peaks data variables in the Chromatograms object can be accessed using the peaksData() function. Specific peaks variables can be accessed either by specifying the "columns" parameter in peaksData() or by using $. peaksData can be accessed and replaced, but also filtered/subsetted. Refer to the peaksData documentation for more details.
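The two access routes can be sketched as follows. The example assumes a small in-memory object created with backendInitialize() on a ChromBackendMemory, with chromData and peaksData arguments as described in the ChromBackend documentation. The values are invented for illustration.

```r
library(Chromatograms)

## Small illustrative object with two chromatograms.
be <- backendInitialize(ChromBackendMemory(),
    chromData = data.frame(
        msLevel = c(1L, 2L),
        mz = c(112.2, 123.3),
        chromIndex = c(1L, 2L)
    ),
    peaksData = list(
        data.frame(rtime = c(12.4, 12.8, 13.2), intensity = c(10, 50, 20)),
        data.frame(rtime = c(45.1, 46.2), intensity = c(100, 80.1))
    )
)
chr <- Chromatograms(be)

## Chromatograms variables: all of them, or a selected column.
chromData(chr)
chromData(chr, columns = "msLevel")
chr$msLevel

## Peaks data variables: one element per chromatogram.
peaksData(chr)
peaksData(chr, columns = "rtime")
chr$rtime
```

Chromatograms variables hold one value per chromatogram, whereas peaks data variables hold one set of values per chromatogram, which is why the latter are returned per element.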

Processing of `Chromatograms` objects

Functions that process the chromatograms data can be applied to the object either directly or by using the processingQueue mechanism. The processingQueue is a list of processing steps that are stored within the object and only applied when needed. This allows all queued steps to be applied to the data in a single pass, which is particularly useful for larger datasets. In addition, the functions in the processing queue can be applied to the data in a chunk-wise manner, which allows for parallel processing of the data and reduces the memory demand. To read more about the processingQueue, and how to parallelize your processes, see the processingQueue documentation.

Subsetting and accessing data

The Chromatograms class supports subsetting by chromatogram (i.e. rows) using the [ operator. The [ operator does not support subsetting by columns. Specific chromatograms or peaks variables can be accessed using the [[ operator or the $ operator. The [[ operator can also be used to replace specific chromatograms or peaks variables.

Changing the backend

The setBackend() function can be used to change the backend of a Chromatograms object. This can be useful to switch to a backend that better suits the needs of the user, for example switching to a memory-based backend for smaller datasets or to a file-based backend for larger datasets. The setBackend() function supports parallelization of the backend conversion using the BPPARAM parameter.

Extracting chromatograms based on a peak table

The chromExtract() function allows users to extract specific regions of interest from a Chromatograms object based on a user-provided peak table. Each row in the peak.table defines a region to extract, using minimum and maximum retention time boundaries (and m/z boundaries in the case of ChromBackendSpectra), together with identifiers that uniquely match chromatograms in the object.

The resulting new Chromatograms object contains only chromatograms overlapping the specified regions, with updated metadata reflecting the extracted boundaries.

This function is most commonly used to subset chromatographic data around detected peaks or predefined time/mass ranges, for example to reprocess, visualize, or quantify extracted chromatograms corresponding to known features.
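The requirements on peak.table and by can be checked before calling chromExtract(). The sketch below uses base R only. The helper check_peak_table() is hypothetical and not part of the Chromatograms package, and the chrom_data data.frame stands in for what chromData(object) would return. File names and values are invented for illustration.

```r
## Hypothetical pre-flight check, not part of the Chromatograms package.
check_peak_table <- function(peak.table, chrom_data, by) {
    ## The four boundary columns plus the identifying columns are required.
    required <- c("rtMin", "rtMax", "mzMin", "mzMax", by)
    missing_cols <- setdiff(required, colnames(peak.table))
    if (length(missing_cols))
        stop("peak.table lacks column(s): ",
             paste(missing_cols, collapse = ", "))
    ## The identifying columns must also exist in the chromatograms variables.
    if (!all(by %in% colnames(chrom_data)))
        stop("all 'by' columns must be present in the chromatograms variables")
    ## Retention time boundaries cannot be NA.
    if (anyNA(peak.table$rtMin) || anyNA(peak.table$rtMax))
        stop("rtMin and rtMax cannot be NA")
    ## The combination of 'by' columns must identify each chromatogram uniquely.
    if (anyDuplicated(chrom_data[, by, drop = FALSE]))
        stop("the combination of 'by' columns must be unique per chromatogram")
    invisible(TRUE)
}

## Stand-in for chromData(object): two chromatograms from one file.
chrom_data <- data.frame(
    msLevel = c(1L, 2L),
    dataOrigin = c("a.mzML", "a.mzML")
)

## One region per row, matched to a chromatogram by msLevel and dataOrigin.
pk <- data.frame(
    rtMin = c(10, 40),
    rtMax = c(15, 50),
    mzMin = c(100, 200),
    mzMax = c(101, 201),
    msLevel = c(1L, 2L),
    dataOrigin = c("a.mzML", "a.mzML")
)

ok <- check_peak_table(pk, chrom_data, by = c("msLevel", "dataOrigin"))
```

With a real object, chrom_data would be replaced by chromData(object), and the extraction itself would follow the signature shown in the usage section: chromExtract(object, peak.table = pk, by = c("msLevel", "dataOrigin")).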

Examples


library(MsBackendMetaboLights)
library(Spectra)

## Create a Chromatograms object from a Spectra object.
be <- backendInitialize(MsBackendMetaboLights(),
    mtblsId = "MTBLS39",
    filePattern = c("63B.cdf")
)
#> Used data files from the assay's column "Raw Spectral Data File" since none were available in column "Derived Spectral Data File".
s <- Spectra(be)
s <- setBackend(s, MsBackendMemory())
be <- backendInitialize(new("ChromBackendSpectra"), s)
chr <- Chromatograms(be)

## Subset
chr[1:2]
#> Chromatographic data (Chromatograms) with 2 chromatograms in a ChromBackendSpectra backend:
#>   chromIndex msLevel  mz
#> 1         NA       1 Inf
#> 2         NA       1 Inf
#> ... 7 more  chromatogram variables/columns
#> ... 2 peaksData variables
#> 
#> The Spectra object contains 1651 spectra

## Access a specific variable
chr[["msLevel"]]
#> [1] 1 1 1
chr$msLevel
#> [1] 1 1 1

## Replace data of a specific variable
chr$msLevel <- c(2L, 2L, 2L)

## The data can be re-factorized
chr <- factorize(chr)

## The backend can also be changed to an in-memory backend
chr <- setBackend(chr, ChromBackendMemory())

chr
#> Chromatographic data (Chromatograms) with 3 chromatograms in a ChromBackendMemory backend:
#>   chromIndex msLevel  mz
#> 1         NA       2 Inf
#> 2         NA       2 Inf
#> 3         NA       2 Inf
#> ... 12 more  chromatogram variables/columns
#> ... 2 peaksData variables
#> Processing:
#>  Switch backend from ChromBackendSpectra to ChromBackendMemory [Wed Oct 22 08:40:22 2025]