MsBackendMetaboLights retrieves and represents mass spectrometry (MS) data from metabolomics experiments stored in the MetaboLights repository. The backend directly extends the Spectra::MsBackendMzR backend from the Spectra package and hence supports MS data in mzML, netCDF and mzXML format. Data in other formats cannot be loaded with MsBackendMetaboLights. Upon initialization with the backendInitialize() method, the MsBackendMetaboLights backend downloads the MS data files of an experiment and caches them locally, thus avoiding repeated downloads of the data.

Usage

MsBackendMetaboLights()

# S4 method for class 'MsBackendMetaboLights'
backendInitialize(
  object,
  mtblsId = character(),
  assayName = character(),
  filePattern = "mzML$|CDF$|cdf$|mzXML$",
  offline = FALSE,
  ...
)

# S4 method for class 'MsBackendMetaboLights'
backendMerge(object, ...)

# S4 method for class 'MsBackendMetaboLights'
backendRequiredSpectraVariables(object, ...)

mtbls_sync(x, offline = FALSE)

Arguments

object

an instance of MsBackendMetaboLights.

mtblsId

character(1) with the ID of a single MetaboLights data set/experiment.

assayName

character with the file names of assay files of the data set. If not provided (assayName = character(), the default), the MS data files of all assays of the data set are loaded. Use mtbls_list_files(<MetaboLights ID>, pattern = "^a_") to list all available assay files of a data set <MetaboLights ID>.

filePattern

character with the pattern defining the supported (or requested) file types. Defaults to filePattern = "mzML$|CDF$|cdf$|mzXML$" hence restricting to mzML, CDF and mzXML files which are supported by Spectra's MsBackendMzR backend.

offline

logical(1) whether only locally cached content should be evaluated/loaded.

...

additional parameters; currently ignored.

x

an instance of MsBackendMetaboLights.

Value

  • For MsBackendMetaboLights(): an instance of MsBackendMetaboLights.

  • For backendInitialize(): an instance of MsBackendMetaboLights with the MS data of the specified MetaboLights data set.

  • For backendRequiredSpectraVariables(): character with spectra variables that are needed for the backend to provide the MS data.

  • For mtbls_sync(): the input MsBackendMetaboLights with the paths to the locally cached data files updated if necessary.

Details

File names for data files are by default extracted from the column "Derived Spectral Data File" of the MetaboLights data set's assay table. If this column does not contain any supported file names, the assay's column "Raw Spectral Data File" is evaluated instead.
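The column fallback described above can be sketched in base R. This is only an illustration under stated assumptions: select_data_files() is a hypothetical helper and not part of MsBackendMetaboLights, and the assay table and its file names are made up. Only the two column names and the default file pattern are taken from this page.

```r
## Toy assay table. The column names follow the description above; the
## file names are made up for illustration.
assay <- data.frame(
    "Derived Spectral Data File" = c("", "", ""),
    "Raw Spectral Data File" = c("FILES/a.cdf", "FILES/b.cdf", "FILES/c.raw"),
    check.names = FALSE, stringsAsFactors = FALSE
)

## Hypothetical helper (NOT a function of MsBackendMetaboLights): return the
## supported file names from the derived column and fall back to the raw
## column if the derived column lists none.
select_data_files <- function(x, pattern = "mzML$|CDF$|cdf$|mzXML$") {
    columns <- c("Derived Spectral Data File", "Raw Spectral Data File")
    for (column in columns) {
        files <- grep(pattern, x[[column]], value = TRUE)
        if (length(files))
            return(list(column = column, files = files))
    }
    list(column = NA_character_, files = character())
}

res <- select_data_files(assay)
res$column
res$files
```

In this toy table the derived column is empty, so the file names are taken from the raw column and the unsupported "FILES/c.raw" entry is dropped.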

The backend uses the BiocFileCache package for caching of the data files. These are stored in the default local BiocFileCache cache along with additional metadata that includes the MetaboLights ID and the name of the assay file with which each data file is associated. Note that at present only MS data files in mzML, CDF and mzXML format are supported.

The MsBackendMetaboLights backend defines and provides the additional spectra variables "mtbls_id", "mtbls_assay_name" and "derived_spectral_data_file". For each individual spectrum, these list the MetaboLights ID, the name of the assay file and the original data file name on the MetaboLights ftp server. The "derived_spectral_data_file" variable can be used to map the experiment's samples to the individual data files and their respective spectra. This mapping is provided in the MetaboLights assay file.
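A minimal base R sketch of such a mapping is shown below. All values are hypothetical: the per-spectrum vector only stands in for the "derived_spectral_data_file" spectra variable, the sample names are invented, and the exact format of the file names reported by the backend may differ. The "Sample Name" column is an assumption about the assay table layout.

```r
## Hypothetical per-spectrum values, standing in for the content of the
## "derived_spectral_data_file" spectra variable (three spectra, two files).
spectrum_file <- c("MN063A.cdf", "MN063A.cdf", "CS063A.cdf")

## Hypothetical excerpt of an assay table with one row per data file. The
## "Sample Name" column and its values are assumptions for illustration.
assay <- data.frame(
    "Sample Name" = c("sample_1", "sample_2"),
    "Raw Spectral Data File" = c("MN063A.cdf", "CS063A.cdf"),
    check.names = FALSE, stringsAsFactors = FALSE
)

## Look up the sample of each spectrum through its data file name.
idx <- match(spectrum_file, assay[["Raw Spectral Data File"]])
spectrum_sample <- assay[["Sample Name"]][idx]
spectrum_sample
```

Using match() keeps the result aligned with the spectra, so the sample names could be added as a further per-spectrum annotation.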

The MsBackendMetaboLights backend is considered read-only and thus does not support changing m/z and intensity values directly.

Also, merging of MS data of MsBackendMetaboLights is not supported and thus c() of several Spectra with MS data represented by MsBackendMetaboLights will throw an error.

Initialization and loading of data

New instances of the class can be created with the MsBackendMetaboLights() function. Data is loaded and initialized using the backendInitialize() function which can be configured with parameters mtblsId, assayName and filePattern.

mtblsId must be the ID of a single (existing) MetaboLights data set. Parameter assayName allows defining specific assays of the MetaboLights data set from which the data files should be loaded. If provided, it should be the file name(s) of the respective assay(s) in MetaboLights (use e.g. mtbls_list_files(<MetaboLights ID>, pattern = "^a_") to list all available assay files for a given MetaboLights ID <MetaboLights ID>). By default, with assayName = character(), MS data files from all assays of a data set are loaded. Optional parameter filePattern defines the pattern that should be used to filter the file names of the MS data files. It defaults to the file endings of the supported MS data file formats.

backendInitialize() requires an active internet connection as the function first compares the remote file content to the locally cached files and synchronizes changes/updates if needed. This can be skipped with offline = TRUE, in which case only locally cached content is queried.
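The effect of filePattern can be illustrated with a plain regular expression match in base R. This is a sketch only: the file names are hypothetical, and grepl() is used here to show what the pattern matches, not to describe how the backend filters files internally.

```r
## Hypothetical file names of a data set.
files <- c("run1.mzML", "run2.mzXML", "run3.cdf", "run4.CDF",
           "run5.raw", "run6.mzML.zip")

## Default pattern of backendInitialize(): mzML, CDF/cdf and mzXML endings.
default_pattern <- "mzML$|CDF$|cdf$|mzXML$"
kept <- files[grepl(default_pattern, files)]
kept

## A stricter pattern that would restrict the selection to mzML files.
mzml_only <- files[grepl("mzML$", files)]
mzml_only
```

Because the pattern is anchored at the end of the file name with "$", compressed files such as "run6.mzML.zip" are not selected.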

The backendRequiredSpectraVariables() function returns the names of the spectra variables required for the backend to provide the MS data.

The mtbls_sync() function can be used to synchronize the local data cache and ensure that all data files are locally available. The function checks the local cache and, if needed, downloads missing data files from the MetaboLights repository.

Author

Philippine Louail, Johannes Rainer

Examples


library(MsBackendMetaboLights)

## List files of a MetaboLights data set
mtbls_list_files("MTBLS39")
#> [1] "FILES"                                                                                                          
#> [2] "a_MTBLS39_the_plasticity_of_the_grapevine_berry_transcriptome_metabolite_profiling_mass_spectrometry.txt"       
#> [3] "i_Investigation.txt"                                                                                            
#> [4] "m_MTBLS39_the_plasticity_of_the_grapevine_berry_transcriptome_metabolite_profiling_mass_spectrometry_v2_maf.tsv"
#> [5] "metexplore_mapping.json"                                                                                        
#> [6] "s_MTBLS39.txt"                                                                                                  

## Initialize a MsBackendMetaboLights representing all MS data files of
## the data set with the ID "MTBLS39". This will download and cache all
## files and subsequently load and represent them in R.

be <- backendInitialize(MsBackendMetaboLights(), "MTBLS39")
#> Used data files from the assay's column "Raw Spectral Data File" since none were available in column "Derived Spectral Data File".
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/MN063B.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/MN063C.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/CS063B.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/CS063C.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/AM063B.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/AM063C.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/MN073A.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/MN073B.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/MN073C.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/CS073A.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/CS073B.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/CS073C.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/AM073A.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/AM073B.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/AM073C.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/MN083A.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/MN083B.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/MN083C.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/CS083A.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/CS083B.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/CS083C.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/AM083A.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/AM083B.cdf'
#> adding rname 'ftp://ftp.ebi.ac.uk/pub/databases/metabolights/studies/public/MTBLS39/FILES/AM083C.cdf'
be
#> MsBackendMetaboLights with 15141 spectra
#>         msLevel     rtime scanIndex
#>       <integer> <numeric> <integer>
#> 1             1  0.296384         1
#> 2             1  6.206912         2
#> 3             1 12.093056         3
#> 4             1 17.942912         4
#> 5             1 23.835072         5
#> ...         ...       ...       ...
#> 15137         1   2682.81       596
#> 15138         1   2687.29       597
#> 15139         1   2691.77       598
#> 15140         1   2696.27       599
#> 15141         1   2700.81       600
#>  ... 36 more variables/columns.
#> 
#> file(s):
#> MN063A.cdf
#> CS063A.cdf
#> AM063A.cdf
#>  ... 24 more files

## The `mtbls_sync()` function can be used to ensure that all data files are
## available locally. If needed, this function downloads missing data
## files and updates their paths.
be <- mtbls_sync(be)
#> Used data files from the assay's column "Raw Spectral Data File" since none were available in column "Derived Spectral Data File".