The MsBackendMassbank class supports import of MS/MS spectrum data from
files in MassBank format.
After import, the full MS data is kept in memory. MsBackendMassbank
directly extends the Spectra::MsBackendDataFrame() backend
and thus supports the Spectra::applyProcessing() function to make
data manipulations persistent.
New objects are created with the MsBackendMassbank() function. The
backendInitialize() method must subsequently be called to
initialize the object and import MS/MS data from one or more MassBank
files. Parameter metaBlocks defines which sets of spectrum
metadata are imported. Optional parameter nonStop specifies
whether the import stops with an error if one of the text files
lacks required data, such as mz and intensity values (nonStop = FALSE, the default), or whether only the affected file(s) are skipped and a
warning is shown (nonStop = TRUE). Note that any other error
aborts the import regardless of parameter nonStop.
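As a minimal sketch of the nonStop behaviour described above, the call below imports the example files shipped with the package and skips any file lacking required data instead of aborting. It assumes that the MsBackendMassbank package and its bundled extdata files are installed; how many spectra are returned depends on which files could be read.

```r
library(MsBackendMassbank)

## MassBank example files bundled with the package
fls <- dir(system.file("extdata", package = "MsBackendMassbank"),
           full.names = TRUE, pattern = "txt$")

## nonStop = TRUE: files lacking required data (e.g. m/z or intensity
## values) are skipped with a warning instead of aborting the import.
## Any other error still aborts the import.
be <- backendInitialize(MsBackendMassbank(), fls, nonStop = TRUE)

## Number of spectra that were imported
length(be)
```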
MassBank supports multiple values for some metadata fields. For example,
more than one compound name can be defined for a spectrum. The
respective spectra variables for these metadata fields are therefore returned
as a list (see examples for more information). The fields supporting
multiple values, i.e., the spectra variables stored as a list, are:
"name""chrom_solvent", returned formetaBlocks = metaDataBlocks(ac = TRUE)"comment", returned formetaBlocks = metaDataBlocks(comment = TRUE)"data_processing_comment", returned formetaBlocks = metaDataBlocks(ms = TRUE)`"data_processing_reanalyze", returned formetaBlocks = metaDataBlocks(ms = TRUE)"data_processing_whole", returned formetaBlocks = metaDataBlocks(ms = TRUE)"sample", returned formetaBlocks = metaDataBlocks(sp = TRUE)
Usage
# S4 method for class 'MsBackendMassbank'
backendInitialize(
  object,
  files,
  metaBlocks = metaDataBlocks(),
  nonStop = FALSE,
  ...,
  BPPARAM = bpparam()
)
MsBackendMassbank()
# S4 method for class 'MsBackendMassbank'
spectraVariableMapping(object, format = c("Massbank"))
# S4 method for class 'MsBackendMassbank'
export(
  object,
  x,
  file = tempfile(),
  mapping = spectraVariableMapping(MsBackendMassbank()),
  ...
)
Arguments
- object
Instance of MsBackendMassbank class.
- files
character with the (full) file name(s) of the MassBank file(s) from which MS/MS data should be imported.
- metaBlocks
data.frame defining the MassBank metadata blocks (i.e., sets of spectra metadata) that should be imported from the MassBank record files. See metaDataBlocks() for more information.
- nonStop
logical(1) defining whether files that do not contain all required fields are skipped with a warning (nonStop = TRUE) or cause the import to stop with an error (nonStop = FALSE, the default).
- ...
Currently ignored.
- BPPARAM
Parameter object defining the parallel processing setup to import data in parallel. Defaults to BPPARAM = bpparam(). See BiocParallel::bpparam() for more information.
- format
for spectraVariableMapping(): character(1) defining the format to be used. Currently only format = "Massbank" is supported.
- x
Spectra::Spectra() object that should be exported.
- file
for export(): character(1) defining the output file.
- mapping
for export(): named character vector defining how spectra variables are mapped to fields in the MassBank file. Names are the spectra variable names and values the corresponding field names in the MassBank file. See the output of spectraVariableMapping(MsBackendMassbank()) for the expected format.
Examples
## Create an MsBackendMassbank backend and import data from files in
## MassBank format.
fls <- dir(system.file("extdata", package = "MsBackendMassbank"),
           full.names = TRUE, pattern = "txt$")
be <- backendInitialize(MsBackendMassbank(), fls)
#> Start data import from 11 files ...
#> done
#> Merging results ...
#> done
be
#> MsBackendMassbank with 12 spectra
#> msLevel rtime scanIndex
#> <integer> <numeric> <integer>
#> 1 2 NA 1
#> 2 2 NA 1
#> 3 2 142.14 1
#> 4 2 142.14 1
#> 5 2 142.14 1
#> ... ... ... ...
#> 8 2 142.14 1
#> 9 2 142.14 1
#> 10 2 143.94 1
#> 11 2 143.94 1
#> 12 2 143.94 1
#> ... 28 more variables/columns.
## spectra variable `"name"` is of type `list` and provides one or multiple
## compound names/aliases per spectrum:
be$name
#> [[1]]
#> [1] "Veratramine"
#> [2] "(3beta,23R)-14,15,16,17-Tetradehydroveratraman-3,23-diol"
#>
#> [[2]]
#> [1] "Carbazole" "9H-carbazole"
#>
#> [[3]]
#> [1] "L-Tryptophan"
#> [2] "(2S)-2-amino-3-(1H-indol-3-yl)propanoic acid"
#>
#> [[4]]
#> [1] "L-Tryptophan"
#> [2] "(2S)-2-amino-3-(1H-indol-3-yl)propanoic acid"
#>
#> [[5]]
#> [1] "L-Tryptophan"
#> [2] "(2S)-2-amino-3-(1H-indol-3-yl)propanoic acid"
#>
#> [[6]]
#> [1] "L-Tryptophan"
#> [2] "(2S)-2-amino-3-(1H-indol-3-yl)propanoic acid"
#>
#> [[7]]
#> [1] "L-Tryptophan"
#> [2] "(2S)-2-amino-3-(1H-indol-3-yl)propanoic acid"
#>
#> [[8]]
#> [1] "L-Tryptophan"
#> [2] "(2S)-2-amino-3-(1H-indol-3-yl)propanoic acid"
#>
#> [[9]]
#> [1] "L-Tryptophan"
#> [2] "(2S)-2-amino-3-(1H-indol-3-yl)propanoic acid"
#>
#> [[10]]
#> [1] "L-Tryptophan"
#>
#> [[11]]
#> [1] "L-Tryptophan"
#> [2] "(2S)-2-amino-3-(1H-indol-3-yl)propanoic acid"
#>
#> [[12]]
#> [1] "L-Tryptophan"
#> [2] "(2S)-2-amino-3-(1H-indol-3-yl)propanoic acid"
#>
be$msLevel
#> [1] 2 2 2 2 2 2 2 2 2 2 2 2
be$intensity
#> NumericList of length 12
#> [[1]] 12461 2208 2394 40390 2816 3122 2233 ... 23807 6937 2914 6059 1871 9233
#> [[2]] 650.7 14157.3
#> [[3]] 646 980 2114 20052 1248 7628 2036 494048 75708
#> [[4]] 10186 142 142 750 138 490 126 ... 11254 14266 1478 1600 16504 1446 109762
#> [[5]] 324 184 138 3770 500 800 7214 3238 ... 2802 206 898 162 166 814 250 1840
#> [[6]] 646 980 2114 20052 1248 7628 2036 494048 75708
#> [[7]] 646 980 2114 20052 1248 7628 2036 494048 75708
#> [[8]] 10186 142 142 750 138 490 126 ... 11254 14266 1478 1600 16504 1446 109762
#> [[9]] 324 184 138 3770 500 800 7214 3238 ... 2802 206 898 162 166 814 250 1840
#> [[10]] 150 200 32 232 80 12162
#> ...
#> <2 more elements>
be$mz
#> NumericList of length 12
#> [[1]] 84.1 105.1 107.1 114.1 115.1 119.1 ... 393.3 396.3 410.3 411.3 414.3
#> [[2]] 115.0167 168.0809
#> [[3]] 74.0233 132.0807 144.0805 146.0598 ... 170.0597 188.0699 205.0965
#> [[4]] 74.0232 77.0381 86.0027 91.0539 ... 160.0947 170.0596 171.0625 188.07
#> [[5]] 53.0019 53.0383 63.0225 65.0381 ... 158.0817 159.0921 160.0755 170.06
#> [[6]] 74.0233 132.0807 144.0805 146.0598 ... 170.0597 188.0699 205.0965
#> [[7]] 74.0233 132.0807 144.0805 146.0598 ... 170.0597 188.0699 205.0965
#> [[8]] 74.0232 77.0381 86.0027 91.0539 ... 160.0947 170.0596 171.0625 188.07
#> [[9]] 53.0019 53.0383 63.0225 65.0381 ... 158.0817 159.0921 160.0755 170.06
#> [[10]] 72.0095 116.0517 117.0554 159.0935 186.0558 203.0826
#> ...
#> <2 more elements>
## spectra variables imported by default:
spectraVariables(be)
#> [1] "msLevel" "rtime"
#> [3] "acquisitionNum" "scanIndex"
#> [5] "mz" "intensity"
#> [7] "dataStorage" "dataOrigin"
#> [9] "centroided" "smoothed"
#> [11] "polarity" "precScanNum"
#> [13] "precursorMz" "precursorIntensity"
#> [15] "precursorCharge" "collisionEnergy"
#> [17] "isolationWindowLowerMz" "isolationWindowTargetMz"
#> [19] "isolationWindowUpperMz" "acquistionNum"
#> [21] "accession" "name"
#> [23] "smiles" "exactmass"
#> [25] "formula" "inchi"
#> [27] "cas" "inchikey"
#> [29] "adduct" "splash"
#> [31] "title"
## Initializing a backend reading additional metadata columns/information
mb <- metaDataBlocks(ms = TRUE, ac = TRUE)
mb
#> metadata read
#> 1 ac TRUE
#> 2 ch FALSE
#> 3 sp FALSE
#> 4 ms TRUE
#> 5 record FALSE
#> 6 pk FALSE
#> 7 comment FALSE
be <- backendInitialize(MsBackendMassbank(), fls, metaBlocks = mb)
#> Start data import from 11 files ...
#> done
#> Merging results ...
#> done
## additional spectra variables are now available
spectraVariables(be)
#> [1] "msLevel" "rtime"
#> [3] "acquisitionNum" "scanIndex"
#> [5] "mz" "intensity"
#> [7] "dataStorage" "dataOrigin"
#> [9] "centroided" "smoothed"
#> [11] "polarity" "precScanNum"
#> [13] "precursorMz" "precursorIntensity"
#> [15] "precursorCharge" "collisionEnergy"
#> [17] "isolationWindowLowerMz" "isolationWindowTargetMz"
#> [19] "isolationWindowUpperMz" "acquistionNum"
#> [21] "accession" "name"
#> [23] "smiles" "exactmass"
#> [25] "formula" "inchi"
#> [27] "cas" "inchikey"
#> [29] "adduct" "splash"
#> [31] "title" "instrument"
#> [33] "instrument_type" "ms_ms_type"
#> [35] "ms_cap_voltage" "ms_col_gas"
#> [37] "ms_desolv_gas_flow" "ms_desolv_temp"
#> [39] "ms_frag_mode" "ms_ionization"
#> [41] "ms_ionization_energy" "ms_ionization_voltage"
#> [43] "ms_laser" "ms_matrix"
#> [45] "ms_mass_accuracy" "ms_mass_range"
#> [47] "ms_reagent_gas" "ms_resolution"
#> [49] "ms_scan_setting" "ms_source_temp"
#> [51] "ms_kinetic_energy" "ms_electron_current"
#> [53] "ms_reaction_time" "chrom_carrier_gas"
#> [55] "chrom_column" "chrom_column_temp"
#> [57] "chrom_column_temp_gradient" "chrom_flow_gradient"
#> [59] "chrom_flow_rate" "chrom_inj_temp"
#> [61] "chrom_inj_temp_gradient" "chrom_rti_kovats"
#> [63] "chrom_rti_lee" "chrom_rti_naps"
#> [65] "chrom_rti_uoa" "chrom_rti_uoa_pred"
#> [67] "chrom_rt" "chrom_rt_uoa_pred"
#> [69] "chrom_solvent" "chrom_transfer_temp"
#> [71] "ims_instrument_type" "ims_drift_gas"
#> [73] "ims_drift_time" "ims_ccs"
#> [75] "general_conc" "focus_base_peak"
#> [77] "focus_derivative_form" "focus_derivative_mass"
#> [79] "focus_derivative_type" "focus_ion_type"
#> [81] "data_processing_comment" "data_processing_deprofile"
#> [83] "data_processing_find_peak" "data_processing_reanalyze"
#> [85] "data_processing_recalibrate" "data_processing_whole"
## for example information on the instrument used
be$instrument
#> [1] "Bruker maXis ESI-QTOF"
#> [2] "LTQ Orbitrap XL Thermo Scientific"
#> [3] "maXis plus UHR-ToF-MS, Bruker Daltonics"
#> [4] "maXis plus UHR-ToF-MS, Bruker Daltonics"
#> [5] "maXis plus UHR-ToF-MS, Bruker Daltonics"
#> [6] "maXis plus UHR-ToF-MS, Bruker Daltonics"
#> [7] "maXis plus UHR-ToF-MS, Bruker Daltonics"
#> [8] "maXis plus UHR-ToF-MS, Bruker Daltonics"
#> [9] "maXis plus UHR-ToF-MS, Bruker Daltonics"
#> [10] "maXis plus UHR-ToF-MS, Bruker Daltonics"
#> [11] "maXis plus UHR-ToF-MS, Bruker Daltonics"
#> [12] "maXis plus UHR-ToF-MS, Bruker Daltonics"
## or the software/workflow used to process the data
be$data_processing_whole
#> [1] NA "RMassBank 1.5.2.3" "RMassBank 2.4.0"
#> [4] "RMassBank 2.4.0" "RMassBank 2.4.0" "RMassBank 2.4.0"
#> [7] "RMassBank 2.4.0" "RMassBank 2.4.0" "RMassBank 2.4.0"
#> [10] "RMassBank 2.4.0" "RMassBank 2.4.0" "RMassBank 2.4.0"
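The export() method listed under Usage is not exercised in the examples above. The following is a hedged sketch of a round trip, assuming the Spectra and MsBackendMassbank packages are installed and that the spectra variables imported by default are sufficient for export. The Spectra-level call export(sps, backend = ...) is used here on the assumption that it hands over to the backend method documented on this page; the content of the written file is not shown because it depends on the package version.

```r
library(Spectra)
library(MsBackendMassbank)

fls <- dir(system.file("extdata", package = "MsBackendMassbank"),
           full.names = TRUE, pattern = "txt$")

## Import the data and wrap the backend into a Spectra object
be <- backendInitialize(MsBackendMassbank(), fls)
sps <- Spectra(be)

## Default mapping between spectra variable names (names) and
## MassBank field names (values)
map <- spectraVariableMapping(MsBackendMassbank())
head(map)

## Export all spectra to a temporary file in MassBank format
fl <- tempfile(fileext = ".txt")
export(sps, backend = MsBackendMassbank(), file = fl, mapping = map)

## Inspect the first lines of the exported file
head(readLines(fl))
```

Passing a modified copy of map as mapping is the documented way to control which spectra variable is written to which MassBank field.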
