The MsBackendMsp class supports import of MS/MS spectra data from files in NIST MSP file format. MsBackendMsp extends the MsBackendDataFrame() backend directly and thus supports the applyProcessing() function to make data manipulations persistent.
New objects are created with the MsBackendMsp() function. The backendInitialize() method must then be called to initialize the object and import MS/MS data from one or more MSP files.
The MsBackendMsp backend provides an export() method that exports the data from the Spectra object (parameter x) to a file in MSP format.
Parameters to this function are:
- x: the Spectra object that should be exported.
- file: character(1) with the desired file name.
- mapping: named character providing the mapping between spectra variables and MSP data fields. Defaults to mapping = spectraVariableMapping(MsBackendMsp()).
- allVariables: logical(1) whether all spectra variables in x should be exported or only those defined with mapping.
- exportName: logical(1) whether a NAME field should always be exported even if not provided in x.
See the package vignette for details and examples.
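The export() parameters described above can be combined as in the following minimal sketch. It assumes the Spectra and MsBackendMsp packages are installed and uses the example file shipped with MsBackendMsp; the temporary output file name is arbitrary.

```r
library(Spectra)
library(MsBackendMsp)

## Example file shipped with the package (MoNA/LipidBlast flavour).
f <- system.file("extdata", "small-export-LipidBlast.msp",
                 package = "MsBackendMsp")
map <- spectraVariableMapping(MsBackendMsp(), format = "mona")
sps <- Spectra(backendInitialize(MsBackendMsp(), f, mapping = map))

## Export only the spectra variables listed in `map`, written with the
## MoNA field names, to a temporary file.
out <- tempfile(fileext = ".msp")
export(MsBackendMsp(), sps, file = out, mapping = map,
       allVariables = FALSE)

## Inspect the first lines of the exported file.
head(readLines(out))
```

Passing the same mapping that was used for import should keep the field names of the exported file consistent with the original flavour.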
The spectraVariableMapping() function provides the mapping between spectra variable names (i.e. the names that will be used for the spectra variables in the Spectra() object) and the data field names of the MSP file. Parameter format selects one of the pre-defined mappings. Currently supported mapping flavors are:
- format = "msp": default MSP field names. Should work with standard NIST MSP files or MSP files exported from MS-DIAL.
- format = "mona": MSP file format from MoNA including LipidBlast.
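Because the mapping is a plain named character vector, a pre-defined mapping can be adapted before import. The sketch below assumes the MsBackendMsp package is installed and uses its bundled example file; the spectra variable name ce_text is a hypothetical name chosen only for illustration.

```r
library(MsBackendMsp)

f <- system.file("extdata", "small-export-LipidBlast.msp",
                 package = "MsBackendMsp")

## Start from the default mapping and adapt single entries.
map <- spectraVariableMapping(MsBackendMsp(), format = "msp")
## This file uses "PrecursorMZ" instead of "PRECURSORMZ".
map["precursorMz"] <- "PrecursorMZ"
## Map the "Collision_energy" field to a custom spectra variable
## ("ce_text" is a hypothetical name).
map <- c(map, ce_text = "Collision_energy")

be <- backendInitialize(MsBackendMsp(), f, mapping = map)
be$precursorMz
be$ce_text
```

For files that follow the MoNA flavour throughout, using format = "mona" directly is likely the simpler choice.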
Usage
# S4 method for class 'MsBackendMsp'
backendInitialize(
  object,
  file,
  mapping = spectraVariableMapping(object),
  ...,
  BPPARAM = SerialParam()
)

MsBackendMsp()

# S4 method for class 'MsBackendMsp'
spectraVariableMapping(object, format = c("msp", "mona"))

# S4 method for class 'MsBackendMsp'
export(
  object,
  x,
  file = tempfile(),
  mapping = spectraVariableMapping(MsBackendMsp()),
  allVariables = TRUE,
  exportName = TRUE,
  ...
)
Arguments
- object
  Instance of MsBackendMsp class.
- file
  character with the (full) file name(s) of the MSP file(s) from which MS/MS data should be imported or to which they should be exported.
- mapping
  named character vector to rename MSP fields to spectra variables (see spectraVariableMapping()). This also allows custom fields, or data from files with different MSP flavors, to be imported correctly.
- ...
  Currently ignored.
- BPPARAM
  Parameter object defining the parallel processing setup to import data in parallel. Defaults to BPPARAM = SerialParam(). See bpparam() for more information. Parallel processing makes most sense for import from a large set of individual MSP files, but could also improve performance for import from a (very large) single MSP file.
- format
  For spectraVariableMapping(): character(1) specifying for which MSP flavour the mapping should be returned. Currently supported are: format = "msp" (generic MSP format, for example for MS-DIAL MSP files) and format = "mona" (MSP files in MoNA flavour).
- x
  For export(): a Spectra() object that should be exported to the specified MSP file.
- allVariables
  logical(1) whether all spectra variables in x should be exported or only those defined with mapping.
- exportName
  logical(1) whether a NAME field should always be exported even if not provided in x.
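As a rough illustration of the BPPARAM argument, the sketch below imports two MSP files in parallel. It assumes the MsBackendMsp and BiocParallel packages are installed; the two temporary files are simply copies of the bundled example file, and MulticoreParam with two workers is one possible choice (on Windows, SnowParam would be the usual alternative).

```r
library(MsBackendMsp)
library(BiocParallel)

f <- system.file("extdata", "small-export-LipidBlast.msp",
                 package = "MsBackendMsp")

## Two temporary copies stand in for a larger set of MSP files
## (file names are arbitrary).
fls <- file.path(tempdir(), c("lib_part1.msp", "lib_part2.msp"))
for (fl in fls)
    file.copy(f, fl, overwrite = TRUE)

## Import both files, distributing the work over two workers.
be <- backendInitialize(
    MsBackendMsp(), fls,
    mapping = spectraVariableMapping(MsBackendMsp(), "mona"),
    BPPARAM = MulticoreParam(workers = 2))
length(be)
```

For a handful of small files the overhead of starting workers will probably outweigh any gain, so the default SerialParam() remains a sensible choice in that case.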
Value
MsBackendMsp() and backendInitialize() return an instance of the MsBackendMsp class. spectraVariableMapping() returns a named character vector with the mapping between spectra variables and MSP data fields.
Note
Format requirements/assumptions of MSP files:
- Comment lines are expected to start with a #.
- Multiple spectra within the same MSP file are separated by an empty line.
- The first n lines of a spectrum entry represent metadata.
- Metadata is provided as "name: value" pairs (i.e. name and value separated by a ":").
- One line per mass peak, with values separated by a whitespace or tabulator.
- Each line is expected to contain at least the m/z and intensity values (in that order) of a peak. Additional values are currently ignored.
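The layout rules above can be made concrete with a small base R sketch. The two spectra and the function parse_msp_sketch() are hypothetical and exist only for illustration; this is not the parser used by MsBackendMsp, and it identifies metadata lines simply by the presence of a ":".

```r
## Two hypothetical spectra written in the layout described above.
msp <- c(
    "# comment lines start with '#'",
    "NAME: Compound A",
    "PRECURSORMZ: 150.1",
    "Num Peaks: 2",
    "75.05 20",
    "150.1\t100 extra",
    "",
    "NAME: Compound B",
    "Num Peaks: 1",
    "91.05 100"
)

## Illustrative parser (not the code used by MsBackendMsp).
parse_msp_sketch <- function(lines) {
    lines <- lines[!startsWith(lines, "#")]     # drop comment lines
    entry <- cumsum(lines == "") + 1L           # empty line starts a new entry
    keep <- lines != ""
    lapply(split(lines[keep], entry[keep]), function(z) {
        is_meta <- grepl(":", z, fixed = TRUE)  # "name: value" lines
        meta <- sub("^[^:]*:[[:space:]]*", "", z[is_meta])
        names(meta) <- sub(":.*$", "", z[is_meta])
        ## Keep only the first two values (m/z, intensity) per peak line.
        pk <- strsplit(trimws(z[!is_meta]), "[[:space:]]+")
        peaks <- do.call(rbind, lapply(pk, function(p) as.numeric(p[1:2])))
        colnames(peaks) <- c("mz", "intensity")
        list(metadata = meta, peaks = peaks)
    })
}

res <- parse_msp_sketch(msp)
res[[1]]$metadata[["NAME"]]
res[[2]]$peaks
```

Note how the third value on the second peak line of the first spectrum is dropped, mirroring the rule that additional values are ignored.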
Examples
## Import spectra from a MSP file from LipidBlast
f <- system.file("extdata", "small-export-LipidBlast.msp",
                 package = "MsBackendMsp")
be <- backendInitialize(MsBackendMsp(), f)
be
#> MsBackendMsp with 5 spectra
#>     msLevel     rtime scanIndex
#>   <integer> <numeric> <integer>
#> 1         2        NA        NA
#> 2         2        NA        NA
#> 3         2        NA        NA
#> 4         2        NA        NA
#> 5         2        NA        NA
#> ... 32 more variables/columns.
be$msLevel
#> [1] 2 2 2 2 2
be$intensity
#> NumericList of length 5
#> [[1]] 80.08008 100
#> [[2]] 80.08008 100
#> [[3]] 80.08008 100
#> [[4]] 30.03003 50.05005 50.05005 50.05005 ... 70.07007 50.05005 50.05005 100
#> [[5]] 30.03003 50.05005 50.05005 50.05005 ... 70.07007 50.05005 50.05005 100
be$mz
#> NumericList of length 5
#> [[1]] 85.02841 232.1543
#> [[2]] 85.02841 246.17
#> [[3]] 85.02841 260.1856
#> [[4]] 152.9958 227.2016 283.2643 327.2329 ... 691.4344 703.5283 1451.996
#> [[5]] 152.9958 253.2173 255.233 281.2486 ... 673.4814 721.4814 1451.996
## The precursor m/z values are, however, all missing
be$precursorMz
#> [1] NA NA NA NA NA
## Default spectra variable mapping
spectraVariableMapping(MsBackendMsp())
#>            name       accession         formula        inchikey          adduct
#>          "NAME"           "DB#"       "FORMULA"      "INCHIKEY" "PRECURSORTYPE"
#>       exactmass           rtime     precursorMz          adduct          smiles
#>     "EXACTMASS" "RETENTIONTIME"   "PRECURSORMZ" "PRECURSORTYPE"        "SMILES"
#>           inchi        polarity      instrument
#>         "INCHI"       "IONMODE"    "INSTRUMENT"
## In fact, to read MSP files in "LipidBlast flavour" (same as MoNA) we
## should use a different spectra variable mapping
spectraVariableMapping(MsBackendMsp(), "mona")
#>                  name               synonym             accession
#>                "Name"               "Synon"                 "DB#"
#>              inchikey                adduct           precursorMz
#>            "InChIKey"      "Precursor_type"         "PrecursorMZ"
#>              polarity               formula             exactmass
#>            "Ion_mode"             "Formula"           "ExactMass"
#> collision_energy_text               msLevel            instrument
#>    "Collision_energy"       "Spectrum_type"          "Instrument"
#>       instrument_type
#>     "Instrument_type"
## Importing the data with this mapping correctly retrieves the precursor m/z
be <- backendInitialize(MsBackendMsp(), f,
                        mapping = spectraVariableMapping(MsBackendMsp(), "mona"))
be$precursorMz
#> [1] 232.1543 246.1700 260.1856 1451.9962 1451.9962
## Other fields are also correctly mapped, but might need to be converted
## to e.g. numeric, such as "exactmass"
be$exactmass
#> [1] "232.1543346040907" "246.16998466809073" "260.18563473209076"
#> [4] "1453.003526472" "1453.003526472"
be$exactmass <- as.numeric(be$exactmass)
be$adduct
#> [1] "[M]+" "[M]+" "[M]+" "[M-H]-" "[M-H]-"
be$formula
#> [1] "[C11H22NO4]+" "[C12H24NO4]+" "[C13H26NO4]+" "C81H146O17P2" "C81H146O17P2"
## Exporting Spectra objects in MSP format.
sps <- Spectra(be)
export(MsBackendMsp(), sps, file = stdout())
#> NAME: ACar 4:0
#> msLevel: MS2
#> IONMODE: Positive
#> PRECURSORMZ: 232.15433
#> Comments: "SMILES=CCCC(=O)OC(CC(O)=O)C[N+](C)(C)C" "compound class=ACar" "computed SMILES=O=C(O)CC(OC(=O)CCC)C[N+](C)(C)C" "computed InChI=InChI=1S/C11H21NO4/c1-5-6-11(15)16-9(7-10(13)14)8-12(2,3)4/h9H,5-8H2,1-4H3/p+1" "retention time=0.51" "collision energy spread=15 V" "author=Tobias Kind, Hiroshi Tsugawa" "computed mass accuracy=2.3431646471704717" "computed mass error=5.439758187435473E-4" "SPLASH=splash10-001r-7090000000-aa12589a2481560ea0d5" "submitter=Tobias Kind (University of California, Davis)" "MoNA Rating=3.6363636363636367"
#> MW: 232
#> Num.Peaks: 2
#> DB#: LipidBlast000001
#> PRECURSORTYPE: [M]+
#> collision_energy_text: 45 V
#> EXACTMASS: 232.154334604091
#> FORMULA: [C11H22NO4]+
#> INCHIKEY: QWYFHHGCZUCMBN-UHFFFAOYSA-O
#> INSTRUMENT: SCIEX 5600
#> instrument_type: in-silico QTOF
#> synonym: [M]+
#> synonym: $:00in-source
#> Num Peaks: 2
#> 85.02841 80.08008
#> 232.1543 100
#>
#> NAME: ACar 5:0
#> msLevel: MS2
#> IONMODE: Positive
#> PRECURSORMZ: 246.16998
#> Comments: "SMILES=CCCCC(=O)OC(CC(O)=O)C[N+](C)(C)C" "compound class=ACar" "computed SMILES=O=C(O)CC(OC(=O)CCCC)C[N+](C)(C)C" "computed InChI=InChI=1S/C12H23NO4/c1-5-6-7-12(16)17-10(8-11(14)15)9-13(2,3)4/h10H,5-9H2,1-4H3/p+1" "retention time=0.68" "collision energy spread=15 V" "author=Tobias Kind, Hiroshi Tsugawa" "computed mass accuracy=2.209496944908765" "computed mass error=5.439118187382519E-4" "SPLASH=splash10-000b-7090000000-2b596f4df94dfefba50b" "submitter=Tobias Kind (University of California, Davis)" "MoNA Rating=3.6363636363636367"
#> MW: 246
#> Num.Peaks: 2
#> DB#: LipidBlast000002
#> PRECURSORTYPE: [M]+
#> collision_energy_text: 45 V
#> EXACTMASS: 246.169984668091
#> FORMULA: [C12H24NO4]+
#> INCHIKEY: VSNFQQXVMPSASB-UHFFFAOYSA-O
#> INSTRUMENT: SCIEX 5600
#> instrument_type: in-silico QTOF
#> synonym: [M]+
#> synonym: $:00in-source
#> Num Peaks: 2
#> 85.02841 80.08008
#> 246.17 100
#>
#> NAME: ACar 6:0
#> msLevel: MS2
#> IONMODE: Positive
#> PRECURSORMZ: 260.18563
#> Comments: "SMILES=CCCCCC(=O)OC(CC(O)=O)C[N+](C)(C)C" "compound class=ACar" "computed SMILES=O=C(O)CC(OC(=O)CCCCC)C[N+](C)(C)C" "computed InChI=InChI=1S/C13H25NO4/c1-5-6-7-8-13(17)18-11(9-12(15)16)10-14(2,3)4/h11H,5-10H2,1-4H3/p+1" "retention time=0.86" "collision energy spread=15 V" "author=Tobias Kind, Hiroshi Tsugawa" "computed mass accuracy=2.0902300356715053" "computed mass error=5.438478186761131E-4" "SPLASH=splash10-03dr-7090000000-a9ec485bc43949b4b278" "submitter=Tobias Kind (University of California, Davis)" "MoNA Rating=3.6363636363636367"
#> MW: 260
#> Num.Peaks: 2
#> DB#: LipidBlast000003
#> PRECURSORTYPE: [M]+
#> collision_energy_text: 45 V
#> EXACTMASS: 260.185634732091
#> FORMULA: [C13H26NO4]+
#> INCHIKEY: VVPRQWTYSNDTEA-UHFFFAOYSA-O
#> INSTRUMENT: SCIEX 5600
#> instrument_type: in-silico QTOF
#> synonym: [M]+
#> synonym: $:00in-source
#> Num Peaks: 2
#> 85.02841 80.08008
#> 260.1856 100
#>
#> NAME: CL 72:6
#> msLevel: MS2
#> IONMODE: Negative
#> PRECURSORMZ: 1451.99625
#> Comments: "SMILES=CCCCCCCCCCCCCCCCCC(=O)OCC(COP(O)(=O)OCC(O)COP(O)(=O)OCC(COC(=O)CC\C=C/C\C=C/C\C=C/C\C=C/C\C=C/C\C=C/CC)OC(=O)CCCCCCCCCCCCC)OC(=O)CCCCCCCCCCCCCCCCC" "compound class=CL" "computed SMILES=O=C(OCC(OC(=O)CCCCCCCCCCCCC)COP(=O)(O)OCC(O)COP(=O)(O)OCC(OC(=O)CCCCCCCCCCCCCCCCC)COC(=O)CCCCCCCCCCCCCCCCC)CCC=CCC=CCC=CCC=CCC=CCC=CCC" "computed InChI=InChI=1S/C81H146O17P2/c1-5-9-13-17-21-25-29-32-35-36-37-38-41-43-47-50-54-58-62-66-79(84)91-71-76(97-80(85)67-63-59-55-51-45-28-24-20-16-12-8-4)73-95-99(87,88)93-69-75(82)70-94-100(89,90)96-74-77(98-81(86)68-64-60-56-52-48-44-40-34-31-27-23-19-15-11-7-3)72-92-78(83)65-61-57-53-49-46-42-39-33-30-26-22-18-14-10-6-2/h9,13,21,25,32,35,37-38,43,47,54,58,75-77,82H,5-8,10-12,14-20,22-24,26-31,33-34,36,39-42,44-46,48-53,55-57,59-74H2,1-4H3,(H,87,88)(H,89,90)/b13-9-,25-21-,35-32-,38-37-,47-43-,58-54-" "retention time=11.91" "collision energy spread=15 V" "author=Tobias Kind, Hiroshi Tsugawa" "computed mass accuracy=3.2506971738511424E-4" "computed mass error=-4.720000106317457E-7" "SPLASH=splash10-0w4i-0134901100-d27b9577a1230ec65177" "submitter=Tobias Kind (University of California, Davis)" "MoNA Rating=4.5"
#> MW: 1453
#> Num.Peaks: 10
#> DB#: LipidBlast395475
#> PRECURSORTYPE: [M-H]-
#> collision_energy_text: 45 V
#> EXACTMASS: 1453.003526472
#> FORMULA: C81H146O17P2
#> INCHIKEY: KWABUIIFXPUAEH-OXNFOSRESA-N
#> INSTRUMENT: SCIEX 5600
#> instrument_type: in-silico QTOF
#> synonym: CL 14:0-22:6-18:0-18:0
#> synonym: $:00in-source
#> Num Peaks: 10
#> 152.9958 30.03003
#> 227.2016 50.05005
#> 283.2643 50.05005
#> 327.2329 50.05005
#> 363.1942 70.07007
#> 419.2568 70.07007
#> 463.2255 70.07007
#> 691.4344 50.05005
#> 703.5283 50.05005
#> 1451.996 100
#>
#> NAME: CL 72:6
#> msLevel: MS2
#> IONMODE: Negative
#> PRECURSORMZ: 1451.99625
#> Comments: "SMILES=CCCCCCCCCCCCCCCC(=O)OC(COC(=O)CCCCCCC\C=C/CCCCCCCC)COP(O)(=O)OCC(O)COP(O)(=O)OCC(COC(=O)CCCCCCCC\C=C/C\C=C/C\C=C/C\C=C/CC)OC(=O)CCCCCCC\C=C/CCCCCC" "compound class=CL" "computed SMILES=O=C(OCC(OC(=O)CCCCCCCCCCCCCCC)COP(=O)(O)OCC(O)COP(=O)(O)OCC(OC(=O)CCCCCCCC=CCCCCCC)COC(=O)CCCCCCCCC=CCC=CCC=CCC=CCC)CCCCCCCC=CCCCCCCCC" "computed InChI=InChI=1S/C81H146O17P2/c1-5-9-13-17-21-25-29-33-35-36-37-38-40-44-46-50-54-58-62-66-79(84)92-72-77(98-81(86)68-64-60-56-52-48-42-32-28-24-20-16-12-8-4)74-96-100(89,90)94-70-75(82)69-93-99(87,88)95-73-76(97-80(85)67-63-59-55-51-47-41-31-27-23-19-15-11-7-3)71-91-78(83)65-61-57-53-49-45-43-39-34-30-26-22-18-14-10-6-2/h9,13,21,25,28,32-35,37-39,75-77,82H,5-8,10-12,14-20,22-24,26-27,29-31,36,40-74H2,1-4H3,(H,87,88)(H,89,90)/b13-9-,25-21-,32-28-,35-33-,38-37-,39-34-" "retention time=11.65" "collision energy spread=15 V" "author=Tobias Kind, Hiroshi Tsugawa" "computed mass accuracy=3.2506971738511424E-4" "computed mass error=-4.720000106317457E-7" "SPLASH=splash10-0uyi-0157901100-d3b7dfa9dd7050026484" "submitter=Tobias Kind (University of California, Davis)" "MoNA Rating=4.5"
#> MW: 1453
#> Num.Peaks: 12
#> DB#: LipidBlast398514
#> PRECURSORTYPE: [M-H]-
#> collision_energy_text: 45 V
#> EXACTMASS: 1453.003526472
#> FORMULA: C81H146O17P2
#> INCHIKEY: WINXBODRGKNJDS-DUULZMMUSA-N
#> INSTRUMENT: SCIEX 5600
#> instrument_type: in-silico QTOF
#> synonym: CL 16:0-18:1-16:1-22:4
#> synonym: $:00in-source
#> Num Peaks: 12
#> 152.9958 30.03003
#> 253.2173 50.05005
#> 255.233 50.05005
#> 281.2486 50.05005
#> 331.2643 50.05005
#> 389.2098 70.07007
#> 391.2255 70.07007
#> 417.2411 70.07007
#> 467.2568 70.07007
#> 673.4814 50.05005
#> 721.4814 50.05005
#> 1451.996 100
#>