Spectra MS backend storing data in a SQL database

The MsBackendSql is an implementation of the Spectra::MsBackend() class for Spectra::Spectra() objects that stores MS data in, and retrieves it from, a SQL database. New databases can be created from raw MS data files using createMsBackendSqlDatabase().

Usage

MsBackendSql()

createMsBackendSqlDatabase(
  dbcon,
  x = character(),
  backend = MsBackendMzR(),
  chunksize = 10L,
  blob = TRUE,
  peaksStorageMode = c("blob2", "long", "blob"),
  partitionBy = c("none", "spectrum", "chunk"),
  partitionNumber = 10L
)

# S4 method for class 'MsBackendSql'
show(object)

# S4 method for class 'MsBackendSql'
backendInitialize(object, dbcon, data, ...)

# S4 method for class 'MsBackendSql'
dataStorage(object)

# S4 method for class 'MsBackendSql'
x[i, j, ..., drop = FALSE]

# S4 method for class 'MsBackendSql,ANY'
extractByIndex(object, i)

# S4 method for class 'MsBackendSql'
peaksData(object, columns = c("mz", "intensity"))

# S4 method for class 'MsBackendSql'
peaksVariables(object)

# S4 method for class 'MsBackendSql'
intensity(object)

# S4 method for class 'MsBackendSql'
intensity(object) <- value

# S4 method for class 'MsBackendSql'
mz(object)

# S4 method for class 'MsBackendSql'
mz(object) <- value

# S4 method for class 'MsBackendSql'
x$name <- value

# S4 method for class 'MsBackendSql'
spectraData(object, columns = spectraVariables(object))

# S4 method for class 'MsBackendSql'
reset(object)

# S4 method for class 'MsBackendSql'
spectraNames(object)

# S4 method for class 'MsBackendSql'
spectraNames(object) <- value

# S4 method for class 'MsBackendSql'
filterMsLevel(object, msLevel = uniqueMsLevels(object))

# S4 method for class 'MsBackendSql'
filterRt(object, rt = numeric(), msLevel. = integer())

# S4 method for class 'MsBackendSql'
filterDataOrigin(object, dataOrigin = character())

# S4 method for class 'MsBackendSql'
filterPrecursorMzRange(object, mz = numeric())

# S4 method for class 'MsBackendSql'
filterPrecursorMzValues(object, mz = numeric(), ppm = 20, tolerance = 0)

# S4 method for class 'MsBackendSql'
uniqueMsLevels(object, ...)

# S4 method for class 'MsBackendSql'
backendMerge(object, ...)

# S4 method for class 'MsBackendSql'
precScanNum(object)

# S4 method for class 'MsBackendSql'
centroided(object)

# S4 method for class 'MsBackendSql'
smoothed(object)

# S4 method for class 'MsBackendSql'
tic(object, initial = TRUE)

# S4 method for class 'MsBackendSql'
supportsSetBackend(object, ...)

# S4 method for class 'MsBackendSql'
backendBpparam(object, BPPARAM = bpparam())

# S4 method for class 'MsBackendSql'
dbconn(x)

# S4 method for class 'MsBackendSql'
longForm(object, columns = spectraVariables(object))

Arguments

dbcon: Connection to a database.
x: For createMsBackendSqlDatabase(): character with the names of the raw data files from which the data should be imported. For other methods an MsBackendSql instance.
backend: For createMsBackendSqlDatabase(): MS backend that can be used to import MS data from the raw files specified with parameter x.
chunksize: For createMsBackendSqlDatabase(): integer(1) defining the number of input files that should be processed per iteration. With chunksize = 1 the files specified with x are imported, and their data inserted into the database, one at a time. With chunksize = 5 data from 5 files is imported (in parallel) and inserted into the database. Thus, higher values might result in faster database creation but also require more memory.
blob: For createMsBackendSqlDatabase(), setBackend(): logical(1) whether individual m/z and intensity values should be stored separately (blob = FALSE) or if the peaks data should be stored as data type BLOB into the database (blob = TRUE, the default). See also parameter peaksStorageMode for different data storage options.
peaksStorageMode: character(1) defining how peaks variables are stored in the database. The default peaksStorageMode = "blob2" stores the full peaks matrix of each spectrum as data type BLOB as one entry in a single database table column. In contrast, peaksStorageMode = "blob" stores the m/z and intensity vectors as separate BLOB types in two database table columns. peaksStorageMode = "long" stores the data in long mode, i.e. each intensity and m/z value is stored individually in the database.
partitionBy: For createMsBackendSqlDatabase(): character(1) defining if and how the peak data table should be partitioned. "none" (default): no partitioning. "spectrum": peaks are assigned to a partition based on the spectrum ID (number), i.e. consecutive spectra are distributed evenly across partitions. For partitionNumber = 3, the first spectrum is assigned to the first partition, the second to the second, the third to the third and the fourth spectrum again to the first partition. "chunk": spectra processed as part of the same chunk are placed into the same partition. All spectra from the next processed chunk are assigned to the next partition. Note that partitioning is only available for MySQL/MariaDB databases, i.e., if dbcon is a MariaDBConnection. See details for more information.
partitionNumber: For createMsBackendSqlDatabase(): integer(1) defining the number of partitions the database table will be partitioned into (only supported for MySQL/MariaDB databases).
object: A MsBackendSql instance.
data: For backendInitialize(): optional DataFrame with the full spectra data that should be inserted into a (new) MsBackendSql database. If provided, it is assumed that dbcon is a (writeable) connection to an empty database into which data should be inserted. data could be the output of spectraData from another backend.
...: For [: ignored. For backendInitialize(), if parameter data is used: additional parameters to be passed to the function creating the database such as blob or peaksStorageMode. For setBackend(): any parameters supported by backendInitialize() or createMsBackendSqlDatabase().
i: For [: integer or logical to subset the object.
j: For [: ignored.
drop: For [: logical(1), ignored.
columns: For spectraData(): character() optionally defining a subset of spectra variables that should be returned. Defaults to columns = spectraVariables(object) hence all variables are returned. For peaksData accessor: optional character with requested columns in the individual matrix of the returned list. Defaults to columns = c("mz", "intensity") but all columns listed by peaksVariables would be supported.
value: For all setter methods: replacement value.
name: For $<-: character(1) with the name of the spectra variable to replace.
msLevel: For filterMsLevel(): integer specifying the MS levels to filter the data.
rt: For filterRt(): numeric(2) with the lower and upper retention time. Spectra with a retention time >= rt[1] and <= rt[2] are returned.
msLevel.: For filterRt(): integer with the MS level(s) on which the retention time filter should be applied (spectra from other MS levels are not filtered and are returned *as is*). If not specified, the retention time filter is applied to all MS levels in object.
dataOrigin: For filterDataOrigin(): character with data origin values to which the data should be subsetted.
mz: For filterPrecursorMzRange(): numeric(2) with the desired lower and upper limit of the precursor m/z range. For filterPrecursorMzValues(): numeric with the m/z value(s) to filter the object.
ppm: For filterPrecursorMzValues(): numeric with the m/z-relative maximal acceptable difference for a m/z value to be considered matching. Can be of length 1 or equal to length(mz).
tolerance: For filterPrecursorMzValues(): numeric with the absolute difference for m/z values to be considered matching. Can be of length 1 or equal to length(mz).
initial: For tic(): logical(1) whether the original total ion count should be returned (initial = TRUE, the default) or whether it should be calculated from the spectra's intensities (initial = FALSE).
BPPARAM: For backendBpparam(): BiocParallel parallel processing setup. See BiocParallel::bpparam() for more information.
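
The round-robin assignment described for partitionBy = "spectrum" can be illustrated with a few lines of base R. This is only a sketch of the rule as stated in the parameter description, not the code used by the package; the helper name partition_by_spectrum is hypothetical.

```r
## Hypothetical helper (not part of MsBackendSql): 1-based partition index
## for each spectrum ID, following the consecutive (round-robin) assignment
## described for partitionBy = "spectrum".
partition_by_spectrum <- function(spectrum_id, partitionNumber = 10L) {
    (as.integer(spectrum_id) - 1L) %% as.integer(partitionNumber) + 1L
}

## With 3 partitions the fourth spectrum wraps around to the first partition
parts <- partition_by_spectrum(1:7, partitionNumber = 3L)
parts
```

With partitionNumber = 3 the first three spectra go to partitions 1, 2 and 3, and the fourth spectrum is again assigned to partition 1, matching the description above.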

Value

See documentation of respective function.

Details

The MsBackendSql class is principally a read-only backend but by extending the Spectra::MsBackendCached() backend from the Spectra package it allows changing and adding (temporarily) spectra variables without changing the original data in the SQL database.

Note

The MsBackendSql backend keeps an (open) connection to the SQL database with the data and hence does not support saving/loading of a backend to disk (e.g. using save or saveRDS). Also, for the same reason, the MsBackendSql does not support parallel processing. The backendBpparam() method for MsBackendSql will thus always return a BiocParallel::SerialParam() object.

The MsBackendOfflineSql() could be used as an alternative as it supports saving/loading the data to/from disk and supports also parallel processing.
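
As a minimal sketch of this alternative, the code below creates a small SQLite database, initializes an MsBackendOfflineSql on it and round-trips the backend through saveRDS()/readRDS(). It assumes that the msdata and RSQLite packages are installed and that backendInitialize() for MsBackendOfflineSql accepts the database driver followed by the arguments for DBI::dbConnect() (here dbname); check the MsBackendOfflineSql help page for the exact signature.

```r
library(Spectra)
library(MsBackendSql)
library(RSQLite)

## Create a small SQLite database from one example file
fl <- system.file("microtofq", "MM8.mzML", package = "msdata")
db_file <- tempfile()
con <- dbConnect(SQLite(), db_file)
createMsBackendSqlDatabase(con, fl)
dbDisconnect(con)

## Assumption: the offline backend takes the driver plus the arguments
## needed by DBI::dbConnect() instead of an open connection.
be_off <- backendInitialize(MsBackendOfflineSql(), SQLite(),
                            dbname = db_file)

## No open connection is kept in the object, so it can be serialized
rds_file <- tempfile(fileext = ".rds")
saveRDS(be_off, rds_file)
be_loaded <- readRDS(rds_file)
n_loaded <- length(be_loaded)
```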

Creation of backend objects

New backend objects can be created with the MsBackendSql() function. SQL databases can be created and filled with MS data from raw data files using the createMsBackendSqlDatabase() function or using backendInitialize() and providing all data with parameter data. In addition it is possible to create a database from a Spectra object changing its backend to a MsBackendSql or MsBackendOfflineSql using the Spectra::setBackend() function. Existing SQL databases (created previously with createMsBackendSqlDatabase() or backendInitialize() with the data parameter) can be loaded using the conventional way to create/initialize MsBackend classes, i.e. using backendInitialize().

createMsBackendSqlDatabase(): create a database and fill it with MS data. Parameter dbcon is expected to be a database connection, parameter x a character vector with the names of the files from which to import the data. Parameter backend is used for the actual data import and defaults to backend = MsBackendMzR(), which supports import from mzML, mzXML or netCDF files. Parameter chunksize defines the number of files (x) from which data is imported in one iteration. With the default chunksize = 10L, data is imported from 10 files in x at the same time (in parallel, if backend supports it) and then inserted into the database. Larger chunk sizes require more memory and more disk space (data import is performed through temporary files) but might be faster overall. Parameter blob defines whether the m/z and intensity values of a spectrum are stored as a BLOB SQL data type in the database (blob = TRUE, the default) or whether the individual m/z and intensity values of each peak are stored separately (blob = FALSE). The latter results in a much larger database and slower performance of the peaksData() function, but allows custom (manual) SQL queries on individual peak values. For blob = TRUE, the peaks data can be stored in two different ways, selected with the additional parameter peaksStorageMode. The default peaksStorageMode = "blob2" stores the full peaks matrix of a spectrum as a single entry in the database (in a single database table column), while peaksStorageMode = "blob" (the default until version 1.7.2) stores the m/z and intensity vectors as BLOB data type in two separate database table columns. peaksData() is thus about twice as fast for peaksStorageMode = "blob2". While data can be stored in any SQL database, at present it is suggested to use MySQL/MariaDB databases.
For dbcon being a connection to a MySQL/MariaDB database, the tables use the ARIA engine, which provides faster data access, and can use table partitioning: tables are split into multiple partitions, which can improve data insertion and index generation. Partitioning is defined with the parameters partitionBy and partitionNumber. With the default partitionBy = "none", no partitioning is performed. For blob = TRUE partitioning is usually not required. Only for blob = FALSE and very large datasets is it suggested to enable table partitioning by selecting either partitionBy = "spectrum" or partitionBy = "chunk". The first option assigns consecutive spectra to different partitions, while the latter puts spectra from files that are part of the same chunk into the same partition. Both options have about the same performance, but partitionBy = "spectrum" requires less disk space. Note that, while inserting the data takes a considerable amount of time, the subsequent creation of database indices can also take very long (even longer than data insertion for blob = FALSE).
backendInitialize(): get access to and initialize a MsBackendSql object. Parameter object is supposed to be a MsBackendSql instance, created e.g. with MsBackendSql(). Parameter dbcon is expected to be a connection to an existing MsBackendSql SQL database (created e.g. with createMsBackendSqlDatabase()). To initialize a MsBackendOfflineSql(), all information required for a DBI::dbConnect() call to connect to the database needs to be provided. backendInitialize() can alternatively be used to create a new MsBackendSql database using the optional data parameter. In this case, dbcon is expected to be a writeable connection to an empty database and data a DataFrame with the full spectra and peaks data to be inserted into this database. The format of data should match the format of the DataFrame returned by the spectraData() function and requires columns "mz" and "intensity" with the m/z and intensity values of each spectrum. The backendInitialize() call will then create all necessary tables in the database, fill these tables with the provided data and return an MsBackendSql for this database. The MsBackendSql and MsBackendOfflineSql objects support the Spectra::setBackend() method from Spectra to change from (any) backend to a MsBackendSql. Any parameters to the backendInitialize() function can be passed to setBackend(). Note however that chunk-wise (or parallel) processing needs to be disabled in this case, if necessary by passing f = factor() to the setBackend() call.
supportsSetBackend(): whether MsBackendSql supports the setBackend() method to change the MsBackend of a Spectra object to a MsBackendSql. Returns TRUE; changing the backend to a MsBackendSql is thus supported if a writeable database connection is additionally provided with parameter dbcon (i.e. setBackend(sps, MsBackendSql(), dbcon = con), with con being a connection to an empty database, would store the full spectra data from the Spectra object sps into the specified database and would return a Spectra object that uses a MsBackendSql).
backendBpparam(): whether a MsBackendSql supports parallel processing. Takes a MsBackendSql and a parallel processing setup (see BiocParallel::bpparam() for details) as input and always returns a BiocParallel::SerialParam() since MsBackendSql does not support parallel processing.
dbconn(): returns the connection to the database.
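
The two ways of creating a database from data that is already loaded in R can be sketched as follows. The example assumes the msdata and RSQLite packages are installed; the setBackend() call is the one given in the description of supportsSetBackend() above, and the explicit request for the "mz" and "intensity" columns is a precaution in case they are not part of the default spectraData() output of the source backend.

```r
library(Spectra)
library(MsBackendSql)
library(RSQLite)

fl <- system.file("microtofq", "MM8.mzML", package = "msdata")

## Option 1: change the backend of an existing Spectra object. The data is
## written into the (empty) database behind `con1`.
sps <- Spectra(fl, source = MsBackendMzR())
con1 <- dbConnect(SQLite(), tempfile())
sps_sql <- setBackend(sps, MsBackendSql(), dbcon = con1)

## Option 2: pass the full spectra data with parameter `data`. Columns
## "mz" and "intensity" are required, so they are requested explicitly.
src <- backendInitialize(MsBackendMzR(), fl)
cols <- union(spectraVariables(src), c("mz", "intensity"))
con2 <- dbConnect(SQLite(), tempfile())
be_new <- backendInitialize(MsBackendSql(), dbcon = con2,
                            data = spectraData(src, columns = cols))

## Access the connection used by the new backend
dbconn(be_new)
```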

Subsetting, merging and filtering data

MsBackendSql objects can be subsetted using the [ or extractByIndex() functions. Internally, this will simply subset the integer vector of the primary keys and any cached data. The original data in the database is not affected by any subsetting operation. Any subsetting operation can be undone by resetting the object with the reset() function. Subsetting in arbitrary order as well as index replication is supported.

Multiple MsBackendSql objects can also be merged (combined) with the backendMerge() function. Note that this requires that all MsBackendSql objects are connected to the same database. This function is thus mostly used for combining MsBackendSql objects that were previously split using e.g. split().
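
A short sketch of subsetting, resetting and merging is given below. It assumes the msdata and RSQLite packages are installed. Calling backendMerge() through do.call() on the unnamed list returned by split() is one way to pass the individual backends; other calling conventions may be supported as well.

```r
library(Spectra)
library(MsBackendSql)
library(RSQLite)

fl <- system.file("microtofq", "MM8.mzML", package = "msdata")
con <- dbConnect(SQLite(), tempfile())
createMsBackendSqlDatabase(con, fl)
be <- backendInitialize(MsBackendSql(), con)

## Subset in arbitrary order, repeating one index
be_sub <- be[c(5, 1, 1, 2)]
ids <- be_sub$spectrum_id_

## Undo the subsetting: reset() re-initializes from the database
be_full <- reset(be_sub)

## Split into two parts and merge them again. Both parts use the same
## database connection, as required by backendMerge().
f <- ifelse(seq_len(length(be)) <= 100, 1L, 2L)
be_parts <- split(be, f)
be_merged <- do.call(backendMerge, unname(be_parts))
```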

In addition, MsBackendSql supports all other filtering methods available through Spectra::MsBackendCached(). Filter functions with implementations optimized for MsBackendSql objects are:

filterDataOrigin(): filter the object retaining spectra with dataOrigin spectra variable values matching the provided ones with parameter dataOrigin. The function returns the results in the order of the values provided with parameter dataOrigin.
filterMsLevel(): filter the object based on the MS levels specified with parameter msLevel. The function does the filtering using SQL queries. If "msLevel" is a local variable stored within the object (and hence in memory) the default implementation in MsBackendCached is used instead.
filterPrecursorMzRange(): filters the data keeping only spectra with a precursorMz within the m/z value range provided with parameter mz (i.e. all spectra with a precursor m/z >= mz[1L] and <= mz[2L]).
filterPrecursorMzValues(): filters the data keeping only spectra with precursor m/z values matching the value(s) provided with parameter mz. Parameters ppm and tolerance specify the acceptable differences between compared values. Lengths of ppm and tolerance can be either 1 or equal to length(mz) to use different values for ppm and tolerance for each provided m/z value.
filterRt(): filter the object keeping only spectra with retention times within the specified retention time range (parameter rt). The optional parameter msLevel. restricts the retention time filter to the provided MS level(s); all spectra from other MS levels are returned unchanged.
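
The filter functions listed above can be combined as in the following sketch, which assumes the msdata and RSQLite packages are installed. The example file contains only MS1 spectra without precursor m/z values, so the precursor m/z filters are not shown.

```r
library(Spectra)
library(MsBackendSql)
library(RSQLite)

fl <- system.file("microtofq", "MM8.mzML", package = "msdata")
con <- dbConnect(SQLite(), tempfile())
createMsBackendSqlDatabase(con, fl)
be <- backendInitialize(MsBackendSql(), con)

## Keep spectra with a retention time between 10 and 20 seconds
be_rt <- filterRt(be, rt = c(10, 20))
rt_kept <- rtime(be_rt)

## Keep only MS1 spectra
be_ms1 <- filterMsLevel(be, msLevel = 1L)

## Keep spectra from selected data origins; the value is taken from the
## object itself because the stored path can differ from `fl`
origin <- unique(be$dataOrigin)
be_do <- filterDataOrigin(be, dataOrigin = origin)
```

Filtering only changes the primary keys kept in the returned object; the data in the database is left untouched.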

Accessing and modifying data

The functions listed here are specifically implemented for MsBackendSql. In addition, MsBackendSql inherits and supports all data accessor, filtering functions and data manipulation functions from Spectra::MsBackendCached().

$, $<-: access or set (add) spectra variables in object. Spectra variables added or modified using the $<- are cached locally within the object (data in the database is never changed). To restore an object (i.e. drop all cached values) the reset function can be used.
dataStorage(): returns a character vector of the same length as the number of spectra in object with the name of the database containing the data.
intensity<-: not supported.
longForm(): extract the MS data in long form as a data.frame. Parameter columns allows to specify the columns (spectra and/or peaks variables) that should be included in the result. If MS peaks data are stored in long form in the database (i.e., peaksStorageMode = "long" is used), the data is extracted using a dedicated SQL query. Otherwise the default implementation of the longForm() method from the Spectra package that is based on the spectraData() function is used instead. Note that the performance of the SQL-based longForm() function is not necessarily higher than the default implementation, mostly because data extraction from the database layouts that store the MS peaks data as BLOB datatype (i.e., peaksStorageMode = "blob" or peaksStorageMode = "blob2") is faster.
mz<-: not supported.
peaksData(): returns a list with the spectra's peak data. The length of the list is equal to the number of spectra in object. Each element of the list is a matrix with columns according to parameter columns. For an empty spectrum, a matrix with 0 rows is returned. Use peaksVariables(object) to list supported values for parameter columns.
peaksVariables(): returns a character with the available peak variables, i.e. columns that could be queried with peaksData().
reset(): restores an MsBackendSql by re-initializing it with the data from the database. Any subsetting or cached spectra variables will be lost.
spectraData(): gets general spectrum metadata. spectraData() returns a DataFrame with the same number of rows as there are spectra in object. Parameter columns allows to select specific spectra variables.
spectraNames(), spectraNames<-: returns a character of length equal to the number of spectra in object with the primary keys of the spectra from the database (converted to character). Replacing spectra names with spectraNames<- is not supported.
uniqueMsLevels(): returns the unique MS levels of all spectra in object.
tic(): returns the originally reported total ion count (for initial = TRUE) or calculates the total ion count from the intensities of each spectrum (for initial = FALSE).
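
The accessors above can be tried out with the following sketch, which assumes the msdata and RSQLite packages are installed. The manual sum of intensities is only used to illustrate what tic() with initial = FALSE is described to return.

```r
library(Spectra)
library(MsBackendSql)
library(RSQLite)

fl <- system.file("microtofq", "MM8.mzML", package = "msdata")
con <- dbConnect(SQLite(), tempfile())
createMsBackendSqlDatabase(con, fl)
be <- backendInitialize(MsBackendSql(), con)

## Peaks data: one matrix per spectrum
peaksVariables(be)
pd <- peaksData(be)
head(pd[[1]])

## Selected spectra variables only
spd <- spectraData(be, columns = c("msLevel", "rtime"))

## Total ion count calculated from the intensities of each spectrum
tic_calc <- tic(be, initial = FALSE)
tic_manual <- vapply(pd, function(z) sum(z[, "intensity"], na.rm = TRUE),
                     numeric(1))

## Cached variables live in the object only and are dropped by reset()
be$new_variable <- "A"
has_new <- "new_variable" %in% spectraVariables(be)
be_reset <- reset(be)
has_new_after_reset <- "new_variable" %in% spectraVariables(be_reset)
```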

Implementation notes

Internally, the MsBackendSql class contains only the primary keys for all spectra stored in the SQL database. Keeping only these integers in memory guarantees a minimal memory footprint of the object. Still, depending on the number of spectra in the database, this integer vector might become very large. Any data access will involve SQL calls to retrieve the data from the database. By extending the Spectra::MsBackendCached() object from the Spectra package, the MsBackendSql supports adding or modifying spectra variables temporarily (i.e. for the duration of the R session). These are however stored in a data.frame within the object, thus increasing the memory demand of the object.

Author

Johannes Rainer

Examples


####
## Create a new MsBackendSql database

## Define a file from which to import the data
data_file <- system.file("microtofq", "MM8.mzML", package = "msdata")

## Create a database/connection to a database
library(RSQLite)
db_file <- tempfile()
dbc <- dbConnect(SQLite(), db_file)

## Import the data from the file into the database
createMsBackendSqlDatabase(dbc, data_file)
#> Importing data ... 
#> 

#> [==========================================================] 1/1 (100%) in  1s
#> 
#> Creating indices 
#> .
#> .
#> .
#> .
#>  Done
#> [1] TRUE
dbDisconnect(dbc)

## Initialize a MsBackendSql
dbc <- dbConnect(SQLite(), db_file)
be <- backendInitialize(MsBackendSql(), dbc)

be
#> MsBackendSql with 198 spectra
#>       msLevel precursorMz  polarity
#>     <integer>   <numeric> <integer>
#> 1           1          NA         1
#> 2           1          NA         1
#> 3           1          NA         1
#> 4           1          NA         1
#> 5           1          NA         1
#> ...       ...         ...       ...
#> 194         1          NA         1
#> 195         1          NA         1
#> 196         1          NA         1
#> 197         1          NA         1
#> 198         1          NA         1
#>  ... 35 more variables/columns.
#>  Use  'spectraVariables' to list all of them.
#> Database: /tmp/RtmpWGx4jo/file1d1237c9bab7

## Original data source
head(be$dataOrigin)
#> [1] "/__w/_temp/Library/msdata/microtofq/MM8.mzML"
#> [2] "/__w/_temp/Library/msdata/microtofq/MM8.mzML"
#> [3] "/__w/_temp/Library/msdata/microtofq/MM8.mzML"
#> [4] "/__w/_temp/Library/msdata/microtofq/MM8.mzML"
#> [5] "/__w/_temp/Library/msdata/microtofq/MM8.mzML"
#> [6] "/__w/_temp/Library/msdata/microtofq/MM8.mzML"

## Data storage
head(dataStorage(be))
#> [1] "/tmp/RtmpWGx4jo/file1d1237c9bab7" "/tmp/RtmpWGx4jo/file1d1237c9bab7"
#> [3] "/tmp/RtmpWGx4jo/file1d1237c9bab7" "/tmp/RtmpWGx4jo/file1d1237c9bab7"
#> [5] "/tmp/RtmpWGx4jo/file1d1237c9bab7" "/tmp/RtmpWGx4jo/file1d1237c9bab7"

## Access all spectra data
spd <- spectraData(be)
spd
#> DataFrame with 198 rows and 38 columns
#>       msLevel     rtime acquisitionNum scanIndex                             mz
#>     <integer> <numeric>      <integer> <integer>                  <NumericList>
#> 1           1     0.486              1         1    104.554,106.996,107.966,...
#> 2           1     0.822              2         2    107.960,109.964,112.027,...
#> 3           1     1.159              3         3    107.963,112.021,113.035,...
#> 4           1     1.495              4         4    105.971,109.984,112.028,...
#> 5           1     1.832              5         5    104.474,107.000,109.968,...
#> ...       ...       ...            ...       ...                            ...
#> 194         1   65.4360            194       194    106.992,111.459,112.022,...
#> 195         1   65.7720            195       195  99.0797,109.0031,112.0232,...
#> 196         1   66.1092            196       196    102.927,108.970,112.028,...
#> 197         1   66.4458            197       197    111.058,112.026,113.029,...
#> 198         1   66.7818            198       198    111.044,112.024,112.518,...
#>           intensity            dataStorage             dataOrigin centroided
#>       <NumericList>            <character>            <character>  <logical>
#> 1      23,35,50,... /tmp/RtmpWGx4jo/file.. /__w/_temp/Library/m..       TRUE
#> 2    35, 24,140,... /tmp/RtmpWGx4jo/file.. /__w/_temp/Library/m..       TRUE
#> 3    28,179,350,... /tmp/RtmpWGx4jo/file.. /__w/_temp/Library/m..       TRUE
#> 4    25, 28,131,... /tmp/RtmpWGx4jo/file.. /__w/_temp/Library/m..       TRUE
#> 5      26,30,25,... /tmp/RtmpWGx4jo/file.. /__w/_temp/Library/m..       TRUE
#> ...             ...                    ...                    ...        ...
#> 194    34,24,86,... /tmp/RtmpWGx4jo/file.. /__w/_temp/Library/m..       TRUE
#> 195  24, 22,198,... /tmp/RtmpWGx4jo/file.. /__w/_temp/Library/m..       TRUE
#> 196    29,22,98,... /tmp/RtmpWGx4jo/file.. /__w/_temp/Library/m..       TRUE
#> 197  31,124,189,... /tmp/RtmpWGx4jo/file.. /__w/_temp/Library/m..       TRUE
#> 198    67,84,22,... /tmp/RtmpWGx4jo/file.. /__w/_temp/Library/m..       TRUE
#>      smoothed  polarity precScanNum precursorMz precursorIntensity
#>     <logical> <integer>   <integer>   <numeric>          <numeric>
#> 1          NA         1          NA          NA                 NA
#> 2          NA         1          NA          NA                 NA
#> 3          NA         1          NA          NA                 NA
#> 4          NA         1          NA          NA                 NA
#> 5          NA         1          NA          NA                 NA
#> ...       ...       ...         ...         ...                ...
#> 194        NA         1          NA          NA                 NA
#> 195        NA         1          NA          NA                 NA
#> 196        NA         1          NA          NA                 NA
#> 197        NA         1          NA          NA                 NA
#> 198        NA         1          NA          NA                 NA
#>     precursorCharge collisionEnergy isolationWindowLowerMz
#>           <integer>       <numeric>              <numeric>
#> 1                NA              NA                     NA
#> 2                NA              NA                     NA
#> 3                NA              NA                     NA
#> 4                NA              NA                     NA
#> 5                NA              NA                     NA
#> ...             ...             ...                    ...
#> 194              NA              NA                     NA
#> 195              NA              NA                     NA
#> 196              NA              NA                     NA
#> 197              NA              NA                     NA
#> 198              NA              NA                     NA
#>     isolationWindowTargetMz isolationWindowUpperMz peaksCount totIonCurrent
#>                   <numeric>              <numeric>  <integer>     <numeric>
#> 1                        NA                     NA       1743         97322
#> 2                        NA                     NA       1708         98590
#> 3                        NA                     NA       1708         96425
#> 4                        NA                     NA       1747         97144
#> 5                        NA                     NA       1730         94631
#> ...                     ...                    ...        ...           ...
#> 194                      NA                     NA       2267        142472
#> 195                      NA                     NA       2488        167102
#> 196                      NA                     NA       2324        137565
#> 197                      NA                     NA       2105        113086
#> 198                      NA                     NA       1996         97257
#>     basePeakMZ basePeakIntensity electronBeamEnergy ionisationEnergy     lowMZ
#>      <numeric>         <numeric>          <numeric>        <numeric> <numeric>
#> 1      144.050              7250                 NA                0   104.554
#> 2      144.050              7064                 NA                0   107.960
#> 3      144.050              7124                 NA                0   107.963
#> 4      144.051              7067                 NA                0   105.971
#> 5      144.051              6891                 NA                0   104.474
#> ...        ...               ...                ...              ...       ...
#> 194    144.051              4615                 NA                0  106.9921
#> 195    144.051              3812                 NA                0   99.0797
#> 196    144.050              3526                 NA                0  102.9272
#> 197    144.050              3902                 NA                0  111.0582
#> 198    144.051              3587                 NA                0  111.0437
#>        highMZ mergedScan mergedResultScanNum mergedResultStartScanNum
#>     <numeric>  <integer>           <integer>                <integer>
#> 1    1004.470         NA                  NA                       NA
#> 2    1002.799         NA                  NA                       NA
#> 3     993.959         NA                  NA                       NA
#> 4     986.756         NA                  NA                       NA
#> 5    1003.469         NA                  NA                       NA
#> ...       ...        ...                 ...                      ...
#> 194  1003.828         NA                  NA                       NA
#> 195  1003.270         NA                  NA                       NA
#> 196  1000.594         NA                  NA                       NA
#> 197   984.228         NA                  NA                       NA
#> 198  1000.851         NA                  NA                       NA
#>     mergedResultEndScanNum injectionTime filterString  spectrumId
#>                  <integer>     <numeric>  <character> <character>
#> 1                       NA             0           NA      scan=1
#> 2                       NA             0           NA      scan=2
#> 3                       NA             0           NA      scan=3
#> 4                       NA             0           NA      scan=4
#> 5                       NA             0           NA      scan=5
#> ...                    ...           ...          ...         ...
#> 194                     NA             0           NA    scan=194
#> 195                     NA             0           NA    scan=195
#> 196                     NA             0           NA    scan=196
#> 197                     NA             0           NA    scan=197
#> 198                     NA             0           NA    scan=198
#>     ionMobilityDriftTime scanWindowLowerLimit scanWindowUpperLimit spectrum_id_
#>                <numeric>            <numeric>            <numeric>    <integer>
#> 1                     NA                   NA                   NA            1
#> 2                     NA                   NA                   NA            2
#> 3                     NA                   NA                   NA            3
#> 4                     NA                   NA                   NA            4
#> 5                     NA                   NA                   NA            5
#> ...                  ...                  ...                  ...          ...
#> 194                   NA                   NA                   NA          194
#> 195                   NA                   NA                   NA          195
#> 196                   NA                   NA                   NA          196
#> 197                   NA                   NA                   NA          197
#> 198                   NA                   NA                   NA          198

## Available variables
spectraVariables(be)
#>  [1] "msLevel"                  "rtime"                   
#>  [3] "acquisitionNum"           "scanIndex"               
#>  [5] "mz"                       "intensity"               
#>  [7] "dataStorage"              "dataOrigin"              
#>  [9] "centroided"               "smoothed"                
#> [11] "polarity"                 "precScanNum"             
#> [13] "precursorMz"              "precursorIntensity"      
#> [15] "precursorCharge"          "collisionEnergy"         
#> [17] "isolationWindowLowerMz"   "isolationWindowTargetMz" 
#> [19] "isolationWindowUpperMz"   "peaksCount"              
#> [21] "totIonCurrent"            "basePeakMZ"              
#> [23] "basePeakIntensity"        "electronBeamEnergy"      
#> [25] "ionisationEnergy"         "lowMZ"                   
#> [27] "highMZ"                   "mergedScan"              
#> [29] "mergedResultScanNum"      "mergedResultStartScanNum"
#> [31] "mergedResultEndScanNum"   "injectionTime"           
#> [33] "filterString"             "spectrumId"              
#> [35] "ionMobilityDriftTime"     "scanWindowLowerLimit"    
#> [37] "scanWindowUpperLimit"     "spectrum_id_"            

## Access mz values
mz(be)
#> NumericList of length 198
#> [[1]] 104.553733825684 106.996170043945 ... 1002.67761230469 1004.47033691406
#> [[2]] 107.959617614746 109.964324951172 ... 991.343811035156 1002.79937744141
#> [[3]] 107.962753295898 112.021392822266 ... 964.238586425781 993.959167480469
#> [[4]] 105.970863342285 109.984230041504 ... 979.084838867188 986.755981445312
#> [[5]] 104.473876953125 106.999610900879 ... 1003.185546875 1003.46936035156
#> [[6]] 102.036674499512 107.963928222656 ... 993.781127929688 994.755920410156
#> [[7]] 104.272621154785 112.024925231934 ... 996.248046875 1001.64440917969
#> [[8]] 98.9297180175781 106.506416320801 ... 991.034484863281 997.045837402344
#> [[9]] 107.966445922852 109.385284423828 ... 995.031494140625 995.600463867188
#> [[10]] 108.965042114258 109.974197387695 ... 982.596313476562 989.777404785156
#> ...
#> <188 more elements>

## Subset the object to spectra in arbitrary order
be_sub <- be[c(5, 1, 1, 2, 4, 100)]
be_sub
#> MsBackendSql with 6 spectra
#>     msLevel precursorMz  polarity
#>   <integer>   <numeric> <integer>
#> 1         1          NA         1
#> 2         1          NA         1
#> 3         1          NA         1
#> 4         1          NA         1
#> 5         1          NA         1
#> 6         1          NA         1
#>  ... 35 more variables/columns.
#>  Use  'spectraVariables' to list all of them.
#> Database: /tmp/RtmpWGx4jo/file1d1237c9bab7

## The internal spectrum IDs (primary keys from the database)
be_sub$spectrum_id_
#> [1]   5   1   1   2   4 100

## Add additional spectra variables
be_sub$new_variable <- "B"

## This variable is *cached* locally within the object (not inserted into
## the database)
be_sub$new_variable
#> [1] "B" "B" "B" "B" "B" "B"