Package: MetaboCoreUtils
Authors: Johannes Rainer [aut, cre] (https://orcid.org/0000-0002-6977-7147), Michael Witting [aut] (https://orcid.org/0000-0002-1462-4426), Andrea Vicini [aut], Liesa Salzer [ctb] (https://orcid.org/0000-0003-0761-0656), Sebastian Gibb [ctb] (https://orcid.org/0000-0001-7406-4443), Michael Stravs [ctb] (https://orcid.org/0000-0002-1426-8572), Roger Gine [aut] (https://orcid.org/0000-0003-0288-9619)
Last modified: 2023-04-12 13:32:04.724574
Compiled: Wed Apr 12 13:34:26 2023
Introduction
The MetaboCoreUtils package defines metabolomics-related core functionality provided as low-level functions to allow a data structure-independent usage across various R packages (Rainer et al. 2022). This includes functions to convert between ion (adduct) mass-to-charge ratios and compound masses, as well as functions to work with chemical formulas. The package also provides a set of adduct definitions and information on some commercially available internal standard mixes commonly used in MS experiments.
For a full list of functions, see
library("MetaboCoreUtils")
ls(pos = "package:MetaboCoreUtils")
## [1] "addElements" "adductFormula"
## [3] "adductNames" "adducts"
## [5] "calculateKm" "calculateKmd"
## [7] "calculateMass" "calculateRkmd"
## [9] "containsElements" "convertMtime"
## [11] "correctRindex" "countElements"
## [13] "formula2mz" "indexRtime"
## [15] "internalStandardMixNames" "internalStandards"
## [17] "isotopicSubstitutionMatrix" "isotopologues"
## [19] "isRkmd" "mass2mz"
## [21] "multiplyElements" "mz2mass"
## [23] "pasteElements" "standardizeFormula"
## [25] "subtractElements"
or the reference page on the package webpage.
Installation
The package can be installed with the BiocManager package. To install BiocManager, use
install.packages("BiocManager")
and, after that,
BiocManager::install("MetaboCoreUtils")
to install this package.
Examples
The functions defined in this package utilise basic classes with the aim of being reused in packages that provide a more formal, high-level interface.
The examples below demonstrate the basic usage of the functions from the package.
Conversion between ion m/z and compound masses
The mass2mz and mz2mass functions convert between compound masses and ion (adduct) mass-to-charge ratios (m/z). The MetaboCoreUtils package provides definitions of common ion adducts generated by electrospray ionization (ESI). These can be listed with the adductNames function.
adductNames()
## [1] "[M+3H]3+" "[M+2H+Na]3+" "[M+H+Na2]3+"
## [4] "[M+Na3]3+" "[M+2H]2+" "[M+H+NH4]2+"
## [7] "[M+H+K]2+" "[M+H+Na]2+" "[M+C2H3N+2H]2+"
## [10] "[M+2Na]2+" "[M+C4H6N2+2H]2+" "[M+C6H9N3+2H]2+"
## [13] "[M+H]+" "[M+Li]+" "[M+2Li-H]+"
## [16] "[M+NH4]+" "[M+H2O+H]+" "[M+Na]+"
## [19] "[M+CH4O+H]+" "[M+K]+" "[M+C2H3N+H]+"
## [22] "[M+2Na-H]+" "[M+C3H8O+H]+" "[M+C2H3N+Na]+"
## [25] "[M+2K-H]+" "[M+C2H6OS+H]+" "[M+C4H6N2+H]+"
## [28] "[2M+H]+" "[2M+NH4]+" "[2M+Na]+"
## [31] "[2M+K]+" "[2M+C2H3N+H]+" "[2M+C2H3N+Na]+"
## [34] "[3M+H]+" "[M+H-NH3]+" "[M+H-H2O]+"
## [37] "[M+H-Hexose-H2O]+" "[M+H-H4O2]+" "[M+H-CH2O2]+"
## [40] "[M]+"
With that we can use the mass2mz function to calculate the m/z for a set of compounds assuming the generation of certain ions. In the example below we define masses for some theoretical compounds and calculate their expected m/z assuming that ions "[M+H]+" and "[M+Na]+" are generated.
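The call behind the output that follows is not reproduced in this text. A minimal sketch consistent with it, assuming compound masses of 123, 842 and 324 (values inferred from the printed m/z, not taken from the source), could look like this:

```r
library(MetaboCoreUtils)

## Assumed example masses, back-calculated from the printed m/z values
masses <- c(123, 842, 324)

## One row per compound, one column per adduct
mz <- mass2mz(masses, adduct = c("[M+H]+", "[M+Na]+"))
mz
```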
## [M+H]+ [M+Na]+
## [1,] 124.0073 145.9892
## [2,] 843.0073 864.9892
## [3,] 325.0073 346.9892
As a result we get a matrix with each row representing one compound and each column the m/z for one of the defined adducts. With the mz2mass function we could perform the reverse calculation, i.e. from m/z to compound masses.
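As a sketch of this reverse calculation, the round trip below converts a few masses to [M+H]+ m/z values and back again. The masses are illustrative values chosen for this example, not data from the source.

```r
library(MetaboCoreUtils)

## Illustrative compound masses (assumed values)
masses <- c(123, 842, 324)

## Forward: compound mass -> m/z of the [M+H]+ ion
mz <- mass2mz(masses, adduct = "[M+H]+")

## Reverse: m/z of the [M+H]+ ion -> compound mass
mass_back <- mz2mass(mz[, "[M+H]+"], adduct = "[M+H]+")
mass_back
```

The recovered values should agree with the input masses up to floating point precision.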
In addition, it is possible to calculate m/z values from chemical formulas with the formula2mz function. Below we calculate the m/z values for [M+H]+ and [M+Na]+ adducts from the chemical formulas of glucose and caffeine.
formula2mz(c("C6H12O6", "C8H10N4O2"), adduct = c("[M+H]+", "[M+Na]+"))
## [M+H]+ [M+Na]+
## C6H12O6 181.0707 203.0526
## C8H10N4O2 195.0877 217.0696
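One way to read this result is as a two-step calculation: first the monoisotopic mass of each formula, then the m/z of each adduct. The sketch below assumes that chaining calculateMass and mass2mz gives the same values as formula2mz; it is meant as an illustration, not as a statement about how formula2mz is implemented.

```r
library(MetaboCoreUtils)

formulas <- c("C6H12O6", "C8H10N4O2")
adds <- c("[M+H]+", "[M+Na]+")

## Step 1: monoisotopic masses of the neutral compounds
mass <- calculateMass(formulas)

## Step 2: m/z of the requested adducts
two_step <- mass2mz(mass, adduct = adds)

## Direct calculation from the formulas, for comparison
direct <- formula2mz(formulas, adduct = adds)
all.equal(unname(two_step), unname(direct))
```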
Working with chemical formulas
The lack of consistency in the format in which chemical formulas are written makes it difficult to compare formulas coming from different resources. The MetaboCoreUtils package provides functions to standardize formulas as well as to combine formulas or subtract elements from formulas. Below we use an artificial example to show this functionality. First we standardize a chemical formula with the standardizeFormula function.
frml <- "Na3C4"
frml <- standardizeFormula(frml)
frml
## Na3C4
## "C4Na3"
Next we add "H2O" to the formula using the addElements function.
frml <- addElements(frml, "H2O")
frml
## [1] "C4H2ONa3"
We can also subtract elements with the subtractElements function:
frml <- subtractElements(frml, "H")
frml
## [1] "C4HONa3"
Chemical formulas can also be multiplied by a scalar using the multiplyElements function. The counts for individual elements in a chemical formula can be calculated with the countElements function.
countElements(frml)
## $C4HONa3
## C H O Na
## 4 1 1 3
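The multiplication mentioned above is not demonstrated in the text. A minimal sketch, using water as an arbitrary example formula and passing the multiplier as the second argument:

```r
library(MetaboCoreUtils)

## Three water molecules expressed as a single formula
water3 <- multiplyElements("H2O", 3)
water3

## Element counts of the multiplied formula
countElements(water3)
```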
In addition, the adductFormula function creates chemical formulas of specific adducts of compounds. Below we create chemical formulas for [M+H]+ and [M+Na]+ adducts for glucose and caffeine.
adductFormula(c("C6H12O6", "C8H10N4O2"), adduct = c("[M+H]+", "[M+Na]+"))
## [M+H]+ [M+Na]+
## C6H12O6 "[C6H13O6]+" "[C6H12O6Na]+"
## C8H10N4O2 "[C8H11N4O2]+" "[C8H10N4O2Na]+"
Kendrick mass defect calculation
Lipids and other homologous series based on fatty acyls can be found in data by using Kendrick mass defects (KMD) or referenced Kendrick mass defects (RKMD). The MetaboCoreUtils package provides functions to calculate both. The following example calculates the KMD and RKMD for three lipids (PC(16:0/18:1(9Z)), PC(16:0/18:0), PS(16:0/18:1(9Z))) and checks whether they fit the RKMD of PCs detected as [M+H]+ adducts.
lipid_masses <- c(760.5851, 762.6007, 762.5280)
calculateKmd(lipid_masses)
## [1] 0.7358239 0.7491732 0.6765544
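The values above are consistent with rescaling each mass to the Kendrick (CH2-based) mass scale and keeping the fractional part. A base R sketch of that arithmetic follows; the helper name kmd_sketch and the CH2 mass of 14.01565 are assumptions made for illustration and are not taken from the package source.

```r
## Hypothetical helper, not part of MetaboCoreUtils
kmd_sketch <- function(mass, ch2_mass = 14.01565) {
    km <- mass * 14 / ch2_mass  # Kendrick mass: CH2 is defined as exactly 14
    km - floor(km)              # fractional part of the Kendrick mass
}

lipid_masses <- c(760.5851, 762.6007, 762.5280)
kmd <- kmd_sketch(lipid_masses)
round(kmd, 4)
```

Under these assumptions the sketch agrees with the calculateKmd output shown above to within rounding.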
Next the RKMD is calculated and we check whether it falls within the expected range. RKMDs are either 0 or negative integers according to the number of double bonds in the lipid, e.g. -2 if two double bonds are present.
lipid_rkmd <- calculateRkmd(lipid_masses)
isRkmd(lipid_rkmd)
## [1] TRUE TRUE FALSE
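To make the link between KMD, RKMD and double bonds concrete, the base R sketch below shifts each KMD by a reference KMD and divides by the KMD change caused by removing one H2 unit. The reference value 0.749206 (assumed here for saturated PCs as [M+H]+), the step of 0.013399, the tolerance of 0.1 and all helper names are illustrative assumptions, not values quoted from the package.

```r
## Hypothetical helpers, not part of MetaboCoreUtils
kmd_sketch <- function(mass) {
    km <- mass * 14 / 14.01565  # Kendrick mass on the CH2 scale
    km - floor(km)
}
rkmd_sketch <- function(mass, ref_kmd = 0.749206, h2_step = 0.013399) {
    ## shift by the assumed reference KMD, scale by the KMD of one H2 unit
    (kmd_sketch(mass) - ref_kmd) / h2_step
}
is_rkmd_sketch <- function(rkmd, tol = 0.1) {
    ## accept values close to 0 or to a negative integer
    nearest <- round(rkmd)
    nearest <= 0 & abs(rkmd - nearest) < tol
}

lipid_masses <- c(760.5851, 762.6007, 762.5280)
rk <- rkmd_sketch(lipid_masses)
round(rk, 2)
is_rkmd_sketch(rk)
```

Under these assumptions the first lipid gives a value near -1 (one double bond), the second a value near 0 (no double bond), and the third falls between integers, which matches the TRUE TRUE FALSE result above.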
Retention time indexing
Retention times are often not directly comparable between two LC-MS systems, even if nominally the same separation method is used. Conversion of retention times to retention indices can overcome this issue. The MetaboCoreUtils package provides a function to perform this conversion. Below we use an example based on indexing with a homologous series of N-Alkyl-pyridinium sulfonates (NAPS).
rti <- read.table(system.file("retentionIndex", "rti.txt",
                              package = "MetaboCoreUtils"),
                  header = TRUE, sep = "\t")
rtime <- read.table(system.file("retentionIndex", "metabolites.txt",
                                package = "MetaboCoreUtils"),
                    header = TRUE, sep = "\t")
A data.frame with the retention times of the NAPS and their respective index values is required.
head(rti)
## rtime rindex
## 1 1.14 100
## 2 1.18 200
## 3 1.38 300
## 4 2.11 400
## 5 4.34 500
## 6 5.92 600
The indexing is performed using the indexRtime function.
rtime$rindex_r <- indexRtime(rtime$rtime, rti)
For comparison, the manually calculated retention indices are included.
head(rtime)
## name rtime rindex_manual rindex_r
## 1 VITAMIN D2 NA NA NA
## 2 SQUALENE 15.66 1709.8765 1709.8765
## 3 4-COUMARATE 6.26 629.3103 629.3103
## 4 NONANOATE 11.73 1244.5783 1244.5783
## 5 ESTRADIOL-17ALPHA 10.27 1065.4321 1065.4321
## 6 CAPRYLATE 10.67 1114.8148 1114.8148
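The indexed values are consistent with linear interpolation between neighbouring NAPS reference points. The base R sketch below uses approx with only the six reference rows printed earlier; the query retention times are made-up values, and treating the indexing as plain linear interpolation is an assumption made for illustration.

```r
## First six NAPS reference points as printed by head(rti)
rti_head <- data.frame(rtime  = c(1.14, 1.18, 1.38, 2.11, 4.34, 5.92),
                       rindex = c(100, 200, 300, 400, 500, 600))

## Made-up query retention times
query <- c(1.28, 3.00, 7.50)

## Linear interpolation; queries outside the reference range yield NA
ri <- approx(x = rti_head$rtime, y = rti_head$rindex, xout = query)$y
ri
```

A query halfway between two reference points receives an index halfway between their index values, and a retention time beyond the last reference point cannot be indexed with this truncated table.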
The conditions under which retention indices are compared might not match perfectly. If the deviation is linear, a simple two-point correction can be applied to the data with the correctRindex function. The correction requires two reference standards with their measured and reference RIs.
ref <- data.frame(rindex = c(1709.8765, 553.7975),
refindex = c(1700, 550))
rtime$rindex_cor <- correctRindex(rtime$rindex_r, ref)
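A two-point correction of this kind can be written as a straight line through the two standards. The base R sketch below illustrates that idea; two_point_correct is a hypothetical helper and is not claimed to mirror the internals of correctRindex.

```r
## Hypothetical helper, not part of MetaboCoreUtils
two_point_correct <- function(ri, measured, reference) {
    ## slope of the line mapping measured RIs onto reference RIs
    slope <- (reference[2] - reference[1]) / (measured[2] - measured[1])
    reference[1] + (ri - measured[1]) * slope
}

measured  <- c(1709.8765, 553.7975)  # measured RIs of the two standards
reference <- c(1700, 550)            # their reference RIs

## The two standards plus one made-up intermediate value
corrected <- two_point_correct(c(1709.8765, 553.7975, 1000), measured, reference)
corrected
```

By construction the two standards are mapped exactly onto their reference RIs, and every other value is shifted and scaled along the same line.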
Contributions
If you would like to contribute any low-level functionality, please open a GitHub issue to discuss it. Please note that any contributions should follow the style guide and will require an appropriate unit test.
If you wish to reuse any functions in this package, please just go ahead. If you would like advice or help, please open a GitHub issue.
Session information
## R version 4.3.0 beta (2023-04-06 r84184)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.2 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] MetaboCoreUtils_1.7.0 BiocStyle_2.27.1
##
## loaded via a namespace (and not attached):
## [1] jsonlite_1.8.4 compiler_4.3.0 BiocManager_1.30.20
## [4] stringr_1.5.0 cluster_2.1.4 jquerylib_0.1.4
## [7] systemfonts_1.0.4 textshaping_0.3.6 yaml_2.3.7
## [10] fastmap_1.1.1 R6_2.5.1 knitr_1.42
## [13] BiocGenerics_0.45.3 MASS_7.3-58.4 bookdown_0.33
## [16] desc_1.4.2 rprojroot_2.0.3 bslib_0.4.2
## [19] rlang_1.1.0 cachem_1.0.7 stringi_1.7.12
## [22] xfun_0.38 fs_1.6.1 MsCoreUtils_1.11.5
## [25] sass_0.4.5 memoise_2.0.1 cli_3.6.1
## [28] pkgdown_2.0.7.9000 magrittr_2.0.3 digest_0.6.31
## [31] lifecycle_1.0.3 clue_0.3-64 S4Vectors_0.37.5
## [34] vctrs_0.6.1 evaluate_0.20 glue_1.6.2
## [37] ragg_1.2.5 stats4_4.3.0 rmarkdown_2.21
## [40] purrr_1.0.1 tools_4.3.0 htmltools_0.5.5