
Advanced Feature Annotation using RuSirius
Metabonaut Developers
Source: vignettes/advanced_feature_annotation_with_RuSirius.Rmd
Note: this vignette is pre-computed. See the session info for information on packages used and the date the vignette was rendered. The vignette requires a running Sirius instance and a Sirius account for structure database searches. It was pre-rendered using the authors’ Sirius account; to reproduce this analysis, you will need to log in with your own credentials (see the Prerequisites section below).
Introduction
In the main end-to-end LC-MS/MS untargeted metabolomics workflow, we successfully preprocessed our data, performed statistical analysis, and identified features with significant differences in abundance. However, matching the MS1 and MS2 spectra directly against the MassBank database only confidently annotated a single feature (caffeine).
As noted previously, this low proportion of annotated signals is common, and external software tools such as SIRIUS can be excellent alternatives for structure elucidation. SIRIUS uses sophisticated algorithms to predict molecular formulas and structures de novo from isotope patterns and fragmentation trees.
This vignette will demonstrate how to seamlessly continue your
metabonaut analysis by passing your unannotated significant
features into Sirius using the RuSirius R package.
Prerequisites
To run this workflow interactively, you must have:
- Sirius 6.3 installed and running on your system.
- The RuSirius package installed and loaded.
- A Sirius account (free registration at bright-giant.com), required for structure database searches.
Loading Previous Data
In the end-to-end workflow, we isolated the MS2 spectra for all
significant features, concatenated them into a single
Spectra object, and saved them to disk. We will load this
exact object to begin our advanced annotation.
#' Load the Spectra object containing MS2 scans of significant features
load(system.file("extdata", "spectra_significant_fts.RData",
                 package = "Metabonaut"))
#' Verify the loaded object
ms2_ctr_fts
#> MSn data (Spectra) with 315 spectra in a MsBackendMemory backend:
#> msLevel rtime scanIndex
#> <integer> <numeric> <integer>
#> 1 2 147.357 2043
#> 2 2 148.587 2061
#> 3 2 149.817 2079
#> 4 2 152.297 2115
#> 5 2 147.376 2041
#> ... ... ... ...
#> 311 2 178.082 2481
#> 312 2 179.322 2499
#> 313 2 180.572 2517
#> 314 2 181.822 2535
#> 315 2 183.072 2553
#> ... 39 more variables/columns.
#> Processing:
#> Filter: select retention time [10..240] on MS level(s) 1 2 [Tue Mar 18 11:56:42 2025]
#> Filter: select MS level(s) 2 [Tue Mar 18 11:56:50 2025]
#> Remove peaks based on their intensities and a user-provided function in spectra of MS level(s) 2. [Tue Mar 18 11:56:50 2025]
#> ...19 more processings. Use 'processingLog' to list all.
Each spectrum in this object already contains a
feature_id variable that links it back to our main
XcmsExperiment results. Note that multiple MS2 spectra can
be available per feature. We deliberately send all of them to SIRIUS,
which will merge and use them to improve annotation quality.
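As a quick sanity check, the number of MS2 spectra linked to each feature can be tabulated from the feature_id variable. The sketch below uses a small hypothetical character vector in place of ms2_ctr_fts$feature_id so that it runs without the data; the counts are invented for illustration only.

```r
#' Hypothetical stand-in for ms2_ctr_fts$feature_id (illustrative values only)
feature_id <- c("FT0371", "FT0371", "FT0565", "FT0845", "FT0845", "FT0845")

#' Count how many MS2 spectra are linked to each feature
spectra_per_feature <- table(feature_id)
spectra_per_feature

#' Features backed by more than one MS2 spectrum
multi_ms2 <- names(spectra_per_feature)[spectra_per_feature > 1]
multi_ms2
```

With the loaded object, the equivalent call would be table(ms2_ctr_fts$feature_id).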
Connecting to Sirius and Importing Data
We start by establishing a connection to the Sirius application using
the Sirius() function. This function either connects to
a running Sirius instance or starts a new one and connects to
it. The port parameter used below configures the
Sirius application to use a particular port, but the function
can generally be called without specifying one.
We will create a new project dedicated to annotating these specific features.
#' Initialize the Sirius connection and create a new project
srs <- Sirius(projectId = "metabonaut_significant_features",
              path = getwd(), port = 9999)
#> Found SIRIUS in PATH! Using this information to start the application.
#> SIRIUS was started without specifying --port (-p), trying to find the sirius.port file.
#' Check connection status
checkConnection(srs)
#> [1] TRUE
If you have not already logged in during the Sirius()
call (by passing username and password
arguments), you must log in before running any structure database
searches:
#' Log in to your Sirius account
srs <- logIn(srs,
             username = "your_email@example.com",
             password = "your_password")
Next, we import our Spectra object into the Sirius
project. We set the ms_column_name parameter to
"feature_id" so that Sirius uses our existing feature
names.
#' Import the MS2 spectra into Sirius
srs <- import(
    sirius = srs,
    spectra = ms2_ctr_fts,
    ms_column_name = "feature_id",
    deleteExistingFeatures = TRUE
)
#' View summary of imported features
head(featuresInfo(srs))
#> alignedFeatureId compoundId externalFeatureId
#> [1,] "818899412191263562" "818899412153514821" "FT0371"
#> [2,] "818899412258372427" "818899412153514822" "FT0565"
#> [3,] "818899412350647116" "818899412153514823" "FT0732"
#> [4,] "818899412380007245" "818899412153514824" "FT0845"
#> [5,] "818899412400978766" "818899412153514825" "FT1171"
#> ionMass charge detectedAdducts hasMs1 hasMsMs computing
#> [1,] 138.0548 1 list,1 FALSE TRUE FALSE
#> [2,] 161.0401 1 list,1 FALSE TRUE FALSE
#> [3,] 182.0748 1 list,1 FALSE TRUE FALSE
#> [4,] 195.0877 1 list,1 FALSE TRUE FALSE
#> [5,] 229.1299 1 list,1 FALSE TRUE FALSE
Running Sirius Computations
With the data imported, we can submit a job to Sirius. We will ask Sirius to:
- Identify candidate molecular formulas (formulaIdParams).
- Predict the compound class (predictParams).
- Search structure databases for candidate structures (structureDbSearchParams).
Because our samples are human serum/plasma analyzed in positive
polarity, we expect protonated ions ([M+H]+), sodium
adducts ([M+Na]+), and protonated ions that have lost ammonia
([M+H-NH3]+). We supply these as our fallback adducts.
#' Submit the annotation job
job_id <- run(
    srs,
    fallbackAdducts = c("[M+H]+", "[M+Na]+", "[M+H-NH3]+"),
    formulaIdParams = formulaIdParam(
        numberOfCandidates = 5,
        instrument = "QTOF",
        massAccuracyMS2ppm = 10
    ),
    predictParams = predictParam(),
    structureDbSearchParams = structureDbSearchParam(
        structureSearchDbs = c("BIO")
    ),
    recompute = TRUE,
    wait = TRUE
)
#' Optional: Print job info if you want to verify successful completion
jobInfo(srs, job_id)
#> [1] "Job ID: 1\n\nCommand: \n--IsotopeSettings.filter=true\n--InjectSpectralLibraryMatchFormulas.minPeakMatchesToInject=6\n--FormulaSettings.enforced=HCNOP\n--InjectSpectralLibraryMatchFormulas.injectFormulas=true\n--TagStructuresByElGordo=true\n--AdductSettings.detectable=[M+H3N+H]+,[M-H4O2+H]+,[M-H2O-H]-,[M-H3N-H]-,[M+Cl]-,[2M+K]+,[M+K]+,[2M+Cl]-,[M+C2H4O2-H]-,[M+H]+,[2M+H]+,[M-CH3-H]-,[M-H]-,[M+Na]+,[M-H2O+H]+\n--RecomputeResults=true\n--UseHeuristic.useHeuristicAboveMz=300\n--IsotopeMs2Settings=IGNORE\n--MS2MassDeviation.allowedMassDeviation=10.0ppm\n--FormulaSearchSettings.applyFormulaConstraintsToDatabaseCandidates=false\n--EnforceElGordoFormula=true\n--NumberOfCandidatesPerIonization=1\n--AdductSettings.fallback=[M+H]+,[M+Na]+,[M+H-NH3]+\n--FormulaSearchSettings.performBottomUpAboveMz=0.0\n--FormulaSettings.fallback=S\n--FormulaSearchSettings.applyFormulaConstraintsToBottomUp=false\n--UseHeuristic.useOnlyHeuristicAboveMz=650\n--ExpansiveSearchConfidenceMode.confidenceScoreSimilarityMode=APPROXIMATE\n--InjectSpectralLibraryMatchFormulas.minScoreToInject=0.7\n--FormulaSearchDB=\n--FormulaResultThreshold=true\n--InjectSpectralLibraryMatchFormulas.alwaysPredict=false\n--FormulaSettings.detectable=B,S,Cl,Se,Br\n--NumberOfCandidates=5\nformulas\nfingerprints\nclasses\nstructures\n\nProgress:\n State: DONE\n Current Progress: 2650\n Max Progress: 2650\n\nAffected Compound IDs:\n 818899412153514825, 818899412153514824, 818899412153514823, 818899412153514822, 818899412153514821\n\nAffected Aligned Feature IDs:\n818899412400978766\n818899412380007245\n818899412350647116\n818899412258372427\n818899412191263562\n"
Tip: If you want to explore the data visually while or after it
processes, you can use openGUI(srs) to launch the Sirius
graphical interface!
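To make the three fallback adducts used in the job above more concrete, the following sketch computes the expected m/z of each ion type for caffeine (C8H10N4O2), the compound annotated in the main workflow. It uses base R only; the monoisotopic masses are standard reference values entered by hand, and the variable names are ours, not part of RuSirius. The [M+H]+ value should agree with the ionMass of 195.0877 listed for FT0845 in the imported features.

```r
#' Monoisotopic masses (Da) of the elements and particles involved
mass <- c(C = 12, H = 1.00782503, N = 14.00307401, O = 15.99491462,
          Na = 22.98976928, electron = 0.00054858)

#' Neutral monoisotopic mass of caffeine, C8H10N4O2
neutral <- 8 * mass[["C"]] + 10 * mass[["H"]] +
    4 * mass[["N"]] + 2 * mass[["O"]]

#' Mass shift of each ion type relative to the neutral molecule
proton <- mass[["H"]] - mass[["electron"]]
ammonia <- mass[["N"]] + 3 * mass[["H"]]
adduct_shift <- c(
    "[M+H]+" = proton,
    "[M+Na]+" = mass[["Na"]] - mass[["electron"]],
    "[M+H-NH3]+" = proton - ammonia
)

#' Expected m/z for singly charged ions
mz <- neutral + adduct_shift
round(mz, 4)
```

Note that [M+H-NH3]+ lies below the protonated ion because it describes a loss of ammonia, whereas an ammonium adduct would lie above it.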
Retrieving and Interpreting Results
Once the computation is complete, we can extract the structural and formula predictions back into R for downstream analysis.
High-Level Summary
The summary() function provides a compact overview of
the top formulas, structures, and compound classes predicted for each
feature. This includes confidence scores that indicate how reliable each
annotation is.
summary_results <- summary(sirius = srs, result.type = "structure")
summary_results
#> alignedFeatureId compoundId externalFeatureId
#> 1 818899412191263562 818899412153514821 FT0371
#> 2 818899412258372427 818899412153514822 FT0565
#> 3 818899412350647116 818899412153514823 FT0732
#> 4 818899412380007245 818899412153514824 FT0845
#> 5 818899412400978766 818899412153514825 FT1171
#> ionMass charge hasMs1 hasMsMs formulaId
#> 1 138.0548 1 FALSE TRUE 818899436262374254
#> 2 161.0401 1 FALSE TRUE 818899436237208404
#> 3 182.0748 1 FALSE TRUE 818899437071874932
#> 4 195.0877 1 FALSE TRUE 818899436253985629
#> 5 229.1299 1 FALSE TRUE 818899436258179943
#> molecularFormula adduct rank siriusScoreNormalized
#> 1 C7H7NO2 [M + H]+ 3 0.04957077
#> 2 C4H5BFNO4 [M + H]+ 1 0.50000000
#> 3 C5H9F2N3O2 [M + H]+ 1 0.49968135
#> 4 C8H10N4O2 [M + H]+ 1 0.48622197
#> 5 C12H18N2O [M + Na]+ 2 0.08134845
#> siriusScore isotopeScore treeScore inchiKey
#> 1 20.13407 0 20.13407 VOCKNCWQVHJMAE
#> 2 69.24605 0 69.24605 <NA>
#> 3 112.21981 0 112.21981 GWRLHZIGYXCDKL
#> 4 68.84529 0 68.84529 RYYVLZVUVIJVGH
#> 5 5.47353 0 5.47353 HBIDZSUDZACENV
#> smiles
#> 1 COC1=NC=CC(=C1)C=O
#> 2 <NA>
#> 3 C(CF)NC(=O)N(CCF)N=O
#> 4 CN1C=NC2=C1C(=O)N(C(=O)N2C)C
#> 5 CC(=CCCC(=CCC(=O)C=[N+]=[N-])C)C
#> structureName xlogP
#> 1 2-Methoxyisonicotinaldehyde 0.9654924
#> 2 <NA> NA
#> 3 Bis(fluoroethyl)nitrosourea 0.8681960
#> 4 Thein -0.1085821
#> 5 (4E)-1-Diazo-5,9-dimethyl-4,8-decadiene-2-one 3.4000000
#> rank.1 csiScore tanimotoSimilarity mcesDistToTopHit type
#> 1 1 -75.15435 0.2592593 0 NPC
#> 2 NA NA NA NA NPC
#> 3 1 -152.63122 0.3125000 0 NPC
#> 4 1 -12.05606 0.9743590 0 NPC
#> 5 1 -125.16005 0.3018868 0 NPC
#> level levelIndex name
#> 1 PATHWAY 0 Alkaloids
#> 2 PATHWAY 0 Amino acids and Peptides
#> 3 PATHWAY 0 Alkaloids
#> 4 PATHWAY 0 Alkaloids
#> 5 PATHWAY 0 Terpenoids
#> description id probability index type.1
#> 1 Pathway: Alkaloids 0 0.5230578 0 NPC
#> 2 Pathway: Amino acids and Peptides 1 0.8786789 1 NPC
#> 3 Pathway: Alkaloids 0 0.1652622 0 NPC
#> 4 Pathway: Alkaloids 0 0.9989780 0 NPC
#> 5 Pathway: Terpenoids 6 0.5267499 6 NPC
#> level.1 levelIndex.1 name.1
#> 1 SUPERCLASS 1 Nicotinic acid alkaloids
#> 2 SUPERCLASS 1 Small peptides
#> 3 SUPERCLASS 1 Small peptides
#> 4 SUPERCLASS 1 Pseudoalkaloids (transamidation)
#> 5 SUPERCLASS 1 Sesquiterpenoids
#> description.1 id.1
#> 1 Superclass: Nicotinic acid alkaloids 44
#> 2 Superclass: Small peptides 63
#> 3 Superclass: Small peptides 63
#> 4 Superclass: Pseudoalkaloids (transamidation) 59
#> 5 Superclass: Sesquiterpenoids 61
#> probability.1 index.1 type.2 level.2 levelIndex.2
#> 1 0.6076760 44 NPC CLASS 2
#> 2 0.8957072 63 NPC CLASS 2
#> 3 0.1016899 63 NPC CLASS 2
#> 4 0.9999007 59 NPC CLASS 2
#> 5 0.1907384 61 NPC CLASS 2
#> name.2 description.2 id.2
#> 1 Pyridine alkaloids Class: Pyridine alkaloids 602
#> 2 Aminoacids Class: Aminoacids 109
#> 3 Imidazole alkaloids Class: Imidazole alkaloids 399
#> 4 Purine alkaloids Class: Purine alkaloids 597
#> 5 Terpenoid alkaloids Class: Terpenoid alkaloids 682
#> probability.2 index.2 confidenceExactMatch
#> 1 0.60770983 602 0.06000814
#> 2 0.93744338 109 NA
#> 3 0.08924162 399 0.02842510
#> 4 0.99996340 597 0.67976562
#> 5 0.15184069 682 0.03929589
#> confidenceApproxMatch expansiveSearchState computing
#> 1 0.06000814 OFF FALSE
#> 2 NA <NA> FALSE
#> 3 0.02842510 OFF FALSE
#> 4 0.89135261 OFF FALSE
#> 5 0.03929589 APPROXIMATE FALSE
The summary() output contains the top-ranked molecular
formula, predicted structure (with SMILES, InChIKey, and CSI:FingerID
score), compound class predictions (NPC pathway, superclass, class), and,
critically, the confidence scores (confidenceExactMatch
and confidenceApproxMatch). These confidence scores are the
most important indicator for evaluating the reliability of the
prediction.
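Because the full summary table is wide, it can help to look at a handful of columns ordered by confidence. The sketch below rebuilds a small data frame from values printed in the output above (so that it runs without a Sirius session) and sorts it by confidenceApproxMatch; the object names summary_toy and compact are ours.

```r
#' Toy subset of summary_results (values copied from the output above)
summary_toy <- data.frame(
    externalFeatureId = c("FT0371", "FT0565", "FT0732", "FT0845", "FT1171"),
    molecularFormula = c("C7H7NO2", "C4H5BFNO4", "C5H9F2N3O2",
                         "C8H10N4O2", "C12H18N2O"),
    structureName = c("2-Methoxyisonicotinaldehyde", NA,
                      "Bis(fluoroethyl)nitrosourea", "Thein",
                      "(4E)-1-Diazo-5,9-dimethyl-4,8-decadiene-2-one"),
    confidenceExactMatch = c(0.06000814, NA, 0.02842510,
                             0.67976562, 0.03929589),
    confidenceApproxMatch = c(0.06000814, NA, 0.02842510,
                              0.89135261, 0.03929589)
)

#' Order by approximate-match confidence, highest first; features
#' without a structure hit (NA) are placed last
ord <- order(summary_toy$confidenceApproxMatch,
             decreasing = TRUE, na.last = TRUE)
compact <- summary_toy[ord, ]
compact
```

On the real result, the same column selection and ordering could be applied directly to summary_results.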
You can seamlessly merge these results back into your
rowData(res) from the main workflow using the feature
identifiers to complement your differential abundance statistics with
structural predictions.
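A minimal sketch of such a merge is shown below, using base R merge() on two toy data frames. It assumes the feature identifiers are available as a column named feature_id (in practice they may be the row names of rowData(res)); the adj_pvalue column, its values, and the feature FT9999 are invented for illustration, while the annotation values are copied from the summary output.

```r
#' Hypothetical stand-in for as.data.frame(rowData(res)); the statistics
#' column and its values are invented for illustration
feature_stats <- data.frame(
    feature_id = c("FT0371", "FT0565", "FT0845", "FT9999"),
    adj_pvalue = c(0.01, 0.03, 0.002, 0.2)
)

#' Toy subset of the SIRIUS summary (values copied from the summary output)
sirius_annot <- data.frame(
    externalFeatureId = c("FT0371", "FT0565", "FT0845"),
    molecularFormula = c("C7H7NO2", "C4H5BFNO4", "C8H10N4O2"),
    confidenceApproxMatch = c(0.06000814, NA, 0.89135261)
)

#' Left join: keep every feature and add annotations where available
annotated <- merge(feature_stats, sirius_annot,
                   by.x = "feature_id", by.y = "externalFeatureId",
                   all.x = TRUE)
annotated
```

Using all.x = TRUE keeps features without a SIRIUS result in the table, with NA in the annotation columns, so the statistics are never dropped by the join.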
For more detailed results (e.g., multiple structure candidates per
formula, compound class predictions, fragmentation trees), see the
results() function documented in ?results.
De Novo Structure Annotation for Low-Confidence Features
De novo structure annotation using MSNovelist is particularly useful for features where the database search did not yield high-confidence results. MSNovelist generates molecular structures directly from MS/MS data without relying on any database, making it valuable for novel or poorly characterized compounds.
We identify features with low confidence (below 0.5) or no structure match and run de novo annotation on those.
#' Identify features with low confidence or no structure match
fts_denovo <- summary_results$alignedFeatureId[which(
    is.na(summary_results$confidenceApproxMatch) |
        summary_results$confidenceApproxMatch < 0.5)]
fts_denovo
#> [1] "818899412191263562" "818899412258372427"
#> [3] "818899412350647116" "818899412400978766"
We submit a de novo annotation job for these features. It is recommended to also use ZODIAC re-ranking when running MSNovelist.
#' Run de novo structure annotation for low-confidence features
job_id_denovo <- tryCatch(
    run(
        srs,
        formulaIdParams = formulaIdParam(
            numberOfCandidates = 5,
            instrument = "QTOF",
            massAccuracyMS2ppm = 10
        ),
        msNovelistParams = deNovoStructureParam(numberOfCandidateToPredict = 5),
        alignedFeaturesIds = fts_denovo,
        recompute = FALSE,
        wait = TRUE
    ),
    #' Report the failure instead of discarding it silently
    error = function(e) {
        message("De novo annotation failed: ", conditionMessage(e))
        NULL
    }
)
We can now retrieve the de novo results:
#' Get de novo summary
summary_denovo <- summary(srs, result.type = "deNovo")
summary_denovo
#> alignedFeatureId compoundId externalFeatureId
#> 1 818899412191263562 818899412153514821 FT0371
#> 2 818899412258372427 818899412153514822 FT0565
#> 3 818899412350647116 818899412153514823 FT0732
#> 4 818899412380007245 818899412153514824 FT0845
#> 5 818899412400978766 818899412153514825 FT1171
#> ionMass charge hasMs1 hasMsMs formulaId
#> 1 138.0548 1 FALSE TRUE 818899436262374254
#> 2 161.0401 1 FALSE TRUE 818899436237208404
#> 3 182.0748 1 FALSE TRUE 818899437071874933
#> 4 195.0877 1 FALSE TRUE 818899436253985629
#> 5 229.1299 1 FALSE TRUE 818899436258179946
#> molecularFormula adduct rank siriusScoreNormalized
#> 1 C7H7NO2 [M + H]+ 3 0.049570774
#> 2 C4H5BFNO4 [M + H]+ 1 0.500000000
#> 3 C5H12F2N4O2 [M - H3N + H]+ 2 0.499681353
#> 4 C8H10N4O2 [M + H]+ 1 0.486221973
#> 5 C6H19N6P [M + Na]+ 5 0.007302549
#> siriusScore isotopeScore treeScore inchiKey
#> 1 20.134072 0 20.134072 WDWLOWADVFWUQE
#> 2 69.246054 0 69.246054 <NA>
#> 3 112.219809 0 112.219809 PTOGTCVHRNJTSH
#> 4 68.845292 0 68.845292 RYYVLZVUVIJVGH
#> 5 3.063011 0 3.063011 LHYCAMVLJXLUBB
#> smiles xlogP rank.1 csiScore
#> 1 C=Cc1ncoc1C(C)=O 0.0000000 1 -54.11018
#> 2 <NA> NA 1 NA
#> 3 NCCNOC(=O)NNCC(F)F 0.0000000 1 -70.82645
#> 4 CN1C=NC2=C1C(=O)N(C(=O)N2C)C -0.1085821 1 -12.05606
#> 5 CCN(CC)P(N)(N)=NN=C(C)N 0.0000000 1 -61.81776
#> tanimotoSimilarity type level levelIndex
#> 1 0.5641026 NPC PATHWAY 0
#> 2 NA NPC PATHWAY 0
#> 3 0.5094340 NPC PATHWAY 0
#> 4 0.9743590 NPC PATHWAY 0
#> 5 0.2790698 NPC PATHWAY 0
#> name description id
#> 1 Alkaloids Pathway: Alkaloids 0
#> 2 Amino acids and Peptides Pathway: Amino acids and Peptides 1
#> 3 Alkaloids Pathway: Alkaloids 0
#> 4 Alkaloids Pathway: Alkaloids 0
#> 5 Alkaloids Pathway: Alkaloids 0
#> probability index type.1 level.1 levelIndex.1
#> 1 0.5230578 0 NPC SUPERCLASS 1
#> 2 0.8786789 1 NPC SUPERCLASS 1
#> 3 0.2048894 0 NPC SUPERCLASS 1
#> 4 0.9989780 0 NPC SUPERCLASS 1
#> 5 0.8715882 0 NPC SUPERCLASS 1
#> name.1
#> 1 Nicotinic acid alkaloids
#> 2 Small peptides
#> 3 Small peptides
#> 4 Pseudoalkaloids (transamidation)
#> 5 Histidine alkaloids
#> description.1 id.1
#> 1 Superclass: Nicotinic acid alkaloids 44
#> 2 Superclass: Small peptides 63
#> 3 Superclass: Small peptides 63
#> 4 Superclass: Pseudoalkaloids (transamidation) 59
#> 5 Superclass: Histidine alkaloids 34
#> probability.1 index.1 type.2 level.2 levelIndex.2
#> 1 0.60767603 44 NPC CLASS 2
#> 2 0.89570719 63 NPC CLASS 2
#> 3 0.34776333 63 NPC CLASS 2
#> 4 0.99990070 59 NPC CLASS 2
#> 5 0.08852576 34 NPC CLASS 2
#> name.2 description.2 id.2
#> 1 Pyridine alkaloids Class: Pyridine alkaloids 602
#> 2 Aminoacids Class: Aminoacids 109
#> 3 Aminoacids Class: Aminoacids 109
#> 4 Purine alkaloids Class: Purine alkaloids 597
#> 5 Polyamines Class: Polyamines 571
#> probability.2 index.2 formulaId.1 molecularFormula.1
#> 1 0.6077098 602 818899436262374254 C7H7NO2
#> 2 0.9374434 109 818899436237208404 C4H5BFNO4
#> 3 0.1858719 109 818899437071874933 C5H12F2N4O2
#> 4 0.9999634 597 818899436253985629 C8H10N4O2
#> 5 0.1634796 571 818899436258179946 C6H19N6P
#> adduct.1 rank.2 siriusScoreNormalized.1 siriusScore.1
#> 1 [M + H]+ 3 0.049570774 20.134072
#> 2 [M + H]+ NA 0.500000000 69.246054
#> 3 [M - H3N + H]+ 2 0.499681353 112.219809
#> 4 [M + H]+ 1 0.486221973 68.845292
#> 5 [M + Na]+ 5 0.007302549 3.063011
#> isotopeScore.1 treeScore.1 inchiKey.1
#> 1 0 20.134072 WDWLOWADVFWUQE
#> 2 0 69.246054 <NA>
#> 3 0 112.219809 PTOGTCVHRNJTSH
#> 4 0 68.845292 RYYVLZVUVIJVGH
#> 5 0 3.063011 LHYCAMVLJXLUBB
#> smiles.1 xlogP.1 rank.3 csiScore.1
#> 1 C=Cc1ncoc1C(C)=O 0.0000000 1 -54.11018
#> 2 <NA> NA NA NA
#> 3 NCCNOC(=O)NNCC(F)F 0.0000000 1 -70.82645
#> 4 CN1C=NC2=C1C(=O)N(C(=O)N2C)C -0.1085821 1 -12.05606
#> 5 CCN(CC)P(N)(N)=NN=C(C)N 0.0000000 1 -61.81776
#> tanimotoSimilarity.1 type.3 level.3 levelIndex.3
#> 1 0.5641026 NPC PATHWAY 0
#> 2 NA NPC PATHWAY 0
#> 3 0.5094340 NPC PATHWAY 0
#> 4 0.9743590 NPC PATHWAY 0
#> 5 0.2790698 NPC PATHWAY 0
#> name.3 description.3
#> 1 Alkaloids Pathway: Alkaloids
#> 2 Amino acids and Peptides Pathway: Amino acids and Peptides
#> 3 Alkaloids Pathway: Alkaloids
#> 4 Alkaloids Pathway: Alkaloids
#> 5 Alkaloids Pathway: Alkaloids
#> id.3 probability.3 index.3 type.4 level.4 levelIndex.4
#> 1 0 0.5230578 0 NPC SUPERCLASS 1
#> 2 1 0.8786789 1 NPC SUPERCLASS 1
#> 3 0 0.2048894 0 NPC SUPERCLASS 1
#> 4 0 0.9989780 0 NPC SUPERCLASS 1
#> 5 0 0.8715882 0 NPC SUPERCLASS 1
#> name.4
#> 1 Nicotinic acid alkaloids
#> 2 Small peptides
#> 3 Small peptides
#> 4 Pseudoalkaloids (transamidation)
#> 5 Histidine alkaloids
#> description.4 id.4
#> 1 Superclass: Nicotinic acid alkaloids 44
#> 2 Superclass: Small peptides 63
#> 3 Superclass: Small peptides 63
#> 4 Superclass: Pseudoalkaloids (transamidation) 59
#> 5 Superclass: Histidine alkaloids 34
#> probability.4 index.4 type.5 level.5 levelIndex.5
#> 1 0.60767603 44 NPC CLASS 2
#> 2 0.89570719 63 NPC CLASS 2
#> 3 0.34776333 63 NPC CLASS 2
#> 4 0.99990070 59 NPC CLASS 2
#> 5 0.08852576 34 NPC CLASS 2
#> name.5 description.5 id.5
#> 1 Pyridine alkaloids Class: Pyridine alkaloids 602
#> 2 Aminoacids Class: Aminoacids 109
#> 3 Aminoacids Class: Aminoacids 109
#> 4 Purine alkaloids Class: Purine alkaloids 597
#> 5 Polyamines Class: Polyamines 571
#> probability.5 index.5 computing structureName structureName.1
#> 1 0.6077098 602 FALSE <NA> <NA>
#> 2 0.9374434 109 FALSE <NA> <NA>
#> 3 0.1858719 109 FALSE <NA> <NA>
#> 4 0.9999634 597 FALSE Thein Thein
#> 5 0.1634796 571 FALSE <NA> <NA>
Clean Up
Once you have saved your results to your R environment, it is good practice to cleanly shut down the Sirius connection.
#' Close the project and shut down the Sirius connection
shutdown(srs)
#> Sirius was shut down successfully
Summary
By integrating RuSirius into the metabonaut
workflow, we transitioned from unresolved MS2 spectra to structural
predictions using SIRIUS’s formula identification, CSI:FingerID
structure database search, and MSNovelist de novo structure
generation. Importantly, we pass all available MS2 spectra per feature
to SIRIUS (not just a single consensus spectrum), allowing it to
leverage multiple fragmentation patterns for improved annotation. For
features where database searches yield low-confidence results, de
novo structure generation provides an alternative path to
structural elucidation. This bridges the gap between raw statistical
feature discovery and biological interpretation.
References and Acknowledgements
This vignette relies on the SIRIUS software suite developed by the Böcker Lab at Friedrich-Schiller-Universität Jena and Bright Giant GmbH. SIRIUS integrates several algorithms for metabolite annotation from high-resolution mass spectrometry data. When using SIRIUS and the tools accessed through this workflow, please cite the following references:
SIRIUS — fragmentation tree computation and molecular formula identification (Dührkop et al. 2019).
CSI:FingerID — molecular structure database search (Dührkop et al. 2015).
COSMIC — confidence scoring for structural annotations (Hoffmann et al. 2022).
CANOPUS — compound class prediction from fragmentation spectra (Dührkop et al. 2021).
ZODIAC — molecular formula re-ranking using Gibbs sampling (Ludwig et al. 2020).
MSNovelist — de novo structure generation from mass spectra (Stravs et al. 2022).
The R interface to SIRIUS is provided by the RuSirius package (Louail et al.), which is built upon the RSirius REST API library. Special thanks to Markus Fleischauer for his work on the Sirius SDKs, Jonas Emmert for making the R API usable, and Marcus Ludwig for support in implementing RuSirius.
Session information
The R code was run on:
date()
#> [1] "Mon Mar 9 18:34:51 2026"
Information on the R session:
sessionInfo()
#> R version 4.5.2 (2025-10-31 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 11 x64 (build 26200)
#>
#> Matrix products: default
#> LAPACK version 3.12.1
#>
#> locale:
#> [1] LC_COLLATE=English_United Kingdom.utf8
#> [2] LC_CTYPE=English_United Kingdom.utf8
#> [3] LC_MONETARY=English_United Kingdom.utf8
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United Kingdom.utf8
#>
#> time zone: Europe/Paris
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets
#> [7] methods base
#>
#> other attached packages:
#> [1] RuSirius_0.2.5 jsonlite_2.0.0 Spectra_1.20.1
#> [4] BiocParallel_1.44.0 S4Vectors_0.48.0 BiocGenerics_0.56.0
#> [7] generics_0.1.4 RSirius_6.3.3 ProtGenerics_1.42.0
#>
#> loaded via a namespace (and not attached):
#> [1] DBI_1.2.3 bitops_1.0-9
#> [3] MetaboAnnotation_1.14.0 gridExtra_2.3
#> [5] httr2_1.2.2 remotes_2.5.0
#> [7] rlang_1.1.7 magrittr_2.0.4
#> [9] clue_0.3-67 otel_0.2.0
#> [11] matrixStats_1.5.0 compiler_4.5.2
#> [13] RSQLite_2.4.6 png_0.1-8
#> [15] callr_3.7.6 vctrs_0.7.1
#> [17] reshape2_1.4.5 stringr_1.6.0
#> [19] crayon_1.5.3 pkgconfig_2.0.3
#> [21] MetaboCoreUtils_1.19.2 fastmap_1.2.0
#> [23] dbplyr_2.5.2 XVector_0.50.0
#> [25] ps_1.9.1 purrr_1.2.1
#> [27] bit_4.6.0 xfun_0.56
#> [29] MultiAssayExperiment_1.36.1 cachem_1.1.0
#> [31] ChemmineR_3.62.0 blob_1.3.0
#> [33] DelayedArray_0.36.0 parallel_4.5.2
#> [35] cluster_2.1.8.1 R6_2.6.1
#> [37] stringi_1.8.7 RColorBrewer_1.1-3
#> [39] GenomicRanges_1.62.1 Rcpp_1.1.1
#> [41] Seqinfo_1.0.0 SummarizedExperiment_1.40.0
#> [43] knitr_1.51 base64enc_0.1-6
#> [45] IRanges_2.44.0 BiocBaseUtils_1.12.0
#> [47] Matrix_1.7-4 igraph_2.2.2
#> [49] tidyselect_1.2.1 abind_1.4-8
#> [51] yaml_2.3.12 codetools_0.2-20
#> [53] curl_7.0.0 processx_3.8.6
#> [55] pkgbuild_1.4.8 lattice_0.22-7
#> [57] tibble_3.3.1 plyr_1.8.9
#> [59] Biobase_2.70.0 KEGGREST_1.50.0
#> [61] S7_0.2.1 evaluate_1.0.5
#> [63] desc_1.4.3 BiocFileCache_3.0.0
#> [65] xml2_1.5.2 Biostrings_2.78.0
#> [67] pillar_1.11.1 BiocManager_1.30.27
#> [69] filelock_1.0.3 MatrixGenerics_1.22.0
#> [71] DT_0.34.0 RCurl_1.98-1.17
#> [73] BiocVersion_3.22.0 ggplot2_4.0.2
#> [75] scales_1.4.0 glue_1.8.0
#> [77] lazyeval_0.2.2 tools_4.5.2
#> [79] AnnotationHub_4.0.0 QFeatures_1.20.0
#> [81] fs_1.6.6 grid_4.5.2
#> [83] tidyr_1.3.2 MsCoreUtils_1.22.1
#> [85] AnnotationDbi_1.72.0 cli_3.6.5
#> [87] rappdirs_0.3.4 rsvg_2.7.0
#> [89] S4Arrays_1.10.1 dplyr_1.2.0
#> [91] AnnotationFilter_1.34.0 gtable_0.3.6
#> [93] digest_0.6.39 SparseArray_1.10.8
#> [95] rjson_0.2.23 htmlwidgets_1.6.4
#> [97] farver_2.1.2 memoise_2.0.1
#> [99] htmltools_0.5.9 lifecycle_1.0.5
#> [101] httr_1.4.8 CompoundDb_1.14.2
#> [103] bit64_4.6.0-1 MASS_7.3-65