This function configures the parameters for molecular formula annotation in SIRIUS. Molecular formula identification uses isotope pattern analysis on the MS1 data together with fragmentation tree computation on the MS2 data. The score of a molecular formula candidate combines the isotope pattern score and the fragmentation tree score.
Usage
formulaIdParam(
  instrument = c("QTOF", "ORBITRAP", "FTICR"),
  numberOfCandidates = 10,
  numberOfCandidatesPerIonization = 1,
  massAccuracyMS2ppm = 10,
  isotopeMs2Settings = c("IGNORE", "FILTER", "SCORE"),
  filterByIsotopePattern = TRUE,
  enforceElGordoFormula = TRUE,
  performBottomUpSearch = TRUE,
  performDeNovoBelowMz = 400,
  formulaSearchDBs = character(0),
  applyFormulaConstraintsToDBAndBottomUpSearch = FALSE,
  enforcedFormulaConstraints = c("H", "C", "N", "O", "P"),
  fallbackFormulaConstraints = c("S"),
  detectableElements = c("B", "S", "Cl", "Se", "Br"),
  ilpTimeout = FALSE,
  numberOfSecondsPerDecomposition = 0,
  numberOfSecondsPerInstance = 0,
  useHeuristic = TRUE,
  useHeuristicAboveMz = 300,
  useOnlyHeuristicAboveMz = 650,
  injectSpecLibMatchFormulas = TRUE,
  minScoreToInjectSpecLibMatch = 0.7,
  minPeaksToInjectSpecLibMatch = 6,
  candidateFormulas = character(0)
)
Arguments
- instrument
character(1). The type of mass spectrometer used for the analysis. Options are "QTOF", "ORBITRAP", and "FTICR". This choice mainly affects the allowed mass deviation. If you are unsure about the instrument, use the default value "QTOF".
- numberOfCandidates
integer(1). The number of formula candidates to keep in the result list. Default is 10.
- numberOfCandidatesPerIonization
integer(1). Forces SIRIUS to report at least this number of candidates per ionization. Default is 1.
- massAccuracyMS2ppm
numeric(1). The maximum allowed mass deviation, in parts per million (ppm), for molecular formulas. Only formulas within this mass window are considered. Default is 10.
- isotopeMs2Settings
character(1). Specifies how isotope patterns in MS/MS should be handled. Options are "IGNORE", "FILTER", and "SCORE". Default is "IGNORE".
- filterByIsotopePattern
logical. If TRUE, filters molecular formulas by comparing their theoretical isotope patterns to the measured ones, excluding those that do not match. Default is TRUE.
- enforceElGordoFormula
logical. El Gordo may predict that an MS/MS spectrum is a lipid spectrum. If TRUE, the corresponding molecular formula is enforced as a molecular formula candidate. Default is TRUE.
- performBottomUpSearch
logical. If TRUE, enables molecular formula generation through a bottom-up search. Default is TRUE.
- performDeNovoBelowMz
numeric(1). The m/z below which de novo molecular formula generation is enabled. Set to 0 to disable de novo molecular formula generation. Default is 400.
- formulaSearchDBs
list(character). A list of structure databases (e.g., "CHEBI", "HMDB") from which molecular formulas are extracted to reduce the search space. Use only if necessary, as de novo formula annotation is usually more effective. Default is character(0).
- applyFormulaConstraintsToDBAndBottomUpSearch
logical. If TRUE, applies formula (element) constraints to both database search and bottom-up search, in addition to de novo generation. Default is FALSE.
- enforcedFormulaConstraints
character. The elements that are always considered when auto-detecting the formula. Enforced elements are always included in the formula, even if the compound is already assigned to a specific molecular formula. Default is c("H", "C", "N", "O", "P").
- fallbackFormulaConstraints
character. The elements used as a fallback when auto-detection fails (e.g., when no isotope pattern is available). Default is "S".
- detectableElements
list(character). The elements that can be added to the chemical alphabet when detected in the spectrum, for example from isotope patterns. Default is c("B", "S", "Cl", "Se", "Br").
- ilpTimeout
logical. Whether to apply timeouts to the integer linear programming (ILP) solver. If TRUE, set the timeouts with numberOfSecondsPerDecomposition and numberOfSecondsPerInstance. Default is FALSE.
- numberOfSecondsPerDecomposition
numeric. Timeout in seconds per decomposition, used together with ilpTimeout. Default is 0.
- numberOfSecondsPerInstance
numeric. Timeout in seconds per instance, used together with ilpTimeout. Default is 0.
- useHeuristic
logical. If TRUE, enables the use of heuristics in molecular formula annotation. When enabled, the thresholds useHeuristicAboveMz and useOnlyHeuristicAboveMz can be set. Default is TRUE.
- useHeuristicAboveMz
numeric. The m/z threshold above which the heuristic is used. Default is 300.
- useOnlyHeuristicAboveMz
numeric(1). The m/z threshold above which only the heuristic is used. Default is 650.
- injectSpecLibMatchFormulas
logical. If TRUE, formula candidates matching spectral library entries above a certain similarity threshold will be preserved for further analysis, regardless of score or filter settings. Default is TRUE.
- minScoreToInjectSpecLibMatch
numeric(1). The similarity threshold for injecting spectral library match formulas. If the score is above this threshold, the formula will be preserved. Default is 0.7.
- minPeaksToInjectSpecLibMatch
integer. The minimum number of matching peaks required to inject spectral library match formulas into further analysis. Default is 6.
- candidateFormulas
character. Optional vector of molecular formulas to use as candidates. When provided, SIRIUS considers only these formulas instead of performing de novo formula generation. This is useful when the molecular formula of a compound is already known. Formulas should be given as neutral molecular formulas (e.g., "C10H12N2O", "C6H12O6"). You can provide one or more formulas. Default is character(0) (no restriction; all formulas are considered).
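To make the massAccuracyMS2ppm tolerance concrete, the following base R sketch converts a ppm value into an absolute mass window. The helper name ppm_window and the example mass are illustrative assumptions and not part of the package; SIRIUS may apply the tolerance differently internally.

```r
# Hypothetical helper (not part of the package): convert a relative
# tolerance in ppm into an absolute window around a given mass.
ppm_window <- function(mass, ppm = 10) {
  delta <- mass * ppm / 1e6  # absolute deviation allowed at this mass
  c(lower = mass - delta, upper = mass + delta)
}

# A 10 ppm tolerance around a mass of 400 allows a deviation of 0.004.
ppm_window(400, ppm = 10)
```

Because the tolerance is relative, the absolute window widens with mass: the same 10 ppm corresponds to 0.002 at a mass of 200 and to 0.008 at a mass of 800.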
Note
For more information, see the SIRIUS documentation.
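A minimal usage sketch, using only arguments from the signature above. It assumes that formulaIdParam() is exported by a package named RuSirius; that package name does not appear on this page, so adjust it to your installation. The values are illustrative, not recommendations.

```r
# Assumption: formulaIdParam() is provided by the RuSirius package.
# The guard lets the snippet run without error if it is not installed.
if (requireNamespace("RuSirius", quietly = TRUE)) {
  param <- RuSirius::formulaIdParam(
    instrument = "ORBITRAP",   # one of "QTOF", "ORBITRAP", "FTICR"
    numberOfCandidates = 5,    # keep fewer candidates than the default 10
    massAccuracyMS2ppm = 5     # tighter mass window than the default 10
  )
  param
}
```

Arguments that are not listed keep the defaults shown in the Usage section.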