This function takes the Report.tsv output files from DIA-NN and converts them into a multi-set QFeatures object. It is a wrapper around readQFeatures() with default parameters set to match DIA-NN label-free and plexDIA report files: default runCol is "File.Name" and default quantColsis"Ms1.Area"`.

readQFeaturesFromDIANN(
  assayData,
  colData = NULL,
  quantCols = "Ms1.Area",
  runCol = "File.Name",
  multiplexing = c("none", "mTRAQ"),
  extractedData = NULL,
  ecol = NULL,
  verbose = TRUE,
  ...
)

Arguments

assayData

A data.frame, or any object that can be coerced into a data.frame, holding the quantitative assay. For readSummarizedExperiment(), this can also be a character(1) pointing to a filename. This data.frame is typically generated by an identification and quantification software, such as Sage, Proteome Discoverer, MaxQuant, ...

colData

A data.frame (or any object that can be coerced to a data.frame) containing sample/column annotations, including quantCols and runCol (see details).

quantCols

A numeric(), logical() or character() defining the columns of the assayData that contain the quantitative data. This information can also be defined in colData (see details).

runCol

For the multi-set case, a numeric(1) or character(1) pointing to the column of assayData (and colData, is set) that contains the runs/batches. Make sure that the column name in both tables are identical and syntactically valid (if you supply a character) or have the same index (if you supply a numeric). Note that characters are converted to syntactically valid names using make.names

multiplexing

A character(1) indicating the type of multiplexing used in the experiment. One of "none" (default, for label-free experiments) or "mTRAQ" (for plexDIA experiments).

extractedData

A data.frame or any object that can be coerced to a data.frame that contains the data from the *_ms1_extracted.tsv file generated by DIA-NN. This argument is optional and is currently only applicable for mTRAQ multiplexed experiments where DIA-NN was run using the plexdia module (see references).

ecol

Same as quantCols. Available for backwards compatibility. Default is NULL. If both ecol and colData are set, an error is thrown.

verbose

A logical(1) indicating whether the progress of the data reading and formatting should be printed to the console. Default is TRUE.

...

Further arguments passed to readQFeatures().

Value

An instance of class QFeatures. The quantiative data of each acquisition run is stored in a separate set as a SummarizedExperiment object.

References

Derks, Jason, Andrew Leduc, Georg Wallmann, R. Gray Huffman, Matthew Willetts, Saad Khan, Harrison Specht, Markus Ralser, Vadim Demichev, and Nikolai Slavov. 2022. "Increasing the Throughput of Sensitive Proteomics by plexDIA." Nature Biotechnology, July. Link to article

See also

  • The QFeatures (see QFeatures()) class to read about how to manipulate the resulting QFeatures object.

  • The readQFeatures() function which this one depends on.

Author

Laurent Gatto, Christophe Vanderaa

Examples


x <- read.delim(MsDataHub::benchmarkingDIA.tsv())
#> see ?MsDataHub and browseVignettes('MsDataHub') for documentation
#> loading from cache
x[["File.Name"]] <- x[["Run"]]

#################################
## Label-free multi-set case

## using default arguments
readQFeaturesFromDIANN(x)
#> Checking arguments.
#> Loading data as a 'SummarizedExperiment' object.
#> Splitting data in runs.
#> Formatting sample annotations (colData).
#> Formatting data as a 'QFeatures' object.
#> An instance of class QFeatures containing 24 assays:
#>  [1] RD139_Overlap_UPS1_0_1fmol_inj1: SummarizedExperiment with 28980 rows and 1 columns 
#>  [2] RD139_Overlap_UPS1_0_1fmol_inj2: SummarizedExperiment with 29495 rows and 1 columns 
#>  [3] RD139_Overlap_UPS1_0_1fmol_inj3: SummarizedExperiment with 29210 rows and 1 columns 
#>  ...
#>  [22] RD139_Overlap_UPS1_5fmol_inj1: SummarizedExperiment with 30941 rows and 1 columns 
#>  [23] RD139_Overlap_UPS1_5fmol_inj2: SummarizedExperiment with 30321 rows and 1 columns 
#>  [24] RD139_Overlap_UPS1_5fmol_inj3: SummarizedExperiment with 24168 rows and 1 columns 

## with a colData (and default arguments)
cd <- data.frame(sampleInfo = LETTERS[1:24],
                 quantCols = "Ms1.Area",
                 runCol = unique(x[["File.Name"]]))
readQFeaturesFromDIANN(x, colData = cd)
#> Checking arguments.
#> Loading data as a 'SummarizedExperiment' object.
#> Splitting data in runs.
#> Formatting sample annotations (colData).
#> Formatting data as a 'QFeatures' object.
#> An instance of class QFeatures containing 24 assays:
#>  [1] RD139_Overlap_UPS1_0_1fmol_inj1: SummarizedExperiment with 28980 rows and 1 columns 
#>  [2] RD139_Overlap_UPS1_0_1fmol_inj2: SummarizedExperiment with 29495 rows and 1 columns 
#>  [3] RD139_Overlap_UPS1_0_1fmol_inj3: SummarizedExperiment with 29210 rows and 1 columns 
#>  ...
#>  [22] RD139_Overlap_UPS1_5fmol_inj1: SummarizedExperiment with 30941 rows and 1 columns 
#>  [23] RD139_Overlap_UPS1_5fmol_inj2: SummarizedExperiment with 30321 rows and 1 columns 
#>  [24] RD139_Overlap_UPS1_5fmol_inj3: SummarizedExperiment with 24168 rows and 1 columns 

#################################
## mTRAQ multi-set case

x2 <- read.delim(MsDataHub::Report.Derks2022.plexDIA.tsv())
#> see ?MsDataHub and browseVignettes('MsDataHub') for documentation
#> loading from cache
x2[["File.Name"]] <- x2[["Run"]]
readQFeaturesFromDIANN(x2, multiplexing = "mTRAQ")
#> Pivoting quantiative data.
#> Checking arguments.
#> Loading data as a 'SummarizedExperiment' object.
#> Splitting data in runs.
#> Formatting sample annotations (colData).
#> Formatting data as a 'QFeatures' object.
#> An instance of class QFeatures containing 54 assays:
#>  [1] wJD1146: SummarizedExperiment with 2635 rows and 3 columns 
#>  [2] wJD1147: SummarizedExperiment with 3000 rows and 3 columns 
#>  [3] wJD1148: SummarizedExperiment with 2676 rows and 3 columns 
#>  ...
#>  [52] wJD1203: SummarizedExperiment with 4441 rows and 3 columns 
#>  [53] wJD1204: SummarizedExperiment with 4416 rows and 3 columns 
#>  [54] wJD1205: SummarizedExperiment with 4492 rows and 3 columns