
Read DIA-NN output as a QFeatures object
Source: R/readQFeaturesFromDIANN.R
This function takes the Report.tsv output files from DIA-NN and converts them into a multi-set QFeatures object. It is a wrapper around readQFeatures() with default parameters set to match DIA-NN label-free and plexDIA report files: the default runCol is "File.Name" and the default quantCols is "Ms1.Area".
Usage
readQFeaturesFromDIANN(
  assayData,
  colData = NULL,
  quantCols = "Ms1.Area",
  runCol = "File.Name",
  multiplexing = c("none", "mTRAQ"),
  extractedData = NULL,
  ecol = NULL,
  verbose = TRUE,
  ...
)
Arguments
- assayData
A data.frame, or any object that can be coerced into a data.frame, holding the quantitative assay. For readSummarizedExperiment(), this can also be a character(1) pointing to a filename. This data.frame is typically generated by an identification and quantification software, such as Sage, Proteome Discoverer, MaxQuant, ...
- colData
A data.frame (or any object that can be coerced to a data.frame) containing sample/column annotations, including quantCols and runCol (see details).
- quantCols
A numeric(), logical() or character() defining the columns of the assayData that contain the quantitative data. This information can also be defined in colData (see details).
- runCol
For the multi-set case, a numeric(1) or character(1) pointing to the column of assayData (and colData, if set) that contains the runs/batches. Make sure that the column names in both tables are identical and syntactically valid (if you supply a character) or have the same index (if you supply a numeric). Note that characters are converted to syntactically valid names using make.names.
- multiplexing
A character(1) indicating the type of multiplexing used in the experiment. One of "none" (default, for label-free experiments) or "mTRAQ" (for plexDIA experiments).
- extractedData
A data.frame, or any object that can be coerced to a data.frame, that contains the data from the *_ms1_extracted.tsv file generated by DIA-NN. This argument is optional and is currently only applicable for mTRAQ multiplexed experiments where DIA-NN was run using the plexdia module (see references).
- ecol
Same as quantCols. Available for backwards compatibility. Default is NULL. If both ecol and colData are set, an error is thrown.
- verbose
A logical(1) indicating whether the progress of the data reading and formatting should be printed to the console. Default is TRUE.
- ...
Further arguments passed to readQFeatures().
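The runCol description notes that character column names are converted with make.names. As a rough illustration, the base R sketch below shows how make.names() would alter a few column names. The names are hypothetical and chosen only for illustration; the sketch does not call any QFeatures code.

```r
# Hypothetical column names that a user might pass as runCol.
candidates <- c("File.Name", "File Name", "1st.run", "run-id")

# make.names() replaces invalid characters with "." and prepends "X"
# when a name does not start with a letter (or a dot not followed by a number).
valid <- make.names(candidates)

# Names that would be changed by the conversion; these are worth renaming
# identically in assayData and colData before import.
changed <- candidates[candidates != valid]

print(data.frame(original = candidates, valid = valid))
```

Checking names this way before import makes it easier to keep the run column identical in both tables.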
Value
An instance of class QFeatures. The quantitative data of each acquisition run is stored in a separate set as a SummarizedExperiment object.
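To give an intuition for the "one set per acquisition run" layout, the following base R sketch splits a toy long-format table by its run column. This is only a conceptual illustration under stated assumptions: the table and its values are made up, and split() stands in for whatever the package does internally.

```r
# Toy long-format table mimicking a few DIA-NN report columns (values are made up).
report <- data.frame(
    File.Name = c("runA", "runA", "runB", "runB", "runB"),
    Precursor.Id = c("PEPTIDEK2", "SAMPLER2", "PEPTIDEK2", "SAMPLER2", "ANOTHERK3"),
    Ms1.Area = c(1200, 850, 1100, 900, 430)
)

# One table per run, analogous to one set per run in the returned object.
byRun <- split(report, report[["File.Name"]])

# Number of precursor rows in each run; runs need not share the same features.
nRows <- vapply(byRun, nrow, integer(1))
print(nRows)
```

Because each run keeps only the precursors it quantified, the sets can have different numbers of rows, as seen in the example output on this page.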
References
Derks, Jason, Andrew Leduc, Georg Wallmann, R. Gray Huffman, Matthew Willetts, Saad Khan, Harrison Specht, Markus Ralser, Vadim Demichev, and Nikolai Slavov. 2022. "Increasing the Throughput of Sensitive Proteomics by plexDIA." Nature Biotechnology, July.
See also
The QFeatures class (see QFeatures()) to read about how to manipulate the resulting QFeatures object.
The readQFeatures() function, which this one depends on.
Examples
x <- read.delim(MsDataHub::benchmarkingDIA.tsv())
#> see ?MsDataHub and browseVignettes('MsDataHub') for documentation
#> loading from cache
x[["File.Name"]] <- x[["Run"]]
#################################
## Label-free multi-set case
## using default arguments
readQFeaturesFromDIANN(x)
#> Checking arguments.
#> Loading data as a 'SummarizedExperiment' object.
#> Splitting data in runs.
#> Formatting sample annotations (colData).
#> Formatting data as a 'QFeatures' object.
#> An instance of class QFeatures containing 24 set(s):
#> [1] RD139_Overlap_UPS1_0_1fmol_inj1: SummarizedExperiment with 28980 rows and 1 columns
#> [2] RD139_Overlap_UPS1_0_1fmol_inj2: SummarizedExperiment with 29495 rows and 1 columns
#> [3] RD139_Overlap_UPS1_0_1fmol_inj3: SummarizedExperiment with 29210 rows and 1 columns
#> ...
#> [22] RD139_Overlap_UPS1_5fmol_inj1: SummarizedExperiment with 30941 rows and 1 columns
#> [23] RD139_Overlap_UPS1_5fmol_inj2: SummarizedExperiment with 30321 rows and 1 columns
#> [24] RD139_Overlap_UPS1_5fmol_inj3: SummarizedExperiment with 24168 rows and 1 columns
## use the precursor identifier as assay rownames
readQFeaturesFromDIANN(x, fnames = "Precursor.Id") |>
    rownames()
#> Checking arguments.
#> Loading data as a 'SummarizedExperiment' object.
#> Splitting data in runs.
#> Formatting sample annotations (colData).
#> Formatting data as a 'QFeatures' object.
#> Setting assay rownames.
#> CharacterList of length 24
#> [["RD139_Overlap_UPS1_0_1fmol_inj1"]] AAAAIAGELGLEFK2 ... YYTETEGALR2
#> [["RD139_Overlap_UPS1_0_1fmol_inj2"]] AAAAEIAVK1 ... YYTETEGALR2
#> [["RD139_Overlap_UPS1_0_1fmol_inj3"]] AAAAIAGELGLEFK2 ... YYTETEGALR2
#> [["RD139_Overlap_UPS1_0_25fmol_inj1"]] AAAAEIAVK1 AAAAEIAVK2 ... YYTETEGALR2
#> [["RD139_Overlap_UPS1_0_25fmol_inj2"]] AAAAEIAVK1 ... YYTETEGALR2
#> [["RD139_Overlap_UPS1_0_25fmol_inj3"]] AAAAIAGELGLEFK2 ... YYTETEGALR2
#> [["RD139_Overlap_UPS1_10fmol_inj1"]] AAAAEIAVK1 AAAAIAGELGLEFK2 ... YYTLEEIQK2
#> [["RD139_Overlap_UPS1_10fmol_inj2"]] AAAAEIAVK1 AAAAIAGELGLEFK2 ... YYTLEEIQK2
#> [["RD139_Overlap_UPS1_10fmol_inj3"]] AAAAEIAVK1 AAAAIAGELGLEFK2 ... YYTLEEIQK2
#> [["RD139_Overlap_UPS1_1fmol_inj1"]] AAAAEIAVK1 AAAAIAGELGLEFK2 ... YYTETEGALR2
#> ...
#> <14 more elements>
## with a colData (and default arguments)
cd <- data.frame(sampleInfo = LETTERS[1:24],
                 quantCols = "Ms1.Area",
                 runCol = unique(x[["File.Name"]]))
readQFeaturesFromDIANN(x, colData = cd)
#> Checking arguments.
#> Loading data as a 'SummarizedExperiment' object.
#> Splitting data in runs.
#> Formatting sample annotations (colData).
#> Formatting data as a 'QFeatures' object.
#> An instance of class QFeatures containing 24 set(s):
#> [1] RD139_Overlap_UPS1_0_1fmol_inj1: SummarizedExperiment with 28980 rows and 1 columns
#> [2] RD139_Overlap_UPS1_0_1fmol_inj2: SummarizedExperiment with 29495 rows and 1 columns
#> [3] RD139_Overlap_UPS1_0_1fmol_inj3: SummarizedExperiment with 29210 rows and 1 columns
#> ...
#> [22] RD139_Overlap_UPS1_5fmol_inj1: SummarizedExperiment with 30941 rows and 1 columns
#> [23] RD139_Overlap_UPS1_5fmol_inj2: SummarizedExperiment with 30321 rows and 1 columns
#> [24] RD139_Overlap_UPS1_5fmol_inj3: SummarizedExperiment with 24168 rows and 1 columns
#################################
## mTRAQ multi-set case
x2 <- read.delim(MsDataHub::Report.Derks2022.plexDIA.tsv())
#> see ?MsDataHub and browseVignettes('MsDataHub') for documentation
#> loading from cache
x2[["File.Name"]] <- x2[["Run"]]
readQFeaturesFromDIANN(x2, multiplexing = "mTRAQ")
#> Pivoting quantiative data.
#> Checking arguments.
#> Loading data as a 'SummarizedExperiment' object.
#> Splitting data in runs.
#> Formatting sample annotations (colData).
#> Formatting data as a 'QFeatures' object.
#> An instance of class QFeatures containing 54 set(s):
#> [1] wJD1146: SummarizedExperiment with 2635 rows and 3 columns
#> [2] wJD1147: SummarizedExperiment with 3000 rows and 3 columns
#> [3] wJD1148: SummarizedExperiment with 2676 rows and 3 columns
#> ...
#> [52] wJD1203: SummarizedExperiment with 4441 rows and 3 columns
#> [53] wJD1204: SummarizedExperiment with 4416 rows and 3 columns
#> [54] wJD1205: SummarizedExperiment with 4492 rows and 3 columns
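When supplying a colData as in the example above, it can help to check beforehand that it has one row per run and covers every run in the assay table. The sketch below is a precautionary check in base R under stated assumptions: the table, run names and annotations are hypothetical, and the check is not part of the package.

```r
# Hypothetical assay table with three runs (values are made up).
assay <- data.frame(
    File.Name = rep(c("runA", "runB", "runC"), times = c(2, 1, 3)),
    Ms1.Area = c(10, 20, 30, 40, 50, 60)
)

# Sample annotations built the same way as in the colData example:
# one row per run, with the quantCols and runCol columns.
cd <- data.frame(
    sampleInfo = c("control", "treated", "treated"),
    quantCols = "Ms1.Area",
    runCol = unique(assay[["File.Name"]])
)

# Runs present in the assay table but absent from the annotations.
missingRuns <- setdiff(unique(assay[["File.Name"]]), cd[["runCol"]])

# Stop early if a run is unannotated or annotated more than once.
stopifnot(length(missingRuns) == 0, !anyDuplicated(cd[["runCol"]]))
```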