The aim of the R for Mass Spectrometry initiative is to provide efficient, thoroughly documented, tested and flexible R software for the analysis and interpretation of high throughput mass spectrometry assays, including proteomics and metabolomics experiments. The project formalises the longtime collaborative development efforts of its core members under the RforMassSpectrometry organisation to facilitate dissemination and accessibility of their work.
Figure 1.1: The R for Mass Spectrometry initiative sticker, designed by Johannes Rainer.
This material introduces participants to the analysis and exploration of mass spectrometry (MS) based proteomics data using R and Bioconductor. The course will cover all levels of MS data, from raw data to identification and quantitation data, up to the statistical interpretation of a typical shotgun MS experiment and will focus on hands-on tutorials. At the end of this course, the participants will be able to manipulate MS data in R and use existing packages for their exploratory and statistical proteomics data analysis.
The course material is aimed at proteomics practitioners and data analysts/bioinformaticians who would like to learn how to use R and Bioconductor to analyse proteomics data. Familiarity with MS or proteomics in general is desirable but not essential, as we will walk through and describe typical MS data as part of learning about the tools. A beginner’s guide to mass spectrometry–based proteomics (Sinha and Mann 2020) is an approachable introduction to sample preparation, mass spectrometry and data analysis.
Reference: Sinha, Ankit, and Matthias Mann. 2020. “A beginner’s guide to mass spectrometry–based proteomics.” The Biochemist, September. https://doi.org/10.1042/BIO20200057.
A working knowledge of R (R syntax, commonly used functions, basic data structures such as data frames, vectors, matrices, … and their manipulation) is required. Familiarity with other Bioconductor omics data classes and the tidyverse syntax is useful, but not necessary.
This material uses the latest versions of the R for Mass Spectrometry packages and their dependencies. It is therefore possible that even the latest stable Bioconductor release isn’t recent enough.
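To see which versions are currently in use, something along the following lines may help. This is only a sketch: the helper `bioc_version()` is a hypothetical name introduced here, and it assumes that BiocManager may or may not be installed yet.

```r
# Version of the running R session (base R, always available).
r_version <- getRversion()

# Hypothetical helper: return the Bioconductor version that BiocManager
# is configured for, or NA if BiocManager is missing or the lookup fails.
bioc_version <- function() {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    return(NA_character_)
  }
  tryCatch(
    as.character(BiocManager::version()),
    error = function(e) NA_character_
  )
}

message("R version: ", r_version)
message("Bioconductor version: ", bioc_version())

# If the stable release turns out to be too old, BiocManager can switch
# to the development version. Not run here, since it changes the library
# and needs a matching R version:
# BiocManager::install(version = "devel")
```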
To install all the necessary packages, please use the latest release of R and execute:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("tidyverse")
BiocManager::install("factoextra")
BiocManager::install("msdata")
BiocManager::install("mzR")
BiocManager::install("rhdf5")
BiocManager::install("rpx")
BiocManager::install("MsCoreUtils")
BiocManager::install("QFeatures")
BiocManager::install("Spectra")
BiocManager::install("ProtGenerics")
BiocManager::install("PSMatch")
BiocManager::install("pheatmap")
BiocManager::install("limma")
BiocManager::install("impute")
BiocManager::install("MSnID")
BiocManager::install("RforMassSpectrometry/SpectraVis")
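After installation, it can be useful to confirm that nothing was skipped. The following is a minimal sketch using base R only. The helper name `missing_packages()` is hypothetical and not part of the course material, and it assumes that a GitHub-style name such as "RforMassSpectrometry/SpectraVis" installs a package named after the part following the slash.

```r
# Packages requested by the installation commands of this course.
course_pkgs <- c(
  "tidyverse", "factoextra", "msdata", "mzR", "rhdf5", "rpx",
  "MsCoreUtils", "QFeatures", "Spectra", "ProtGenerics", "PSMatch",
  "pheatmap", "limma", "impute", "MSnID",
  "RforMassSpectrometry/SpectraVis"
)

# Hypothetical helper: return the entries of `pkgs` that are not found
# in any library on the current library path.
missing_packages <- function(pkgs) {
  # Drop a leading "owner/" so GitHub-style names match installed names.
  pkg_names <- basename(pkgs)
  pkgs[!(pkg_names %in% rownames(installed.packages()))]
}

still_missing <- missing_packages(course_pkgs)
if (length(still_missing) > 0) {
  message("Not yet installed: ", paste(still_missing, collapse = ", "))
} else {
  message("All course packages are installed.")
}
```

Any names that are reported can be passed again to `BiocManager::install()`.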
Follow the instructions in this script to install the packages and download some of the data used in the following chapters. All software versions used to generate this document are recorded at the end of the book, in chapter 7.
To compile and render the teaching material, you will also need the BiocStyle package and the (slightly modified) Modern Statistics for Modern Biology (msmb) HTML Book Style by Mike Smith:
BiocManager::install(c("bookdown", "BiocStyle", "lgatto/msmbstyle"))
Run the installation script by executing the line below to install all requirements to compile the book:
Thank you to Charlotte Soneson for fixing many typos in a previous version of this book.
This material is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. You are free to share (copy and redistribute the material in any medium or format) and adapt (remix, transform, and build upon the material) for any purpose, even commercially, as long as you give appropriate credit and distribute your contributions under the same license as the original.
Page built: 2023-05-27 using R version 4.3.0 (2023-04-21)