The multiGSEA R package

Authors

Sebastian Canzler and Jörg Hackermüller

multiGSEA: a GSEA-based pathway enrichment analysis for multi-omics data, BMC Bioinformatics 21, 561 (2020)

Introduction

The multiGSEA package was designed to run a robust GSEA-based pathway enrichment for multiple omics layers. The enrichment is calculated for each omics layer separately and aggregated p-values are calculated afterwards to derive a composite multi-omics pathway enrichment.
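To make the aggregation step concrete, here is a minimal base-R sketch of combining one p-value per omics layer into a single pathway-level p-value. It is not the package's own implementation: the helper names `combine_fisher` and `combine_stouffer` and the example numbers are hypothetical, and Fisher's and Stouffer's methods are used here only as two familiar ways to aggregate p-values.

```r
# Fisher's method: -2 * sum(log(p)) follows a chi-squared distribution
# with 2k degrees of freedom under the null hypothesis.
combine_fisher <- function(p) {
  p <- p[!is.na(p)]  # a pathway may be missing in some layers
  stat <- -2 * sum(log(p))
  pchisq(stat, df = 2 * length(p), lower.tail = FALSE)
}

# Stouffer's Z method: convert p-values to z-scores, sum them,
# and rescale by sqrt(k) so the result is standard normal under the null.
combine_stouffer <- function(p) {
  p <- p[!is.na(p)]
  z <- qnorm(p, lower.tail = FALSE)
  pnorm(sum(z) / sqrt(length(p)), lower.tail = FALSE)
}

# Hypothetical per-layer p-values for three pathways (made-up numbers).
layers <- c("transcriptome", "proteome", "metabolome")
layer_pvalues <- data.frame(
  transcriptome = c(0.01, 0.20, 0.50),
  proteome      = c(0.03, 0.04, 0.60),
  metabolome    = c(0.02, NA,   0.70),
  row.names     = c("pathway_A", "pathway_B", "pathway_C")
)

# One combined p-value per pathway (row), using only the layer columns.
layer_pvalues$fisher   <- apply(layer_pvalues[, layers], 1, combine_fisher)
layer_pvalues$stouffer <- apply(layer_pvalues[, layers], 1, combine_stouffer)

print(layer_pvalues)
```

Dropping `NA` values lets a pathway that is only covered by some layers still receive a combined p-value based on the layers that are available.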

Pathway definitions can be downloaded from up to eight different pathway databases by means of the graphite Bioconductor package.

Features of the transcriptome and proteome level can be mapped to the following ID formats:

* Entrez Gene ID
* UniProt IDs
* Gene Symbols
* RefSeq
* Ensembl

Features of the metabolome layer can be mapped to:

* Comptox Dashboard IDs (DTXCID, DTXSID)
* CAS Registry Numbers (CAS-RN)
* PubChem IDs (CID)
* HMDB IDs
* KEGG IDs
* ChEBI IDs
* DrugBank IDs
* Common names

Please note that the mapping of metabolite IDs is accomplished through the metaboliteIDmapping package. This AnnotationHub package provides a comprehensive mapping table with more than one million compounds. metaboliteIDmapping is available on our GitHub page and at Bioconductor.

Installation

There are two ways to install the multiGSEA package. Both require R version 4.0 or later:

(i) Use the Bioconductor framework:

if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

BiocManager::install("multiGSEA")

(ii) Alternatively, you can install the latest development version with devtools:

install.packages("devtools")
devtools::install_github("https://github.com/yigbt/multiGSEA")

Once installed, load the multiGSEA package with:

library(multiGSEA)

Workflow

A common workflow is exemplified in the package vignette and is typically separated into the following steps:

  1. Load required libraries, including the multiGSEA package, and omics data sets.
  2. Create data structure for enrichment analysis.
  3. Download and customize the pathway definitions.
  4. Run the pathway enrichment for each omics layer.
  5. Calculate the aggregated pathway enrichment.
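The five steps above can be sketched as a single function. This is an outline under stated assumptions rather than a definitive recipe: the wrapper `run_multigsea_sketch` is hypothetical, and the function names, argument names, and example data columns (`logFC`, `pValue`, `Symbol`, `HMDB`) are written as recalled from the package vignette, so please check them against the version you have installed. Step 3 downloads pathway definitions and therefore needs network access.

```r
# Hypothetical wrapper around the workflow; nothing is downloaded or
# computed until the function is called.
run_multigsea_sketch <- function(databases = c("kegg", "reactome")) {
  # Step 1: load the example omics data sets shipped with the package
  # (assumption: they are named transcriptome, proteome, metabolome).
  utils::data(
    list = c("transcriptome", "proteome", "metabolome"),
    package = "multiGSEA", envir = environment()
  )

  # Step 2: create the data structure and add one ranked vector per layer.
  omics_data <- multiGSEA::initOmicsDataStructure(
    layer = c("transcriptome", "proteome", "metabolome")
  )
  omics_data$transcriptome <-
    multiGSEA::rankFeatures(transcriptome$logFC, transcriptome$pValue)
  names(omics_data$transcriptome) <- transcriptome$Symbol

  omics_data$proteome <-
    multiGSEA::rankFeatures(proteome$logFC, proteome$pValue)
  names(omics_data$proteome) <- proteome$Symbol

  omics_data$metabolome <-
    multiGSEA::rankFeatures(metabolome$logFC, metabolome$pValue)
  # Assumption: the example data use the old 5-digit HMDB format,
  # which is padded to the current 7-digit format here.
  names(omics_data$metabolome) <- gsub("HMDB", "HMDB00", metabolome$HMDB)

  # Step 3: download pathway definitions for the requested databases.
  pathways <- multiGSEA::getMultiOmicsFeatures(
    dbs = databases, layer = names(omics_data),
    returnTranscriptome = "SYMBOL",
    returnProteome = "SYMBOL",
    returnMetabolome = "HMDB"
  )

  # Step 4: run the pathway enrichment for each omics layer.
  enrichment_scores <- multiGSEA::multiGSEA(pathways, omics_data)

  # Step 5: collect per-layer p-values and aggregate them per pathway.
  df <- multiGSEA::extractPvalues(
    enrichmentScores = enrichment_scores,
    pathwayNames = names(pathways[[1]])
  )
  df$combined_pval <- multiGSEA::combinePvalues(df)
  df$combined_padj <- stats::p.adjust(df$combined_pval, method = "BH")

  cbind(data.frame(pathway = names(pathways[[1]])), df)
}
```

With multiGSEA installed, `result <- run_multigsea_sketch()` should return one row per pathway; sorting by `combined_padj` is a natural next step.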

For more information, please have a look at the vignette on our Bioconductor page.

LICENSE

Copyright (C) 2011 - 2020 Helmholtz Centre for Environmental Research UFZ.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the UFZ License document for more details: https://github.com/yigbt/multiGSEA/blob/master/LICENSE.md