
Shared tools on Imperial HPC

Brian M. Schilder edited this page Aug 12, 2021 · 2 revisions

Many tools that our lab uses are not installed on Imperial HPC. Installing tools yourself on HPC can be tricky because you don't have root permissions. However, there are some ways around this:

  1. Install local executable files on HPC (covered on this page).
  2. Create conda environments on HPC (covered here).
  3. Request that it be installed via ASK. Do note that this can take some time depending on how busy the HPC team is.

The Tools folder (/rds/general/project/neurogenomics-lab/live/Tools) contains executables of software that can be used by the entire neurogenomics-lab group. Each tool's section on this page includes the following:

  • Description: Tool description.
  • Download steps: How the tool was downloaded and installed on HPC. The header includes a hyperlink to the official instructions.
  • Usage: Example usage. Any export commands can be pasted into your ~/.bashrc so they are automatically available when you next log into HPC. Alternatively, you could add the export commands to specific scripts.
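If you paste export commands into ~/.bashrc more than once, the same line piles up and PATH grows on every login. A minimal sketch of one way to avoid that is below. `append_once` is a hypothetical helper name, not something installed on HPC.

```shell
# append_once LINE FILE: add LINE to FILE unless an identical line already exists.
# (Hypothetical helper for illustration only.)
append_once() {
    touch "$2"
    # -F: fixed string, -x: match the whole line, -q: no output
    grep -qxF -- "$1" "$2" || printf '%s\n' "$1" >> "$2"
}

# Example (single quotes keep $PATH unexpanded until login):
# append_once 'export PATH=/rds/general/project/neurogenomics-lab/live/Tools/cellranger-6.0.2:$PATH' ~/.bashrc
```

Running the example twice leaves only one copy of the line in the file.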

cellranger-6.0.2

Description

CellRanger is a toolkit for the pre-processing of (single-cell) RNA-seq data. It contains a number of tools for creating fastq files, gene count matrices, and more.

  1. Download: curl -o cellranger-6.0.2.tar.gz "https://cf.10xgenomics.com/releases/cell-exp/cellranger-6.0.2.tar.gz?Expires=1628114300&Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cHM6Ly9jZi4xMHhnZW5vbWljcy5jb20vcmVsZWFzZXMvY2VsbC1leHAvY2VsbHJhbmdlci02LjAuMi50YXIuZ3oiLCJDb25kaXRpb24iOnsiRGF0ZUxlc3NUaGFuIjp7IkFXUzpFcG9jaFRpbWUiOjE2MjgxMTQzMDB9fX1dfQ__&Signature=ksNjxn8WMwd5Q4eirSfxGXxoC6X41-RvV2XliuPR5iu1v9ftDs4f967z9W3krCUdDJxCpwAr5YGw4WOr-XZsHPc5h5eV7X5Zt7aXDEffnOUmIARhYdLn3utC1lm9bHsuzjwJVyxH3TjjsxgPa8PY7E5TXitxVSPZXvUJrJLTFHqi5d1xRKQJaVIKvMnuyN0OAJhZTxqQRwaSvjE-H7-U-y2Q1WReYFjFcYUgAwz-jkGqCbBb~e4D29-sYJyjoKCCbLLiZde3D85v1JVGy4zzsCVqkpiXYsV90uS-IPsoNgI1UaJ4nQ9LluZ4-nY0ihchjIqkRKvkexb9YHXG3kvETg__&Key-Pair-Id=APKAI7S6A5RYOXBWRPDA"
  2. Decompress (WARNING: takes a long time, ~1 hour):
    tar -xzvf cellranger-6.0.2.tar.gz
  3. Set permissions: chmod -R u=rwx,go=rx cellranger-6.0.2/
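Note that the download URL in step 1 is a signed link: it carries an `Expires=` parameter (a Unix epoch timestamp), so it stops working after that time and a fresh link has to be copied from the 10x Genomics download page. The sketch below shows one way to check this before starting a download. `url_expired` is a hypothetical helper name, and it assumes a `date` that supports `+%s` (as on Linux).

```shell
# url_expired URL: succeed (exit 0) if the URL has an Expires=<epoch> parameter
# that is already in the past. (Hypothetical helper for illustration only.)
url_expired() {
    expires=$(printf '%s\n' "$1" | sed -n 's/.*[?&]Expires=\([0-9][0-9]*\).*/\1/p')
    [ -n "$expires" ] && [ "$expires" -lt "$(date +%s)" ]
}

# Example:
# url_expired "<signed_download_url>" && echo "Link has expired; request a new one."
```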

Usage

export PATH=/rds/general/project/neurogenomics-lab/live/Tools/cellranger-6.0.2:$PATH

cellranger -h
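If `cellranger -h` fails or runs an unexpected version, it can help to see which executable the shell is actually resolving. A small sketch follows. `check_tool` is a hypothetical helper name.

```shell
# check_tool NAME: report which executable NAME resolves to on the current PATH.
# (Hypothetical helper for illustration only.)
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        printf '%s -> %s\n' "$1" "$(command -v "$1")"
    else
        printf '%s not found on PATH\n' "$1" >&2
        return 1
    fi
}

# Example:
# check_tool cellranger
```

The same check works for any of the other tools on this page.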

cellranger-arc-2.0.0

Description

CellRanger ARC is an extension of CellRanger that specialises in the pre-processing of Chromium Single Cell Multiome ATAC + Gene Expression sequencing data. However, CellRanger ARC also includes a number of post-processing (secondary) analysis pipelines that can, for example, generate PCA/t-SNE/UMAP projections and cell clusters as well as feature linkages.

Usage

export PATH=/rds/general/project/neurogenomics-lab/live/Tools/cellranger-arc-2.0.0:$PATH

cellranger-arc -h

cellranger-atac-1.2.0

Description

CellRanger ATAC is an extension of CellRanger that specialises in the pre-processing of Chromium Single Cell ATAC data.

Usage

export PATH=/rds/general/project/neurogenomics-lab/live/Tools/cellranger-atac-1.2.0/cellranger-atac-1.2.0:$PATH

cellranger-atac -h
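Because the install directory differs between tools (note the nested folder in the cellranger-atac path above), a mistyped path is easy to miss: `export PATH=...` succeeds silently even if the directory does not exist. A hedged sketch of a safer alternative is below. `prepend_path` is a hypothetical helper name.

```shell
# prepend_path DIR: put DIR at the front of PATH if it exists and is not already listed.
# (Hypothetical helper for illustration only.)
prepend_path() {
    [ -d "$1" ] || { printf 'no such directory: %s\n' "$1" >&2; return 1; }
    case ":$PATH:" in
        *":$1:"*) ;;                      # already present, leave PATH unchanged
        *) PATH="$1:$PATH"; export PATH ;;
    esac
}

# Example:
# prepend_path /rds/general/project/neurogenomics-lab/live/Tools/cellranger-atac-1.2.0/cellranger-atac-1.2.0
```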

jdk-11.0.12

Description

Updated JAVA executables. The default JAVA version on HPC is out of date and can cause conflicts with other software like Nextflow, so you need to use this local version instead.

Download the Java SE Development Kit (JDK) from the Oracle website. Annoyingly, they now require you to log into an Oracle account in order to download their software. This means you can't simply wget/curl this software directly from HPC.

Instead, you must download via a web browser on your local computer, and then copy the file over to HPC by dragging it into your mounted folder (if you have that set up), or:

scp <local_file_path> <hpc_username>@wmcr-nskene.med.ic.ac.uk:/rds/general/project/neurogenomics-lab/live/Tools

Once the compressed file is on HPC, decompress it:

tar -xzvf jdk-11.0.12_linux-x64_bin.tar.gz

Usage

When using other software like Nextflow, you must first load the paths to this updated version of JAVA to override the HPC defaults.

export PATH=/rds/general/project/neurogenomics-lab/live/Tools/jdk-11.0.12/bin:$PATH
export JAVA_HOME=/rds/general/project/neurogenomics-lab/live/Tools/jdk-11.0.12
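After running the two export commands, it is worth confirming that the `java` found on PATH is the one under JAVA_HOME rather than the HPC default. The following is a minimal sketch. `java_matches_home` is a hypothetical helper name.

```shell
# java_matches_home: succeed if the first `java` on PATH is $JAVA_HOME/bin/java.
# (Hypothetical helper for illustration only.)
java_matches_home() {
    [ -n "$JAVA_HOME" ] && [ "$(command -v java)" = "$JAVA_HOME/bin/java" ]
}

# Example:
# java_matches_home && java -version
```

If the check fails, another Java installation is probably earlier on PATH than the jdk-11.0.12 bin folder.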

nextflow-21.04.3.5560

Description

Nextflow is workflow execution software that allows you to write robust and reproducible pipelines (at least in theory). However, HPC is not set up properly to use Nextflow as-is. Therefore a lot of work is needed to get Nextflow to work on HPC. I've tried to document as many of these steps as possible.

I've downloaded the latest version of Nextflow (v21.04.3.5560 as of Aug 4th, 2021) and provided it here.

Nextflow requires JAVA version 8 or greater. Therefore, you will also need to follow the steps in the jdk-11.0.12 section.

The following command downloads nextflow to your current working directory:

curl -s https://get.nextflow.io | bash

Usage

export PATH=/rds/general/project/neurogenomics-lab/live/Tools/nextflow-21.04.3.5560:$PATH 

nextflow run -h

nextflow (conda)

Alternatively, you can install Nextflow via a conda environment. This can be useful when trying to make sure all other software is compatible with a particular version of Nextflow.

  1. Install miniconda/anaconda if you haven't done so already.
  2. Create a conda env using the following yaml file (stored remotely).
# Only need to create the conda env once. 
conda env create -f https://github.com/bschilder/scKirby/raw/main/inst/conda/nfcore.yml

Usage

# Extra steps are required only when on HPC
module load anaconda3/personal
bash

# Activate the conda env
conda activate nfcore

# May need to run this extra export step if you have other versions of Java installed on your machine that are overriding your conda-installed version. 
# export JAVA_HOME=/opt/anaconda3/envs/nfcore 

# You will now be using the version of nextflow that is inside your conda env. 
nextflow
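In scripts, it can be useful to confirm that the expected conda env is active before calling nextflow. conda sets the `CONDA_DEFAULT_ENV` variable to the name of the active environment, so a check can be sketched as follows. `in_conda_env` is a hypothetical helper name.

```shell
# in_conda_env NAME: succeed if the currently active conda env is NAME.
# (Hypothetical helper for illustration only.)
in_conda_env() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

# Example:
# in_conda_env nfcore || echo "Run 'conda activate nfcore' first." >&2
```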

magma_v1.08a

Description

MAGMA is a tool for gene analysis and generalized gene-set analysis of GWAS data. It can be used to analyse both raw genotype data as well as summary SNP p-values from a previous GWAS or meta-analysis.

Download the "Linux (Debian, 64 bits)" version of magma from their website, or simply run:

wget https://ctg.cncr.nl/software/MAGMA/prog/magma_v1.09a.zip
unzip magma_v1.09a.zip

The magma executable will now be in your current working directory, and you can move it wherever you like.

The MAGMA documentation website also includes auxiliary files, such as reference genomes, which you can use.

Usage

export PATH=/rds/general/project/neurogenomics-lab/live/Tools/magma_v1.08a:$PATH 

magma

define1.0

?

Ask Nathan for documentation.
