Standard preprocessing and processing scRNA-seq pipelines, ready for server use.

umr1085-irset/scrnaseq_standard_pipes

Preprocessing and processing scRNA-seq pipelines

This repository contains two pipelines and their respective launch scripts. Please read the "Use launch scripts" section before running them.

Preprocessing pipeline

  • Input: Seurat object (.rds file), with sample information stored under orig.ident in the metadata
  • Steps:
    • Load input Seurat file
    • Compute ribosomal and mitochondrial read percentages
    • Remove cells with fewer than 200 features
    • Remove genes with fewer than 10 expressing cells
    • Find cell outliers using Scater's runColDataPCA(), with ribosomal and mitochondrial read percentages, number of total reads and number of expressed genes as variables. Results stored under scateroutlier. PCA coordinates stored under scateroutlierPC1 and scateroutlierPC2
    • Find doublet cells using DoubletFinder. This is performed on individual samples. Results stored under DoubletFinder
    • Apply Seurat's SCTransform() and CellCycleScoring(). Rerun SCTransform() and regress on mitochondrial read percentage, S and G2M cell cycle scores
    • Compute dimensionality reductions (PCA, UMAP)
    • Find clusters, stored under seurat_clusters
    • Run Seurat's FindAllMarkers() using the top 5000 most variable features
    • Save Seurat object

Processing pipeline

  • Input: Preprocessed Seurat object (.rds file)
  • Steps:
    • Load input preprocessed Seurat object
    • Remove outliers and doublets
    • Clean the object by stripping earlier results (UMAP, PCA, SCT assay, several metadata columns)
    • Remove genes with fewer than 10 expressing cells
    • Run SCTransform() and regress on mitochondrial read percentage, S and G2M cell cycle scores
    • Compute dimensionality reductions (PCA, UMAP)
    • Find clusters
    • Run Seurat's FindAllMarkers() using the top 5000 most variable features
    • Save Seurat object

Use launch scripts

The launch scripts are meant to be submitted with Slurm's sbatch command on a server. Each script has Slurm options that can be modified:

#SBATCH --job-name="pre_pipe"
#SBATCH --output=pre_pipe.out
#SBATCH --mem=500G
#SBATCH --cpus-per-task=16
#SBATCH --partition=sihp
#SBATCH --mail-user=<user-email>
#SBATCH --mail-type=ALL
#SBATCH --chdir=<output-dir>
  • job-name: Name of Slurm job
  • output: Name of log file
  • mem: Assigned memory (must end with a G character)
  • cpus-per-task: Number of assigned CPUs
  • partition: cluster partition
  • mail-user: user email for notification purposes
  • mail-type: type of notifications (leave it set to ALL)
  • chdir: Output directory where files will be saved
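
Because the mem value must end with a G character, it can help to check a launcher before submitting it. The following is a minimal sketch and not part of the repository: get_sbatch_option and is_valid_mem are hypothetical helper names, and the check assumes the value is a whole number of gigabytes such as 500G.

```shell
#!/bin/bash

# Hypothetical helper: print the value of the first
# "#SBATCH --<name>=<value>" line found in a launch script.
#   $1 = path to the launch script
#   $2 = option name without dashes, e.g. "mem" or "cpus-per-task"
get_sbatch_option() {
  sed -n "s/^#SBATCH[[:space:]]*--$2=//p" "$1" | head -n 1
}

# Hypothetical helper: succeed only if the value is a positive integer
# followed by a single G (assumed format, e.g. 500G).
is_valid_mem() {
  printf '%s\n' "$1" | grep -Eq '^[1-9][0-9]*G$'
}

# Example use before submitting (the launcher name is taken from the
# launch command in this README):
#   is_valid_mem "$(get_sbatch_option pre_pipe_launcher.sh mem)" \
#     || echo "mem must look like 500G" >&2
```

The helpers only read the launcher file; they do not call Slurm, so they can be run on any machine.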

Users have to reference the pipeline input and output files directly in the launch scripts. The pipeline scripts should NOT be modified in that regard.
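
Since the input file is set by hand in the launcher, a short pre-flight check can catch a wrong path before the job waits in the queue. This sketch is illustrative only: check_rds_input and the INPUT_RDS variable are hypothetical names, not taken from the repository's launch scripts, and the .rds extension test simply reflects the input type named in the pipeline descriptions.

```shell
#!/bin/bash

# Hypothetical pre-flight check for the pipeline input ($1 = file path).
# Returns 0 if the file looks usable, 1 if the extension is not .rds,
# and 2 if the file is missing, unreadable or empty.
check_rds_input() {
  case "$1" in
    *.rds|*.RDS) ;;
    *) echo "Input must be an .rds file: $1" >&2; return 1 ;;
  esac
  if [ ! -r "$1" ] || [ ! -s "$1" ]; then
    echo "Input file is missing, unreadable or empty: $1" >&2
    return 2
  fi
  return 0
}

# Example use inside a launcher (INPUT_RDS is an assumed variable name):
#   check_rds_input "$INPUT_RDS" || exit 1
```

The check does not open the file in R, so it cannot confirm that the file holds a valid Seurat object; it only rules out the most common path mistakes.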

To launch a script:

sbatch pre_pipe_launcher.sh
