GRO-seq pipeline package (Wei, Pei-Chi group; DKFZ)

Global Run-On Sequencing (GRO-seq) pipeline for analyzing the transcription activity of genes from engaged RNA polymerase.

Singularity

Pull GRO-seq image from data server

singularity pull docker://sandrejev/groseq:latest

Run container

Before running the GRO-seq pipeline you need a genome (FASTA), a bowtie index (.bt2), chromosome sizes (.chrom.sizes) and an annotation. For mm9, mm10 and hg19 these can be downloaded automatically with the download command; the only exception is the annotation *.bed file.

singularity exec -B `pwd` groseq_latest.sif download mm10
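Before launching the pipeline it can help to confirm that the reference files are actually in place. The following is a minimal sketch and not part of the GRO-seq image: `check_inputs` is a hypothetical helper, and the file names in the usage comment are taken from the example commands in this README rather than from documented output names of the download command.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the image): report every path that
# does not exist and return non-zero if at least one is missing.
check_inputs() {
    local missing=0
    local path
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "missing: $path" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (file names assumed from the commands in this README):
#   check_inputs mm10.chrom.sizes mm10.refGene.bed && echo "inputs found"
```

Missing paths are written to stderr, so the function can be used as a guard in front of the pipeline call without polluting its output.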

Run the groseq pipeline. Keep in mind that the annotation file (-a flag) is created automatically for you from geneRef.gtf.

singularity exec -B `pwd` groseq_latest.sif groseq -f AS-512172-LR-52456/fastq/AS-512172-LR-52456_R1.fastq -a mm10.refGene.bed -g ./mm10 -o "singularity1" --chromInfo mm10.chrom.sizes

You can create an annotation file with different clipping at the start or end of the transcript using the longest-transcript command:

singularity exec -B `pwd` groseq_latest.sif longest-transcript mm10.refGene.gtf.gz mm10.refGene.bed --clip-start=50
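To make the idea of clipping concrete, here is a small sketch of one plausible reading of `--clip-start`: removing a fixed number of bases from the transcription start side of each BED interval, taking the strand into account. This is an assumption for illustration only. The README does not document how longest-transcript implements clipping, and `clip_start` is a hypothetical function, not a command in the image. It expects six-column, tab-separated BED on stdin.

```shell
#!/usr/bin/env bash
# Hypothetical illustration of strand-aware start clipping on BED6 input.
# Plus strand: the transcript starts at column 2 (chromStart), so move it right.
# Minus strand: the transcript starts at column 3 (chromEnd), so move it left.
clip_start() {
    local clip="$1"
    awk -v c="$clip" 'BEGIN { FS = OFS = "\t" }
        $6 == "-" { $3 -= c }
        $6 != "-" { $2 += c }
        # Drop intervals that are no longer than the clipped region.
        $3 > $2 { print }'
}

# Example:
#   printf 'chr1\t100\t1000\tgeneA\t0\t+\n' | clip_start 50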

You can extract RPKM values using the extract-rpkm command (this is done automatically as part of the pipeline):

singularity exec -B `pwd` groseq_latest.sif extract-rpkm -a mm10.refGene.bed -o AS-512172-LR-52456_R1 --clip-start=50
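For orientation, RPKM (reads per kilobase of transcript per million mapped reads) is conventionally computed as reads × 10^9 / (feature length in bp × total mapped reads). The sketch below implements that textbook formula. Whether extract-rpkm uses exactly this normalization (for example, the clipped transcript length as the feature length) is an assumption, and `rpkm` is a hypothetical helper, not part of the image.

```shell
#!/usr/bin/env bash
# Textbook RPKM: reads * 10^9 / (feature_length_bp * total_mapped_reads).
# Prints "NA" when the length or the library size is not positive.
rpkm() {
    local reads="$1" length_bp="$2" total_reads="$3"
    awk -v r="$reads" -v l="$length_bp" -v n="$total_reads" 'BEGIN {
        if (l <= 0 || n <= 0) { print "NA"; exit }
        printf "%.3f\n", (r * 1e9) / (l * n)
    }'
}

# 500 reads on a 2000 bp transcript in a library of 10 million mapped reads
rpkm 500 2000 10000000   # prints 25.000
```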

For convenience, the GRO-seq image contains a script that can be used to run the pipeline on an LSF cluster:

# Run GRO-seq pipeline on all *.fastq files in the folder
singularity exec -B `pwd` groseq_latest.sif lsf | bsub

# Run GRO-seq pipeline on all *.fastq files in the folder that match the pattern. Additionally, prefix the results with the tag PREFIX
singularity exec -B `pwd` groseq_latest.sif lsf PREFIX --pattern "AS-512178" | bsub
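On a machine without LSF, a comparable batch run can be approximated with a plain loop. This is a sketch under assumptions: `groseq_cmd` is a hypothetical helper, the output name is simply the FASTQ basename, paths are assumed to contain no whitespace, and the reference file names are copied from the single-sample example above. It only prints the commands, so they can be reviewed before anything is executed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (not run) the pipeline command for one FASTQ file.
# The output name (-o) is assumed to be the file name without ".fastq".
groseq_cmd() {
    local fastq="$1"
    local sample
    sample="$(basename "$fastq" .fastq)"
    printf 'singularity exec -B "$PWD" groseq_latest.sif groseq -f %s -a mm10.refGene.bed -g ./mm10 -o %s --chromInfo mm10.chrom.sizes\n' \
        "$fastq" "$sample"
}

# Dry run: list the commands for every *.fastq file in the current folder.
for fastq in *.fastq; do
    [ -e "$fastq" ] || continue   # skip the literal pattern when nothing matches
    groseq_cmd "$fastq"
done
```

Once the printed commands look right, the output of the loop can be piped to `bash` to run the samples one after another.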

Inspect container

singularity shell groseq_latest.sif

Docker

Pull GRO-seq image from Docker Hub

docker pull sandrejev/groseq:latest

Run container

Before running the GRO-seq pipeline you need a genome (FASTA), a bowtie index (.bt2), chromosome sizes (.chrom.sizes) and an annotation. For mm9, mm10 and hg19 these can be downloaded automatically with the download command; the only exception is the annotation *.bed file.

docker run -v ${PWD}:/mount -u $(id -u ${USER}):$(id -g ${USER}) -it --entrypoint download sandrejev/groseq mm10

Run the groseq pipeline. Keep in mind that the annotation file (-a flag) is created automatically for you from geneRef.gtf.

docker run -v ${PWD}:/mount -u $(id -u ${USER}):$(id -g ${USER}) -it sandrejev/groseq -f AS-512172-LR-52456/fastq/AS-512172-LR-52456_R1.fastq -a mm10.refGene.bed -g ./mm10 -o AS-512172-LR-52456 --chromInfo mm10.chrom.sizes

You can create an annotation file with different clipping at the start or end of the transcript using the longest-transcript command:

docker run -v ${PWD}:/mount -u $(id -u ${USER}):$(id -g ${USER}) -it --entrypoint longest-transcript sandrejev/groseq mm10.refGene.gtf.gz mm10.refGene.bed --clip-start=50

You can extract RPKM values using the extract-rpkm command (this is done automatically as part of the pipeline):

docker run -v ${PWD}:/mount -u $(id -u ${USER}):$(id -g ${USER}) -it --entrypoint extract-rpkm sandrejev/groseq -a mm10.refGene.bed -o AS-512172-LR-52456_R1

For convenience, the GRO-seq image contains a script that can be used to run the pipeline on an LSF cluster:

# Run GRO-seq pipeline on all *.fastq files in the folder
docker run -v ${PWD}:/mount -u $(id -u ${USER}):$(id -g ${USER}) -it --entrypoint lsf sandrejev/groseq | bsub

# Run GRO-seq pipeline on all *.fastq files in the folder that match the pattern. Additionally, prefix the results with the tag PREFIX
docker run -v ${PWD}:/mount -u $(id -u ${USER}):$(id -g ${USER}) -it --entrypoint lsf sandrejev/groseq PREFIX --pattern "AS-512178" | bsub

Inspect container

docker run -v ${PWD}:/mount -u $(id -u ${USER}):$(id -g ${USER}) -it --entrypoint bash sandrejev/groseq

Building GRO-seq docker image

Download required packages and files

To build the GRO-seq image, the required libraries and files must first be downloaded. This can be done by running the following command:

python download.py dependencies

Build docker image

To build the Docker image, run:

docker build --squash --build-arg http_proxy="http://www.inet.dkfz-heidelberg.de:80" --build-arg https_proxy="http://www.inet.dkfz-heidelberg.de:80" --rm -t sandrejev/groseq:latest .

Push docker image to Docker Hub

docker login
docker push sandrejev/groseq:latest

Convert cached docker image to singularity (for local testing)

singularity pull docker-daemon:sandrejev/groseq:latest
