brainbreaks/GROseq

GRO-seq pipeline package (Wei, Pei-Chi group; DKFZ)

Global Run-On Sequencing (GRO-seq) pipeline for analyzing transcription activity of genes from engaged RNA polymerase.

singularity pull docker://sandrejev/groseq:latest

Before running the GRO-seq pipeline you need a genome (FASTA), a Bowtie index (.bt2), chromosome sizes (.chrom.sizes) and an annotation. For mm9, mm10 and hg19 these can be downloaded automatically with the download command. The only exception is the annotation *.bed file.

singularity exec -B `pwd` groseq_latest.sif download mm10
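
After the download it can help to confirm that the reference files are in place before starting a long run. The following is a minimal sketch, not part of the image: `check_refs` is a hypothetical helper, and the file names it looks for (`<genome>.chrom.sizes` and `<genome>.*.bt2`) are assumptions inferred from the `--chromInfo` and `-g` arguments used in this README.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not shipped with the GRO-seq image.
# File names are assumed from the commands shown in this README.
check_refs() {
    local genome="$1" dir="${2:-.}" missing=0

    # Chromosome sizes file, passed to groseq via --chromInfo
    if [ ! -s "${dir}/${genome}.chrom.sizes" ]; then
        echo "missing: ${genome}.chrom.sizes"
        missing=1
    fi

    # Bowtie index files sharing the prefix that is passed to -g
    if ! ls "${dir}/${genome}".*.bt2 >/dev/null 2>&1; then
        echo "missing: ${genome}.*.bt2"
        missing=1
    fi

    # Exit status 0 when everything was found, 1 otherwise
    return "$missing"
}
```

Calling `check_refs mm10` in the working directory prints one line per missing file and returns a non-zero status if anything is absent.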

Run the groseq pipeline. Keep in mind that the annotation file (-a flag) is created automatically for you from the refGene GTF file (for example mm10.refGene.gtf.gz).

singularity exec -B `pwd` groseq_latest.sif groseq -f AS-512172-LR-52456/fastq/AS-512172-LR-52456_R1.fastq -a mm10.refGene.bed -g ./mm10 -o "singularity1" --chromInfo mm10.chrom.sizes

You can create an annotation file with different clipping at the start or end of the transcript using the longest-transcript command.

singularity exec -B `pwd` groseq_latest.sif longest-transcript mm10.refGene.gtf.gz mm10.refGene.bed --clip-start=50

You can extract RPKM values using the extract-rpkm command (this is done automatically as part of the pipeline).

singularity exec -B `pwd` groseq_latest.sif extract-rpkm -a mm10.refGene.bed -o AS-512172-LR-52456_R1 --clip-start=50

For convenience, the GRO-seq image contains a script that can be used to run the pipeline on an LSF cluster.

# Run GRO-seq pipeline on all *.fastq files in the folder
singularity exec -B `pwd` groseq_latest.sif lsf | bsub

# Run GRO-seq pipeline on all *.fastq files in the folder that match the pattern. Additionally, prefix the results with tag PREFIX
singularity exec -B `pwd` groseq_latest.sif lsf PREFIX --pattern "AS-512178" | bsub
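
Without an LSF scheduler, the same per-file idea can be approximated with a plain shell loop. The sketch below is not the `lsf` script from the image and makes no claim about how that script works. `print_groseq_commands` is a hypothetical helper that only prints one `groseq` command per matching FASTQ file, using the flags shown earlier in this README. Naming the output after the FASTQ basename and defaulting to mm10 are assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: prints one groseq command per FASTQ file instead of submitting to LSF.
# Output naming (basename without .fastq) and the mm10 default are assumptions.
print_groseq_commands() {
    local pattern="${1:-}" genome="${2:-mm10}" fastq sample

    for fastq in *"${pattern}"*.fastq; do
        # Skip the literal pattern when the glob matched nothing
        [ -e "$fastq" ] || continue

        sample="$(basename "$fastq" .fastq)"
        printf 'singularity exec -B "$(pwd)" groseq_latest.sif groseq -f %s -a %s.refGene.bed -g ./%s -o %s --chromInfo %s.chrom.sizes\n' \
            "$fastq" "$genome" "$genome" "$sample" "$genome"
    done
}
```

Running `print_groseq_commands AS-512178` lets you review the generated commands first. Piping the output into `bash` would then run them one after another. File names containing spaces are not handled by this sketch.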

To inspect the image, open an interactive shell inside it

singularity shell groseq_latest.sif
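
The pipeline can also be run with Docker. Pull the image first. The docker run commands below refer to the image by the short name groseq, so either tag the pulled image accordingly (docker tag sandrejev/groseq:latest groseq) or use the full name sandrejev/groseq:latest in those commands.

docker pull sandrejev/groseq:latest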

Before running the GRO-seq pipeline you need a genome (FASTA), a Bowtie index (.bt2), chromosome sizes (.chrom.sizes) and an annotation. For mm9, mm10 and hg19 these can be downloaded automatically with the download command. The only exception is the annotation *.bed file.

docker run -v ${PWD}:/mount -u $(id -u ${USER}):$(id -g ${USER}) -it --entrypoint download groseq mm10

Run the groseq pipeline. Keep in mind that the annotation file (-a flag) is created automatically for you from the refGene GTF file (for example mm10.refGene.gtf.gz).

docker run -v ${PWD}:/mount -u $(id -u ${USER}):$(id -g ${USER}) -it groseq -f AS-512172-LR-52456/fastq/AS-512172-LR-52456_R1.fastq -a mm10.refGene.bed -g ./mm10 -o AS-512172-LR-52456 --chromInfo mm10.chrom.sizes

You can create an annotation file with different clipping at the start or end of the transcript using the longest-transcript command.

docker run -v ${PWD}:/mount -u $(id -u ${USER}):$(id -g ${USER}) -it --entrypoint longest-transcript groseq mm10.refGene.gtf.gz mm10.refGene.bed --clip-start=50

You can extract RPKM values using the extract-rpkm command (this is done automatically as part of the pipeline).

docker run -v ${PWD}:/mount -u $(id -u ${USER}):$(id -g ${USER}) -it --entrypoint extract-rpkm groseq -a mm10.refGene.bed -o AS-512172-LR-52456_R1

For convenience, the GRO-seq image contains a script that can be used to run the pipeline on an LSF cluster.

# Run GRO-seq pipeline on all *.fastq files in the folder
docker run -v ${PWD}:/mount -u $(id -u ${USER}):$(id -g ${USER}) -it --entrypoint lsf groseq | bsub

# Run GRO-seq pipeline on all *.fastq files in the folder that match the pattern. Additionally, prefix the results with tag PREFIX
docker run -v ${PWD}:/mount -u $(id -u ${USER}):$(id -g ${USER}) -it --entrypoint lsf groseq PREFIX --pattern "AS-512178" | bsub

To inspect the image, open an interactive shell inside it

docker run -v ${PWD}:/mount -u $(id -u ${USER}):$(id -g ${USER}) -it --entrypoint bash groseq
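
The Docker commands above share a long common prefix, so a small shell function can cut down on typing. This is only a convenience sketch: `groseq_docker` is a hypothetical name, and the function covers only the commands that are started with `--entrypoint` in this README (download, longest-transcript, extract-rpkm, lsf, bash). It passes `$(id -u):$(id -g)` to `-u`, since Docker expects `user:group` there, and it assumes the image is available locally under the name `groseq`.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the docker run prefix used in this README.
# Assumes the image is available locally under the name "groseq".
groseq_docker() {
    # First argument selects the entrypoint; the rest is passed through unchanged
    local entrypoint="$1"
    shift

    docker run -v "${PWD}:/mount" -u "$(id -u):$(id -g)" -it \
        --entrypoint "$entrypoint" groseq "$@"
}
```

With this in place, `groseq_docker download mm10` corresponds to the download command, and `groseq_docker longest-transcript mm10.refGene.gtf.gz mm10.refGene.bed --clip-start=50` to the annotation step.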

To build the GRO-seq image, the required libraries and files must be downloaded first. This can be done by running the following command

python download.py dependencies

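To build the Docker image, push it to Docker Hub and create a Singularity image from the local Docker daemon, execute the commands below. The proxy build arguments point to the DKFZ network and are presumably only needed there.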

docker build --squash --build-arg http_proxy="http://www.inet.dkfz-heidelberg.de:80" --build-arg https_proxy="http://www.inet.dkfz-heidelberg.de:80" --rm -t sandrejev/groseq:latest .
docker login
docker push sandrejev/groseq:latest
singularity pull docker-daemon:sandrejev/groseq:latest