Problem methylation_calling latency #14

Open
alberto-rodriguezizquierdo opened this issue Feb 4, 2022 · 3 comments

@alberto-rodriguezizquierdo

Hi,

My name is Alberto Rodriguez. I'm currently using your package to analyze data from an epiGBS2 experiment, following your protocol published on bioRxiv.

I have the following message from the log:

MissingOutputException in line 136 of /home/arodriguez/epiGBS2/src/rules/denovo.rules:
Job Missing files after 1000 seconds:
/home/arodriguez/epiGBS2/output/methylation_calling/CHH_OT_RAYADA_MELONERA_3_trimmed_filt_merged.1_bismark_bt2_pe.txt
/home/arodriguez/epiGBS2/output/methylation_calling/CHG_OT_RAYADA_MELONERA_3_trimmed_filt_merged.1_bismark_bt2_pe.txt
/home/arodriguez/epiGBS2/output/methylation_calling/CpG_OT_RAYADA_MELONERA_3_trimmed_filt_merged.1_bismark_bt2_pe.txt
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Job id: 847 completed successfully, but some output files are missing. 847
File "/home/arodriguez/miniconda3/envs/snake/lib/python3.10/site-packages/snakemake/executors/__init__.py", line 583, in handle_job_success
File "/home/arodriguez/miniconda3/envs/snake/lib/python3.10/site-packages/snakemake/executors/__init__.py", line 252, in handle_job_success
Removing output files of failed job methylation_calling_denovo_bismark since they might be corrupted:
/home/arodriguez/epiGBS2/output/methylation_calling/RAYADA_MELONERA_3_trimmed_filt_merged.1_bismark_bt2_pe.CX_report.txt, /home/arodriguez/epiGBS2/output/methylation_calling/RAYADA_MELONERA_3_trimmed_filt_merged.1_bismark_bt2_pe.bismark.cov.gz, /home/arodriguez/epiGBS2/output/methylation_calling/CHH_OB_RAYADA_MELONERA_3_trimmed_filt_merged.1_bismark_bt2_pe.txt, /home/arodriguez/epiGBS2/output/methylation_calling/CHG_OB_RAYADA_MELONERA_3_trimmed_filt_merged.1_bismark_bt2_pe.txt, /home/arodriguez/epiGBS2/output/methylation_calling/CpG_OB_RAYADA_MELONERA_3_trimmed_filt_merged.1_bismark_bt2_pe.txt
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /home/arodriguez/epiGBS2/.snakemake/log/2022-02-04T110740.662534.snakemake.log

I tried increasing --latency-wait from 30 to 1000 s, but I get the same message. Could you suggest how to solve this problem?

Thanks a lot!

Alberto.

@MaartenPostuma
Collaborator

Hi Alberto,
This error indicates that Snakemake cannot find files that should have been generated during a step in the pipeline. In this case, the methylation_calling step fails to produce the CHH_OT, CHG_OT, and CpG_OT files.
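
To see exactly which of the expected context files are missing, a small shell check along these lines might help. It is only a sketch: check_outputs is a hypothetical helper, not part of epiGBS2 or Bismark, and it assumes the file naming shown in the log above (context, strand, then the sample name).

```shell
# Hypothetical helper (not part of epiGBS2 or Bismark): list the
# context-specific files that are missing or empty for one sample.
# Assumes the naming seen in the log: <context>_<strand>_<sample>.txt
check_outputs() {
    dir=$1      # output directory, e.g. .../output/methylation_calling
    sample=$2   # sample name without context/strand prefix and ".txt"
    for ctx in CpG CHG CHH; do
        for strand in OT OB; do
            path="$dir/${ctx}_${strand}_${sample}.txt"
            # -s is true only if the file exists and is non-empty
            [ -s "$path" ] || echo "missing or empty: $path"
        done
    done
}

# Example call, using the paths from the log:
# check_outputs /home/arodriguez/epiGBS2/output/methylation_calling \
#     RAYADA_MELONERA_3_trimmed_filt_merged.1_bismark_bt2_pe
```

If this prints nothing, all six files are present and non-empty.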

These files are quite large, so my first guess would be to check whether your system has enough disk space.
If there is enough disk space, I would recommend running the command outside the pipeline to see if you can get a more informative error message (see code below).

conda env create -f src/env/bismark.yaml -n bismark
conda activate bismark
bismark_methylation_extractor -p --CX --no_overlap --report --bedGraph --scaffolds --cytosine_report --genome_folder /home/arodriguez/epiGBS2/output/output_denovo/NNNNref/ /home/arodriguez/epiGBS2/output/alignment/RAYADA_MELONERA_3_trimmed_filt_merged.1_bismark_bt2_pe.bam -o /home/arodriguez/epiGBS2/output/methylation_calling/
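
For the disk space check, something like the following could be used. This is a minimal sketch: free_gib is a hypothetical helper name, and how much free space is actually needed depends on the size of your BAM files, so no threshold is suggested here.

```shell
# Hypothetical helper: print the free space, in whole GiB, on the
# filesystem that holds the given directory.
free_gib() {
    # df -P gives one POSIX-format line per filesystem, -k reports
    # 1024-byte blocks; column 4 of the data line is "Available".
    df -Pk "$1" | awk 'NR==2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# Example call, using the output directory from the log:
# free_gib /home/arodriguez/epiGBS2/output/methylation_calling/
```

Since the extractor also writes temporary files while it runs, it may be worth checking the free space again during the run rather than only before it.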

@alberto-rodriguezizquierdo
Author

Hi Maarten,

Thank you for your reply! I've tried and it works!

Kind regards,

Alberto.

@MWSchmid

MWSchmid commented Mar 3, 2022

Hi Maarten

Btw, the Bismark methylation extractor is very slow with large files and generates a lot of temporary files while sorting. This would be something to improve, perhaps with MethylDackel:

MethylDackel extract --CHG --CHH -@ 2 refGenome.fasta sampleX.sorted.bam

Best,

Marc
