Flat requirements for whole genome mode are wasteful #773

Open
adamnovak opened this issue Nov 15, 2019 · 0 comments

In a whole-genome construct run, I have chromosomes 1-22, X, and Y, plus a bunch of little unplaced/unlocalized contigs and decoys.

We request a flat amount of memory, e.g. 200 GB, to compute snarls for every one of those contigs, including all the tiny ones, but I don't observe them using nearly that much. chr1 might need that much, as might a combined whole-genome graph that we ask to index, but chr21 doesn't, to say nothing of all the little unlocalized bits and decoys.

We should have some way to scale our job requirements with input file size, and/or we should do a test run with Toil's stats collection turned on and cut the limits down to something closer to what current vg actually needs.
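To make the file-size idea concrete, here is a minimal sketch of one way it could work. Everything in it is an assumption for illustration: the function names, the 20x bytes-of-memory-per-input-byte factor, and the 4 GiB floor are hypothetical and not measured from vg; only the 200 GB cap comes from the figure above. Real factors would have to come from the stats run.

```python
import os

GIB = 1024 ** 3

def scaled_memory(input_bytes, bytes_per_input_byte=20.0,
                  floor=4 * GIB, ceiling=200 * GIB):
    """Estimate a memory request (in bytes) from an input size.

    Hypothetical helper: the scale factor and floor are placeholders,
    not measured values. The result is clamped to [floor, ceiling] so
    tiny contigs get a small request and chr1 never exceeds the old
    flat limit.
    """
    estimate = int(input_bytes * bytes_per_input_byte)
    return max(floor, min(ceiling, estimate))

def memory_for_file(path, **kwargs):
    """Same estimate, taking the input size from a file on disk."""
    return scaled_memory(os.path.getsize(path), **kwargs)
```

The returned byte count could then be passed as the `memory` requirement when the per-contig job is created, instead of the flat value. A linear model with a floor is only a starting guess; if memory grows faster than linearly with graph size, the factor would need to be replaced with something fitted to observed usage.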

This currently ties up most of our lab Kubernetes cluster's capacity as reserved-but-unused memory, and it makes my run very slow.
