Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why is k2mask set to use only half the number of available cores? #858

Open
hermidalc opened this issue Jul 30, 2024 · 5 comments
Open

Why is k2mask set to use only half the number of available cores? #858

hermidalc opened this issue Jul 30, 2024 · 5 comments

Comments

@hermidalc
Copy link

kraken2/scripts/k2

Lines 485 to 487 in 4cbdc5f

argv = masking_binary + " -outfmt fasta -threads {} -r x".format(
multiprocessing.cpu_count() // 2
)

@ChillarAnand
Copy link

For kraken2 I observed that performance is best at 1/2 to 3/4 of available cores. After that performance decreases.

Not sure about if it is the same case with k2mask. Need to run verify it.

https://avilpage.com/2024/07/mastering-kraken2-performance-optimisation.html

@hermidalc
Copy link
Author

hermidalc commented Jul 30, 2024

For kraken2 I observed that performance is best at 1/2 to 3/4 of available cores. After that performance decreases.

Not sure about if it is the same case with k2mask. Need to run verify it.

https://avilpage.com/2024/07/mastering-kraken2-performance-optimisation.html

The downside to it being hardcoded like that is when you run Kraken2 in a cluster environment you don't want the script looking at all the cores available on a node because you typically request a certain number of cores to use for the job and the job could be assigned to a node with a lot more cores than what you requested (that other jobs are using).

@ChillarAnand
Copy link

I agree that it shouldn't be hardcoded. I am trying to understand why it could be hardcoded to multiprocessing.cpu_count() // 2 in the initial stage.

@ch4rr0
Copy link
Collaborator

ch4rr0 commented Aug 2, 2024

I agree that this change should be configurable via the command line. I will address this in my next commit to k2.

@ChillarAnand
Copy link

Raised PR for that #866

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants