Slurm handling stockout in zone #2463

Answered by rohitramu
casassg asked this question in Q&A

When a stockout occurs, running sacct on the login/controller node shows that the state of the submitted job has changed to NODE_FAIL. A job submitted via srun will not be requeued, but a job submitted via sbatch will be.
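
As a rough illustration of checking for this state from a script, here is a minimal Python sketch. It assumes sacct is invoked with `--parsable2 --noheader --format=JobID,State`, which produces pipe-delimited lines. The function names `node_fail_jobs` and `query_sacct` are hypothetical helpers, not part of Slurm.

```python
import subprocess
from typing import List


def node_fail_jobs(sacct_output: str) -> List[str]:
    """Return job IDs whose State column starts with NODE_FAIL.

    Expects pipe-delimited text in the form "JobID|State", one job per line
    (an assumption about how sacct was invoked, see query_sacct below).
    """
    failed = []
    for line in sacct_output.splitlines():
        fields = line.strip().split("|")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        job_id, state = fields[0], fields[1]
        if state.startswith("NODE_FAIL"):
            failed.append(job_id)
    return failed


def query_sacct(job_id: str) -> str:
    """Run sacct for one job and return its raw output (requires Slurm)."""
    result = subprocess.run(
        ["sacct", "-j", job_id, "--format=JobID,State",
         "--parsable2", "--noheader"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On the login/controller node, `node_fail_jobs(query_sacct("<job id>"))` would return a non-empty list if the job hit NODE_FAIL. Whether a NODE_FAIL was caused by a stockout still has to be confirmed from the logs.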

One error message that may appear during a stockout (found in /var/log/slurm/resume.log on the controller) is:

bulkInsert operation errors: VM_MIN_COUNT_NOT_REACHED
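
To look for this message without reading the log by hand, something like the following Python sketch could be used. It only assumes the log path and error string quoted above. The helper name `stockout_lines` is hypothetical, and other stockout-related messages may exist that this does not match.

```python
from pathlib import Path
from typing import List

# Error string quoted in the answer above; other messages may also occur.
STOCKOUT_MARKER = "VM_MIN_COUNT_NOT_REACHED"

# Log location mentioned in the answer above (on the controller node).
RESUME_LOG = Path("/var/log/slurm/resume.log")


def stockout_lines(log_text: str) -> List[str]:
    """Return every log line that contains the stockout marker."""
    return [line for line in log_text.splitlines() if STOCKOUT_MARKER in line]


if __name__ == "__main__":
    # Reading the log may require elevated permissions on the controller.
    for line in stockout_lines(RESUME_LOG.read_text(errors="replace")):
        print(line)
```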

Answer selected by nick-stroud