We are trying to determine how Slurm behaves when a certain node type is unavailable in a zone for a partition with dynamic nodes. I would hope that it keeps jobs in the queue and keeps retrying to set up the nodes until they become available (similar to the newly announced Dynamic Workload Scheduler), but I would like to confirm, since it's not easy to tell from the documentation here and I have not been able to reproduce it.
Replies: 1 comment
When a stockout occurs, the output of `sacct` on the login/controller node will show that the state of the submitted job changed to `NODE_FAIL`. The job will not be requeued if it was submitted via `srun`, but it will be requeued if it was submitted via `sbatch`.

One possible error message when there's a stockout (found in `/var/log/slurm/resume.log` on the controller) could be: