Find a way to get a feeling for where my jobs are in the priority queue #74

Open
lesteve opened this issue Oct 18, 2021 · 4 comments
Comments

@lesteve
Member

lesteve commented Oct 18, 2021

squeue --start can help, but its estimate is sometimes too optimistic, and sometimes it cannot give an estimate at all (N/A).

Here are some notes from a while ago that can be used as a starting point if someone wants to refine them.

https://slurm.schedmd.com/priority_multifactor.html

❯ sprio -w
          JOBID PARTITION   PRIORITY        AGE  FAIRSHARE        QOS
        Weights                           10000     100000     250000
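
For reference, the multifactor plugin described at the link above combines these weights linearly with per-job factors that each lie between 0.0 and 1.0. A rough sketch using the weights printed by sprio -w; the factor values are made up for illustration, and terms that do not appear in this sprio output (partition, job size, TRES, nice, site factor) are left out:

```python
# Weights as printed by `sprio -w` above.
WEIGHTS = {"age": 10_000, "fairshare": 100_000, "qos": 250_000}


def job_priority(factors, weights=WEIGHTS):
    """Weighted sum of normalized factors (each expected in [0.0, 1.0])."""
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"factor {name!r} out of range: {value}")
    return sum(weights[name] * value for name, value in factors.items())


# Hypothetical factor values, not taken from a real job.
example = {"age": 0.5, "fairshare": 0.25, "qos": 0.1}
print(job_priority(example))
```

With these weights the QOS term dominates: a job at the maximum age factor gains at most 10000 points, while the fairshare and QOS terms can contribute up to 100000 and 250000.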

To see all the factors that enter the priority:

sprio -l -p gpu_p13

Show the priority factor for each QoS:

sacctmgr show qos format=name,priority

From the squeue man page:

%p Priority of the job (converted to a floating point number between 0.0 and 1.0)
%Q Priority of the job (generally a very large unsigned integer).
squeue -p gpu_p13 -o "%i;%P;%j;%u;%T;%M;%l;%D;%R;%p;%Q;%q;%f" --sort p

The idea would be to start with this and plot histograms of the priority of running jobs.
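
A minimal sketch of that idea, assuming the output of the squeue command above has been captured as text. The names in FIELDS are labels chosen here to mirror the format string, the bin count is arbitrary, and the sample lines are entirely made up:

```python
from collections import Counter

# Labels chosen here for the fields of the format string
# "%i;%P;%j;%u;%T;%M;%l;%D;%R;%p;%Q;%q;%f" (not names defined by Slurm).
FIELDS = ["jobid", "partition", "name", "user", "state", "time", "time_limit",
          "nodes", "reason", "priority_norm", "priority", "qos", "features"]


def parse_squeue(text):
    """Turn semicolon-separated squeue lines into a list of dicts."""
    jobs = []
    for line in text.strip().splitlines():
        values = [v.strip() for v in line.split(";")]
        # Skips the header line and malformed lines (for example a job
        # name that itself contains a semicolon).
        if len(values) != len(FIELDS) or not values[10].isdigit():
            continue
        job = dict(zip(FIELDS, values))
        job["priority"] = int(job["priority"])  # the %Q column
        jobs.append(job)
    return jobs


def priority_histogram(jobs, state="RUNNING", n_bins=5):
    """Count jobs in `state` per equal-width priority bin."""
    priorities = [job["priority"] for job in jobs if job["state"] == state]
    if not priorities:
        return []
    low, high = min(priorities), max(priorities)
    width = max(1, -(-(high - low + 1) // n_bins))  # ceiling division
    counts = Counter((p - low) // width for p in priorities)
    return [(low + i * width, low + (i + 1) * width - 1, counts.get(i, 0))
            for i in range(n_bins)]


# Made-up sample output, for illustration only.
SAMPLE = """\
JOBID;PARTITION;NAME;USER;STATE;TIME;TIME_LIMIT;NODES;NODELIST(REASON);PRIORITY;PRIORITY;QOS;FEATURES
101;gpu_p13;train;alice;RUNNING;1:00:00;20:00:00;1;node01;0.00002;120000;qos_a;32GB
102;gpu_p13;eval;bob;RUNNING;0:10:00;2:00:00;1;node02;0.00003;180000;qos_b;16GB
103;gpu_p13;sweep;carol;PENDING;0:00;20:00:00;4;(Priority);0.00001;90000;qos_a;(null)
"""

for low, high, count in priority_histogram(parse_squeue(SAMPLE), n_bins=3):
    print(f"{low:>8}-{high:<8} {'#' * count}")
```

Calling priority_histogram with state="PENDING" gives the same view for the queue, which makes it possible to compare the two distributions side by side.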

Not really related but I found:

%f (for constraints, called features in Slurm jargon, e.g. 16GB or 32GB nodes)

@RemiLacroix-IDRIS
Contributor

I doubt you can get a better estimate than what --start gives you. Slurm makes the best estimate it can, but new jobs are continually being added, and the maximum duration a job requests can be much longer than its real duration, so it is pretty difficult to get it right.

@lesteve
Member Author

lesteve commented Oct 18, 2021

I guess my original idea when I looked at it a while ago (it was discussed during at least one Users Committee) was to get a better intuition about priority. Things like these could be useful:

  • this job is among the 10% lowest-priority jobs
  • this job is currently ranked 5th among the pending jobs
  • compare your pending job's priority to the priority of the jobs that are running, to see how big the gap is
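
These three numbers could be computed along the following lines once priorities are available, for instance from the %Q column of squeue. This is only a sketch: the function name and the sample priorities are invented here, and tied priorities are ignored when ranking.

```python
def priority_summary(job_priority, pending, running):
    """Summarise where one pending job stands.

    pending: priorities of all pending jobs (including this one)
    running: priorities of the currently running jobs
    """
    higher = sum(1 for p in pending if p > job_priority)
    lower = sum(1 for p in pending if p < job_priority)
    return {
        # 1 means the highest priority among pending jobs
        "rank": higher + 1,
        # share of pending jobs with a strictly lower priority
        "percent_below": 100.0 * lower / len(pending),
        # distance to the lowest-priority running job (None if nothing runs)
        "gap_to_running": (min(running) - job_priority) if running else None,
    }


# Made-up priorities, for illustration only.
pending = [90_000, 95_000, 110_000, 130_000, 150_000]
running = [120_000, 180_000, 200_000]
print(priority_summary(95_000, pending, running))
```

The gap should be read with care: backfill scheduling can start a small job ahead of higher-priority ones, so a running job may well have a lower priority than a pending one.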

I would also think that using historical data could potentially lead to a better estimate than what squeue --start gives, but this is not something I want to look at for now 😉

@RemiLacroix-IDRIS
Contributor

We have a command in beta testing that shows the rank of your jobs, and another that tries to show how your fairshare compares to that of others. I guess that could fit your needs, at least partially.

@lesteve
Member Author

lesteve commented Oct 19, 2021

Good to know, let us know when the command can be tested by normal users!
