[slurm] Add slurm plugin #3329

Merged
merged 1 commit into from
Aug 9, 2023

arif-ali
Member

@arif-ali arif-ali commented Aug 7, 2023

Slurm is a workload manager in the HPC space. This is an initial plugin for it, and there may be further additions in the future.

Resolves: #3329


Please place an 'X' inside each '[]' to confirm you adhere to our Contributor Guidelines

  • Is the commit message split over multiple lines and hard-wrapped at 72 characters?
  • Is the subject and message clear and concise?
  • Does the subject start with [plugin_name] if submitting a plugin patch or a [section_name] if part of the core sosreport code?
  • Does the commit contain a Signed-off-by: First Lastname [email protected]?
  • Are any related Issues or existing PRs properly referenced via a Closes (Issue) or Resolved (PR) line?

@packit-as-a-service

Congratulations! One of the builds has completed. 🍾

You can install the built RPMs by following these steps:

  • sudo yum install -y dnf-plugins-core on RHEL 8
  • sudo dnf install -y dnf-plugins-core on Fedora
  • dnf copr enable packit/sosreport-sos-3329
  • And now you can install the packages.

Please note that the RPMs should be used only in a testing environment.

Slurm is a workload manager in the HPC space, this is a start on this,
and there may be further additions in the future.

Signed-off-by: Arif Ali <[email protected]>
@arif-ali arif-ali marked this pull request as ready for review August 8, 2023 20:09
'sinfo --all --long',
])

if is_executable('squeue'):
Contributor

I think this test might not be needed. If a command isn't executable, sos won't store its output (https://github.com/sosreport/sos/blob/main/sos/utilities.py#L261-L270), but it will add a record to manifest.json.

That said, it is definitely fine to use the test here.

'topology',
]

if is_executable('scontrol'):
Contributor

Yeah, here the test makes a lot of sense :) (test the executable once, and save many "command not found" dead ends later).
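
The "test once, skip many" idea from this comment can be sketched as follows. This is only an illustration under stated assumptions, not the code from the PR: `is_executable` here is a stand-in built on `shutil.which` rather than the sos helper, `build_scontrol_commands` and its `checker` parameter are hypothetical names, and the `scontrol show <item>` command form is assumed from the `'topology'` entry visible in the diff.

```python
import shutil


def is_executable(command):
    """Stand-in for the sos helper (assumption): True if `command` is on PATH."""
    return shutil.which(command) is not None


def build_scontrol_commands(items, checker=is_executable):
    """Build 'scontrol show <item>' commands, or nothing if scontrol is absent.

    The executable is checked once up front, so a host without scontrol
    produces no per-item "command not found" attempts at all.
    """
    if not checker('scontrol'):
        return []
    return [f"scontrol show {item}" for item in items]


if __name__ == "__main__":
    # On a host without scontrol this prints an empty list.
    print(build_scontrol_commands(['topology']))
```

Passing the check in as a parameter is just a convenience for trying the function on a machine without Slurm installed.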

@arif-ali
Member Author

arif-ali commented Aug 8, 2023

I ran the plugin without the check, and the output file was still created, which is why I added the checks.

# more sosreport-slurm-comp01-2023-08-08-ddtiwod/sos_commands/slurm/squeue_--all_--long
timeout: failed to run command ‘squeue’: No such file or directory

I could easily split the plugin into multiple plugins, one each for slurmd, slurmctld and slurmdbd, but that seems pointless imho. It was easier to add everything in one plugin than in several.

slurmd nodes (on their own) will never have the commands, but the logs will be there. The commands are only on hosts where the server service is running; they come from the slurm package on EL hosts and from slurm-client on Ubuntu.

An alternative would have been to test for the above two packages to determine whether the commands are available.
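
A rough sketch of that package-based alternative, for comparison with the executable check. The names `CLIENT_PACKAGES` and `client_commands_available` are hypothetical, and the set of installed packages is passed in by hand here; a real plugin would obtain it from the distribution's package manager instead.

```python
# Packages assumed (from the comment above) to ship the client commands:
# "slurm" on EL hosts and "slurm-client" on Ubuntu.
CLIENT_PACKAGES = ("slurm", "slurm-client")


def client_commands_available(installed_packages):
    """Return True if any package that provides the Slurm commands is installed.

    `installed_packages` is any collection of package names; how it is
    gathered is outside the scope of this sketch.
    """
    return any(pkg in installed_packages for pkg in CLIENT_PACKAGES)


if __name__ == "__main__":
    print(client_commands_available({"slurm-client", "bash"}))
    print(client_commands_available({"slurmd"}))
```

Compared with checking each executable, this ties the decision to package names, which differ between distributions, so the list would need maintaining as packaging changes.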

@pmoravec
Contributor

pmoravec commented Aug 8, 2023

Ah, I see. Then the test makes a lot of sense in either place.

If the daemons have more to collect in common than they have differences, I would stay with one plugin.

@TurboTurtle TurboTurtle merged commit 7545f7d into sosreport:main Aug 9, 2023
30 checks passed
@arif-ali arif-ali deleted the slurm branch October 2, 2024 13:45

3 participants