[BUG] Cannot submit jobs from a GPU Node #482

Open
ziw-liu opened this issue Sep 24, 2024 · 2 comments
Labels
bug Something isn't working

Comments

ziw-liu commented Sep 24, 2024

When submitting jobs on a GPU node, the generated batch job requests GRES that is not valid:

srun: error: Unable to create step for job 16413634: Invalid generic resource (gres) specification

Submitting the same job on a login node succeeds.
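To compare the two environments, it may help to look at what the submitting shell inherits from an enclosing Slurm job. The following is a minimal Python sketch, assuming (unconfirmed) that the difference between the GPU node and the login node lies in inherited `SLURM_*` environment variables. `inherited_slurm_vars` and `inside_slurm_job` are hypothetical helper names, not part of this project.

```python
import os
from typing import Dict, Mapping, Optional


def inherited_slurm_vars(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return every SLURM_* variable present in the given environment."""
    env = os.environ if env is None else env
    return {name: value for name, value in env.items() if name.startswith("SLURM_")}


def inside_slurm_job(env: Optional[Mapping[str, str]] = None) -> bool:
    """Slurm sets SLURM_JOB_ID inside a job allocation, so its presence
    suggests the shell is running on a compute node within a job."""
    env = os.environ if env is None else env
    return "SLURM_JOB_ID" in env


if __name__ == "__main__":
    print("inside a Slurm job:", inside_slurm_job())
    for name, value in sorted(inherited_slurm_vars().items()):
        print(f"{name}={value}")
```

Running this once on the login node and once on the GPU node, then diffing the output, should show which variables only exist in the failing case.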

ziw-liu added the bug label on Sep 24, 2024

ieivanov commented Oct 8, 2024

Here is the error I got with a similar submission:

srun: fatal: SLURM_MEM_PER_CPU, SLURM_MEM_PER_GPU, and SLURM_MEM_PER_NODE are mutually exclusive.
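One possible reading of this error, offered as an assumption rather than a confirmed diagnosis: the shell on the GPU node already carries a memory variable from its own allocation (for example `SLURM_MEM_PER_CPU`), that variable is exported into the newly submitted job, and `srun` then sees it alongside the memory variable set for the new job. The Python sketch below checks for that situation and builds an environment without the three variables named in the message. `conflicting_mem_vars` and `without_mem_vars` are hypothetical helpers, not part of this project.

```python
import os
from typing import Dict, List, Mapping

# The three variables named in the srun error message.
MEM_VARS = ("SLURM_MEM_PER_CPU", "SLURM_MEM_PER_GPU", "SLURM_MEM_PER_NODE")


def conflicting_mem_vars(env: Mapping[str, str]) -> List[str]:
    """List which of the mutually exclusive memory variables are set."""
    return [name for name in MEM_VARS if name in env]


def without_mem_vars(env: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of the environment with the memory variables removed."""
    return {name: value for name, value in env.items() if name not in MEM_VARS}


if __name__ == "__main__":
    found = conflicting_mem_vars(os.environ)
    if len(found) > 1:
        print("more than one memory variable is set:", ", ".join(found))
    else:
        print("memory variables set:", found)
```

The cleaned mapping could be passed as `env=` to `subprocess.run` when launching the submission command. Whether that resolves this issue has not been verified here.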

edyoshikun commented

I haven't been able to reproduce this bug. I reserved a GPU node with:

 sbatch --job-name=nomachine --constraint=nomachine --partition=interactive --mem-per-cpu=8G --cpus-per-task=16 --gpus=1 --time=5-0:00:00 --wrap "sleep 120h" --output=$HOME/logs/sbatch.out --nodelist=gpu-sm01-02

With that reservation, I submitted jobs for deskew and reconstruction without a problem.
