Adjust device permissions #61
base: main
Note: for our modules for the time being we will stick without the |
Wow! You must have had a hard time figuring this out yesterday! I've tested the open and the proprietary driver. Both work fine with that configuration. I noticed that the open driver doesn't add any ACLs for
but had set
whereas for the proprietary driver ACLs for these devices are added + these settings of
Output for open driver:
|
@e4t Will this be an issue that adding users to the |
Eh, I did it this morning; no time yesterday. The system I tested on has a GTX 1600, so there are no open drivers to test; it supports only the proprietary modules.
I'm not sure I understand from your comment if it works or not... I guess this is due to the fact that the logic is, if you are trying to access an nvidia character file, create them with The other devices without the ACL, it's probably because the user/process logged in on the console was not trying to access them but it's just a result of the Proprietary modules are not staying forever, maybe we can clean that up when they disappear. Regarding the |
Sure, still pre-Turing hardware.
Don't worry. Your solution works with open AND proprietary driver.
Well a session was running with this user. Same scenario with open and proprietary driver, everything working fine, but different
Yes, but could be in year 2027/2028. ;-)
That's indeed true.
Of course, but it's an easy solution for the open and the proprietary driver. Behaviour is just different than before. We may need to document this somehow/somewhere.
Do you know of any? I'm not aware of such a thing. It could be that Slurm supports ACLs now - I've read in the release notes that they have improved handling of GPUs - but I don't know.
Woah, long holidays, congrats! 🥇
Just had a try by adding a second
But this breaks again the
Users NOT in the video group can still log in graphically, but |
@scaronni-nvidia Any new ideas about this? Otherwise I would say we keep it as is for now and dig into this again once @e4t is back from vacation in January.
I would say leave the merge request open and we'll update it as soon as we have something to show. Anyway, does it need Egbert's approval?
I might have time tomorrow to look at it.
I would say we don't need Egbert's input if we can keep the possibility to just add users to the |
Ok, will try.
@scaronni-nvidia Any results yet? I'll be on vacation from Tue Dec 17 2024 until Mon Jan 6 2025.
Had no time yet, sorry. Maybe in the next few days. I'll try to sort it out before the end of the week.
In relation to #57 and #52.
This pull request makes the following changes:
- … `kmp-post.sh` and `kmp-trigger.sh`.
- … `udev` rule to work for all NVIDIA modules, not just `nvidia`.
- … `uaccess` ACL on device files.
- … `video` group (a compute cluster node does not have any "video").
- … `nvidia` options `NVreg_DeviceFileUID` and `NVreg_DeviceFileGID`, the proprietary modules get the `uaccess` ACL correctly. I don't know the exact reason, but I guess it's a race condition when the device files get created.