Allow customization of Parallelstore mounts #3144
base: develop
In this case, if you
In the current behavior, we don't use a systemd unit to mount the instance. We mount it manually using bash and
I strongly recommend
My idea was to run The only thing to do is to instead call directly the
Always using systemd would simplify the code, as there would be only one way to mount the filesystem, whether this is the first startup of the instance or a later one. The only drawback is that I did not find a way to fail I can add a check after the call, with some sleeps, to check whether either the service started successfully or the mountpoint is mounted, and fail the startup script if those checks fail. WDYT?
/gcbrun
You may need to rebase this after #3256 (that one was high priority). I have discussed this with Ivan and he will follow up on it, as I am OOO for 3 weeks starting today. Btw, we agreed on moving forward with adding customization for mounts. Thanks for this PR.
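The check proposed in the comment above could be sketched roughly as follows. This is only an illustration, not code from the PR: `wait_until` and `mount_is_ready` are hypothetical helper names, and the unit name and mount path in the usage comment are placeholders. It assumes `systemctl is-active --quiet` and `mountpoint -q` are available on the instance.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a check command up to $1 times, sleeping $2 seconds
# between attempts. Returns 0 as soon as the check succeeds, 1 otherwise.
wait_until() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Hypothetical check: succeeds if either the unit is active or the path is mounted.
mount_is_ready() {
  local unit="$1" mount_point="$2"
  systemctl is-active --quiet "$unit" || mountpoint -q "$mount_point"
}

# Possible use in a startup script (unit name and path are placeholders):
#   systemctl start example-mount.service
#   wait_until 10 3 mount_is_ready example-mount.service /mnt/example || exit 1
```

Keeping the retry loop separate from the readiness check means the same loop could be reused whether the mount is verified through the unit state, the mountpoint, or both.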
0d3506a to 87f6475
Thanks for letting me know. For now, I have rebased this PR on top of #3256; once that is in, I can skip a few commits from here. I aligned now
@@ -31,7 +31,7 @@ stdlib::runner() {
   stdlib::info "=== start executing runner: $object ==="
   case "$1" in
   ansible-local) stdlib::run_playbook "$destpath/$filename" "$args";;
-  shell) chmod u+x /$destpath/$filename && ./$destpath/$filename $args;;
+  shell) chmod u+x /$destpath/$filename && $destpath/$filename $args;;
This change is to fix failures like this:
startup-script: Mon Dec 16 08:44:43 -0500 2024 Info [4589]: === start executing runner: early_run_hotfixes.sh-b383 ===
startup-script: /slurm/custom_scripts/controller.d/ghpc_daos_mount.sh: line 373: .//tmp/tmp.g1yoDyA6T5/early_run_hotfixes.sh: No such file or directory
startup-script: Mon Dec 16 08:44:43 -0500 2024 Info [4589]: === early_run_hotfixes.sh-b383 finished with exit_code=127 ===
startup-script: Mon Dec 16 08:44:43 -0500 2024 Error [4589]: === execution of early_run_hotfixes.sh-b383 failed, exiting ===
If re-running the script, using for example: DEBUG=1 google_metadata_script_runner startup
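The failure in the log can be reproduced in isolation. The sketch below is an illustration under assumptions, not part of the PR: it uses throwaway temporary directories and a hypothetical `demo.sh` in place of the real runner files. Because `$destpath` is an absolute path, prefixing it with `./` makes the shell resolve it relative to the current directory, so it only works when the current directory happens to be `/`.

```shell
#!/usr/bin/env bash
# Stand-ins for the runner's variables (hypothetical values).
destpath="$(mktemp -d)"   # absolute path, like /tmp/tmp.XXXXXXXXXX
workdir="$(mktemp -d)"    # any working directory other than /
filename="demo.sh"

printf '#!/bin/sh\necho ok\n' > "$destpath/$filename"
chmod u+x "$destpath/$filename"

cd "$workdir"

# Old form: "./" + absolute path is looked up under the current directory.
old_rc=0
"./$destpath/$filename" 2>/dev/null || old_rc=$?

# New form: the absolute path is used directly.
new_out="$("$destpath/$filename")"

echo "old form exit code: $old_rc"
echo "new form output: $new_out"
```

The old form exits with 127, the same exit code reported in the log above, while the new form runs the script regardless of the working directory.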
Add two customization options:
- /etc/daos/daos_agent.yml
- dfuse process

Topics for discussion:
- /etc/daos/daos_agent.yml: currently everything is commented out anyway, and we just uncomment specific lines with sed. The other approach is either to create a new file (as implemented), or to make sure that everything is commented out and add the configuration at the end. I think this approach is cleaner in terms of the expected result / file content.
- systemctl start command if dfuse failed to start and this is current behavior
- Restart=always to systemd unit. In my testing, one first needs to run umount $mount_point, but if this is indeed necessary, it could be wrapped in a script together with $mount_command, which would make sure that even if dfuse crashes, it will restart.

TODO
- modules/file-system/pre-existing-network-storage
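To make the two approaches to /etc/daos/daos_agent.yml from the discussion topics concrete, here is a minimal sketch that runs against a temporary file rather than the real config. The keys (`example_key_a` and so on) are made-up placeholders, not actual daos_agent.yml settings.

```shell
#!/usr/bin/env bash
# A fully commented-out template, standing in for the shipped config file.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
# example_key_a: 1
# example_key_b: 2
# example_key_c: 3
EOF

# Approach 1: uncomment one specific line with sed; everything else stays commented.
uncommented="$(sed 's/^# *\(example_key_b:.*\)$/\1/' "$conf")"

# Approach 2: ignore the template and write a new file with only the wanted settings.
fresh="$(mktemp)"
cat > "$fresh" <<'EOF'
example_key_b: 2
EOF

echo "--- sed result ---"
echo "$uncommented"
echo "--- new file ---"
cat "$fresh"
```

With the second approach the resulting file contains exactly what was written, independent of how the shipped template happens to be laid out.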