dcache not reliable enough for education #159

Open
BSchilperoort opened this issue Apr 4, 2024 · 5 comments

Comments

@BSchilperoort
Member

Occasionally Surf's dcache is not accessible for hours at a time. This is annoying for researchers (or during demos). However, if you use ewatercycle for teaching, this is terrible.

@sverhoeven suggested rehosting a subset of what's on dcache on a SRC virtual machine, which can then be mounted on the machines which the students will work on.

@BSchilperoort
Member Author

@RolfHut

@RolfHut
Contributor

RolfHut commented Apr 4, 2024

Thank you @BSchilperoort for raising the issue. For education we currently only use the dCache data for generating forcing, so it would also help if import ewatercycle (and mainly import ewatercycle.forcing) did not fail when paths are unavailable, but only raised an error when those paths are actually accessed. This would allow students who only need to run models to still do so, even if dCache is down.
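
A minimal sketch of that "fail on access, not on import" idea, in plain Python. This is not how ewatercycle is implemented; the class name `LazyDataPath`, the constant `FORCING_DATA` and the example path are hypothetical and only illustrate deferring the existence check.

```python
from pathlib import Path


class LazyDataPath:
    """Hold a data path without touching the filesystem until it is used."""

    def __init__(self, path):
        # Only store the path; no existence check happens at import time.
        self._path = Path(path)

    def resolve(self):
        """Return the path, raising only now if it cannot be reached."""
        if not self._path.exists():
            raise FileNotFoundError(
                f"Data path {self._path} is not available (is dCache mounted?)"
            )
        return self._path


# Hypothetical module-level setting: creating it never fails,
# even when the mount behind it is missing.
FORCING_DATA = LazyDataPath("/mnt/data/climate-data")
```

With this shape, code that only runs models never calls `resolve()` and so never sees the error; code that generates forcing calls `FORCING_DATA.resolve()` and gets a clear `FileNotFoundError` when the mount is down.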

@sverhoeven
Member

sverhoeven commented Apr 10, 2024

I tried https://servicedesk.surf.nl/wiki/display/WIKI/Create+a+shared+storage+for+Linux on https://grader35.ewatercycle-nle.src.surf-hosted.nl/ with /data/shared being a mounted remote storage.

I was able to run apptainer and plot a netcdf file.

Operational steps:

  1. Create file server workspace
  2. Populate file server
  3. Create teaching machines

Cons:

  • extra machine with its own storage item
  • file server must be in same collaborative organization as teaching machines
  • Samba server catalog item needs slight tweaking, could make new catalog item if needed
    • client for dcache like rclone with creds
    • share should be read only

Pros:

  • only need dcache working while populating file server
  • smaller than dcache, restricted by storage items offered by Surf Research Cloud
  • main branch and grader branch are very different
  • no longer need cache storage item on teaching machine

Todos:

  • populate file server with training material + matching ewatercycle.yaml
  • add https://gitlab.com/rsc-surf-nl/plugins/plugin-samba/-/blob/main/samba-client-linux.yml to our roles
    • modify /etc/fstab to mount read only and use root only readable creds
  • make /etc/ewatercycle.yaml symlink to /data/shared/ewatercycle.yaml + replace /mnt/data with /data/shared
  • delete dcache mount stuff
  • (optional) host training material on non-src resources so material can be downloaded during creation of file server.
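
The ewatercycle.yaml todo above could be scripted roughly as follows. This is a sketch under assumptions: it does a plain textual replacement (so it assumes /mnt/data only occurs as a path prefix in the config), and the helper names `rewrite_data_root` and `link_config` are made up for illustration.

```python
from pathlib import Path


def rewrite_data_root(config_text, old_root="/mnt/data", new_root="/data/shared"):
    """Return the config text with every old_root replaced by new_root.

    Plain string replacement: assumes old_root only appears as a path prefix.
    """
    return config_text.replace(old_root, new_root)


def link_config(shared_config, system_config):
    """Make system_config a symlink to shared_config, replacing any old file."""
    system_config = Path(system_config)
    if system_config.is_symlink() or system_config.exists():
        system_config.unlink()
    system_config.symlink_to(shared_config)
```

On the file server one would write the output of `rewrite_data_root(...)` to the shared ewatercycle.yaml; on each teaching machine, `link_config("/data/shared/ewatercycle.yaml", "/etc/ewatercycle.yaml")` (run as root) would then create the symlink.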
Notes

Create storage item called shared for ewatercycle-nlesc

In ewatercycle-nlesc collab org add secret
samba_password:

Create private network for ewatercycle-nlesc
name: file-storage-network

Create samba file server
attach:

  • storage: shared
  • network: file-storage-network
    name: fs1
    description: File server for eWatercycle teaching machines

On server
In smb.conf, set read only = yes for the share.
Restart Samba: systemctl restart smbd

Populate

cd /data/volume_2/samba-share
mkdir singularity-images
wget -O singularity-images/ewatercycle-pcrg-grpc4bmi_setters.sif 'https://webdav.grid.surfsara.nl/singularity-images/ewatercycle-pcrg-grpc4bmi_setters.sif?action=show&authz=<dcache macaroon>'
mkdir -p climate-data/obs6/Tier3/ERA5
wget -O climate-data/obs6/Tier3/ERA5/OBS6_ERA5_reanaly_1_day_pr_1995-1995.nc 'https://webdav.grid.surfsara.nl/climate-data/obs6/Tier3/ERA5/OBS6_ERA5_reanaly_1_day_pr_1995-1995.nc?action=show&authz=<dcache macaroon>'
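
The same populate step could be sketched in Python, which is easier to loop over a list of files. The URL shape (`?action=show&authz=<macaroon>`) is copied from the wget commands above; the function names `build_url` and `fetch` are hypothetical, and the macaroon still has to be supplied by whoever runs it.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import quote, urlencode

WEBDAV_ROOT = "https://webdav.grid.surfsara.nl"


def build_url(remote_path, macaroon):
    """Build a download URL in the same shape as the wget commands."""
    query = urlencode({"action": "show", "authz": macaroon})
    return f"{WEBDAV_ROOT}/{quote(remote_path.lstrip('/'))}?{query}"


def fetch(remote_path, macaroon, share_root):
    """Download remote_path into share_root, keeping the directory layout."""
    target = Path(share_root) / remote_path.lstrip("/")
    target.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(build_url(remote_path, macaroon)) as response:
        with open(target, "wb") as out:
            # Stream to disk so large images and NetCDF files are not held in memory.
            shutil.copyfileobj(response, out)
    return target
```

For example, `fetch("singularity-images/ewatercycle-pcrg-grpc4bmi_setters.sif", macaroon, "/data/volume_2/samba-share")` would mirror the first wget call.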

Create teaching machine

  • Add private network

Mount samba share
Install + configure similar to https://gitlab.com/rsc-surf-nl/plugins/plugin-samba/-/blob/main/samba-client-linux.yml

nmap -p 445 -T4 -v 10.10.10.0/24 | awk -F'[ /]' '/Discovered open port/{print $NF}'
10.10.10.44
# in /etc/fstab
//10.10.10.44/samba-share /data/shared cifs username=smbuser,password=<samba password>,ro,cache=loose
# replace password= with credentials file which is not readable by others
mount -a
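
A possible way to do the credentials-file replacement mentioned in the comment above, sketched in Python. It relies on the `credentials=` mount option and the `username=`/`password=` file format of mount.cifs; the helper names and the example path `/root/.smbcredentials` are assumptions, not something taken from the Samba plugin.

```python
import os
from pathlib import Path


def write_credentials(path, username, password):
    """Write a mount.cifs credentials file that only its owner can read."""
    path = Path(path)
    # Create the file with mode 0600 from the start, so the password is
    # never briefly readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"username={username}\npassword={password}\n")
    os.chmod(path, 0o600)  # also tighten a file that already existed
    return path


def fstab_line(server, share, mount_point, credentials_path):
    """Build a read-only CIFS fstab entry that points at the credentials file."""
    options = f"credentials={credentials_path},ro,cache=loose"
    return f"//{server}/{share} {mount_point} cifs {options}"
```

Run as root, `write_credentials("/root/.smbcredentials", "smbuser", samba_password)` followed by `fstab_line("10.10.10.44", "samba-share", "/data/shared", "/root/.smbcredentials")` gives the line to put in /etc/fstab instead of the one with an inline password.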

Test

from grpc4bmi.bmi_client_apptainer import BmiClientApptainer
!mkdir /tmp/work
client = BmiClientApptainer('/data/shared/singularity-images/ewatercycle-pcrg-grpc4bmi_setters.sif',work_dir='/tmp/work')
client.get_component_name()
'pcrglobwb'
del client
import xarray as xr
ds = xr.open_dataset('/data/shared/climate-data/obs6/Tier3/ERA5/OBS6_ERA5_reanaly_1_day_pr_1995-1995.nc')
ds
ds['pr'].isel(time=15).plot()

@RolfHut
Contributor

RolfHut commented Apr 10, 2024

Thank you @sverhoeven. I think I got most of that :-). Regarding "(optional) host training material on non-src resources so material can be downloaded during creation of file server": maybe we can host this in a repository like Zenodo or 4TU? Or, if it is purely for teaching, we could even make it available as part of Open Educational Resources?

@BSchilperoort
Member Author

We could host the data somewhere, but we would have to decide what exactly, and solve some potential issues:

  • apptainer images (for running containerized models). *
  • parameter sets for models you want to include (have to make sure we don't violate any copyright)
  • CMORized ERA5 data (which need a lot of storage... 4TU would be possible, if we can stay under 1TB**)

* An alternative could be a script to generate them from Docker images? Not sure what's better, @sverhoeven.

** At TU Delft you can upload 1 TB/year for free. Zenodo's maximum is 50 GB.
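
To decide what fits where, a rough size check of a candidate data directory might look like this. The limits are the ones quoted in this thread, taken as decimal bytes (an assumption; the hosts may count differently, and the 4TU figure is per year). The names `LIMITS_BYTES`, `directory_size` and `hosts_that_fit` are made up for the sketch.

```python
from pathlib import Path

# Limits quoted in the thread, in decimal bytes (assumption).
LIMITS_BYTES = {"4TU": 10**12, "Zenodo": 50 * 10**9}


def directory_size(root):
    """Sum the sizes of all regular files below root."""
    return sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())


def hosts_that_fit(size_bytes, limits=LIMITS_BYTES):
    """Return the names of hosts whose limit is at least size_bytes."""
    return [name for name, limit in limits.items() if size_bytes <= limit]
```

For instance, `hosts_that_fit(directory_size("/data/shared/climate-data"))` would list which of the two options could take the CMORized ERA5 subset as it currently sits on the share.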
