Container-in-a-container issues #434

Open
samharrison7 opened this issue Jun 21, 2024 · 12 comments

Comments

@samharrison7

Hi all,

Thought I would open up an issue to discuss the container-in-a-container issues that we in UKCEH (@CansuUluseker and @mjhollaway) are having.

Our goal is to use eWaterCycle within a DataLab project, but this is generally relevant to any system built on containers.

Suggestions for us to focus on after our latest meeting:

Did I forget anything important or get any of that wrong?

@BSchilperoort
Member

Hi Sam,

This seems pretty complete. I expect no issues with the local python models, so that should at least help you run a model with forcing generated using ESMValTool.

Investigate whether giving Apptainer setuid privileges is an option within DataLabs

Yes, I hope that will work (not certain though). Otherwise, getting things to work would become more complex, as explained in that Kubernetes issue.

@Daafip
Collaborator

Daafip commented Jun 27, 2024

Could also try this related example: https://github.com/Daafip/ewatercycle-hbv

The docs still need to be improved & updated to explain how the local model can be used for HBV. But running HBV locally essentially comes down to:

  • running pip install HBV
  • replacing from ewatercycle.models import HBV with from ewatercycle.models import HBVLocal when referencing the example in docs.
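
A minimal sketch of that import swap, assuming the `HBV` package is installed so that `HBVLocal` is exposed under `ewatercycle.models`. The helper name `load_hbv_class` is hypothetical; it returns `None` when eWaterCycle or the requested class is not available in the current environment.

```python
import importlib


def load_hbv_class(local: bool = True):
    """Return the HBV model class from ewatercycle.models, or None if unavailable.

    local=True picks HBVLocal (runs in-process, no container needed);
    local=False picks the containerized HBV class.
    """
    class_name = "HBVLocal" if local else "HBV"
    try:
        models = importlib.import_module("ewatercycle.models")
    except ImportError:
        return None  # ewatercycle is not installed in this environment
    return getattr(models, class_name, None)


model_class = load_hbv_class(local=True)
print("HBVLocal available:", model_class is not None)
```

Everything else in the docs example should stay the same; only the class being imported changes.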

@Daafip
Collaborator

Daafip commented Jul 1, 2024

The docs still need to be improved & updated

The current docs contain an updated example!

@samharrison7
Author

Hey all,

We managed to get the local version of the models running and have confirmed these issues are container-in-a-container issues.

Our DataLabs developers have suggested using Podman instead of Docker/Apptainer. Is that something you guys have ever experimented with, and do you think it might be a way forward? Would it need modification to eWaterCycle itself, e.g. to deal with config options specifically for Podman?

Using setuid to propagate root access could be another option but this is less desirable due to potential security considerations.

Cheers,
Sam

@BSchilperoort
Member

Hi Sam, @sverhoeven has at some point been interested in using Podman, however that might not have gone anywhere due to our infrastructure provider not currently supporting it.

We would probably have to write some code specific to Podman in grpc4bmi, just as we have for Docker and Apptainer. If we're lucky it's just writing very similar code for the Podman API instead of the Docker API.

You can see some of the code here. It starts up a container, and bind mounts the right directories to it, maintaining the original folder structure. This is so a user can pass the path to a configuration file using BMI.initialize(), without having to modify this string (or the config) behind the scenes.
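
To illustrate the idea (this is a sketch, not the actual grpc4bmi code): mounting every host directory at the identical path inside the container means a config path is valid on both sides. The helper name `same_path_volumes` and the example paths are hypothetical, and the read-only/read-write split is an assumption. The returned dictionary uses the `volumes` format accepted by the Docker SDK for Python.

```python
from pathlib import Path


def same_path_volumes(work_dir, input_dirs=()):
    """Build a Docker-SDK-style volumes mapping that mounts each host
    directory at the identical path inside the container."""
    volumes = {}
    for directory in input_dirs:
        host_path = str(Path(directory).resolve())
        # Input data only needs to be read by the model (assumption).
        volumes[host_path] = {"bind": host_path, "mode": "ro"}
    # Added last so the work directory stays writable even if it is
    # also listed in input_dirs.
    work_path = str(Path(work_dir).resolve())
    volumes[work_path] = {"bind": work_path, "mode": "rw"}
    return volumes


volumes = same_path_volumes("/data/run1", input_dirs=["/data/forcing"])
# Hypothetical config path: identical on the host and in the container,
# so it can be passed to BMI.initialize() unchanged.
config_file = "/data/run1/config.yaml"
print(volumes)
```

Because the container sees the same folder structure as the host, neither the path string nor the config contents need rewriting.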

@samharrison7
Author

Thanks Bart. Just having a quick look at the Podman Python docs and the API doesn't look too different to Docker's, though the devil is probably in the detail. I guess there would need to be some new code on the eWaterCycle side too (e.g. here)?

@BSchilperoort
Member

Thanks Bart. Just having a quick look at the Podman Python docs and the API doesn't look too different to Docker's, though the devil is probably in the detail.

Yeah, it seems straightforward but there will probably be some issues that are difficult to predict.

I guess there would need to be some new code on the eWaterCycle side too (e.g. here)?

Yes, that would be step two. The first step is to make grpc4bmi work with podman. Then you should be able to spawn a new container and connect to it, like https://grpc4bmi.readthedocs.io/en/latest/container/usage.html#using-the-container-clients

@BSchilperoort
Member

BSchilperoort commented Sep 26, 2024

I did find this guide where a rootless podman can run a rootless podman:

$ podman run -it --security-opt label=disable --user podman --device /dev/fuse quay.io/podman/stable podman run alpine echo hello

This could be a good starting point to try to run a containerized model from inside podman (without writing any code for ewatercycle/grpc4bmi).
E.g.:

podman run -it --security-opt label=disable --user podman --device /dev/fuse quay.io/podman/stable /bin/bash
will start podman and connect to an interactive bash shell (the -it flags keep the shell attached). Instead of quay.io/podman/stable you could use an image that also has a python environment with ewatercycle installed, and has some pre-generated forcing files + config.

Next you can start a grpc4bmi server in headless mode:
podman run -d ghcr.io/daafip/hbv-bmi-grpc4bmi:v1.5.0 (and also mount volumes, of course)

Then you should be able to open up python, connect to the running grpc4bmi server, and try to initialize the model 🤞
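
A rough sketch of that last step, assuming the server's port was published when starting the container and that it listens on 55555 (both are assumptions; check the image you use). `port_is_open` and `connect_to_model` are hypothetical helper names. The `BmiClient` and `grpc.insecure_channel` calls follow the grpc4bmi documentation for connecting to an already running server.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def connect_to_model(host: str = "localhost", port: int = 55555):
    """Connect to an already running grpc4bmi server.

    The default port is an assumption, not something the image guarantees.
    """
    # Imported lazily so the port check works without grpc4bmi installed.
    import grpc
    from grpc4bmi.bmi_grpc_client import BmiClient

    if not port_is_open(host, port):
        raise ConnectionError(f"No server listening on {host}:{port}")
    return BmiClient(grpc.insecure_channel(f"{host}:{port}"))
```

With the server up, `model = connect_to_model()` followed by `model.initialize(...)` with a config path that is visible inside the model container would be the thing to try. Checking the port first gives a clearer error than a gRPC timeout when the container did not start.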

@samharrison7
Author

Oh nice, that sounds positive! We'll give that a go and see how far we get. I guess there isn't already an image with eWaterCycle installed available anywhere is there?

@BSchilperoort
Member

BSchilperoort commented Sep 26, 2024

I guess there isn't already an image with eWaterCycle installed available anywhere is there?

Try the following 🤓
You still have to build it locally.

Dockerfile (started from the podman container):

FROM quay.io/podman/stable

RUN mkdir -p ~/miniconda3
RUN curl -o  ~/miniconda3/miniconda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
RUN bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
RUN rm ~/miniconda3/miniconda.sh
RUN source ~/miniconda3/bin/activate

RUN curl -o conda-lock.yml https://raw.githubusercontent.com/eWaterCycle/ewatercycle/main/conda-lock.yml
RUN source ~/miniconda3/bin/activate; conda install mamba conda-lock -n base -c conda-forge -y
RUN source ~/miniconda3/bin/activate; conda-lock install --no-dev -n ewatercycle
RUN source ~/miniconda3/bin/activate; conda activate ewatercycle; pip install ewatercycle

To build and run:

docker build -t podman-ewc .
docker run -it podman-ewc
source ~/miniconda3/bin/activate; conda activate ewatercycle
python
import ewatercycle

Could be more efficient (without the repeated source declarations), but it does work.

@BSchilperoort
Member

@CansuUluseker & @mjhollaway I have managed to get a rootless podman container to successfully run a grpc4bmi model.

The info and Dockerfiles are all here: https://github.com/eWaterCycle/nested-podman
The containers are hosted on docker hub so it should be easy to pull and run.

A todo for this repository is to support the podman Python SDK (https://podman-py.readthedocs.io). It seems to be basically a drop-in replacement for the Docker SDK, so it shouldn't be too much work.

@sverhoeven
Member

We are using https://pypi.org/project/docker/ to interact with docker. It talks to a docker daemon, so for podman we need either a podman socket or a switch to podman-py.
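
A sketch of the socket route, under the assumption that the Podman API service is running and exposes its socket at the conventional location (`$XDG_RUNTIME_DIR/podman/podman.sock` for rootless, `/run/podman/podman.sock` for rootful). The helper names are hypothetical; `docker.DockerClient(base_url=...)` is the existing Docker SDK entry point. Whether every call grpc4bmi makes behaves the same against Podman is untested here.

```python
import os


def podman_socket_url(rootless: bool = True) -> str:
    """Return the conventional URL of the Podman API socket."""
    if rootless:
        # Rootless Podman places its socket under the user's runtime dir.
        runtime_dir = os.environ.get("XDG_RUNTIME_DIR", f"/run/user/{os.getuid()}")
        return f"unix://{runtime_dir}/podman/podman.sock"
    return "unix:///run/podman/podman.sock"


def podman_backed_docker_client():
    """Create a Docker SDK client that talks to the Podman socket."""
    # Imported lazily so the URL helper works without the docker package.
    import docker

    return docker.DockerClient(base_url=podman_socket_url())
```

The socket usually has to be enabled first, for example with `systemctl --user start podman.socket`. The appeal of this route is that the existing Docker-based code path could be reused without adding a new dependency.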
