Simplify adding models #338

Closed
5 tasks done
Peter9192 opened this issue Feb 27, 2023 · 6 comments

@Peter9192
Collaborator

Peter9192 commented Feb 27, 2023

Currently, adding a model to eWaterCycle is cumbersome:

  • eWaterCycle-specific model code lives inside the eWaterCycle repo
  • Different aspects of models (setup, forcing data, etc.) live in different modules (ewatercycle.models, ewatercycle.forcing, ...)
  • Models, parameter sets, and forcing data all have versions that are supposed to match

Our current structure is as follows:

- ewatercycle 
  - models
    - modelA
    - modelB
    - ...
  - forcing
    - modelA
    - modelB
    - ...
  - ...

Alternatively, we could use something like

- ewatercycle
  - modelA
    - model
    - forcing
    - ...
  - modelB
    - model
    - forcing
    - ...
  - ...

Then, all model-specific quirks are contained within one folder/module/repo. This would make it easier to add new models. It might also make sense to move all specific model implementations out of the main eWaterCycle package and support them as plugins instead. Such a structure would facilitate that:

  • A new release of the plugin has to ensure that model code, forcing, and parameter sets are compatible
  • Using a specific version of a model is as simple as changing the version of the plugin
  • Ownership of specific models/plugins lies with the model/plugin developer
  • The main eWaterCycle package would reduce to simply defining the interface, plus perhaps one or two "defaults" (lumped and distributed)
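One way such plugins could be discovered is through Python package entry points. The following is a minimal sketch, not the eWaterCycle implementation: the group name `ewatercycle.models` and the function `discover_plugins` are assumptions made for illustration. A plugin package would declare its model class under that group in its own packaging metadata, and the core package would only need to look the group up.

```python
from importlib.metadata import entry_points


def discover_plugins(group="ewatercycle.models"):
    """Return {name: loaded object} for every entry point registered in `group`.

    The default group name is hypothetical; it is only a label that plugin
    packages and the core package would have to agree on.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points can be filtered with select().
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a dict keyed by group name.
        selected = eps.get(group, [])
    # ep.load() imports the plugin module and returns the registered object.
    return {ep.name: ep.load() for ep in selected}
```

With this approach, installing or upgrading a plugin package is enough to change which models (and which versions) are available; the core package never imports a model by name.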

This requires some work on the package architecture:

@Peter9192
Collaborator Author

Another feature that would be very helpful for development is running models without containers. We could distinguish between a LocalModel and a ContainerizedModel. Both would have to adhere to the eWaterCycle model interface, i.e. have a setup function that attaches a BMI object (even if this is not strictly necessary for the LocalModel). As such, the eWaterCycle LocalModel is an intermediate step between a 'normal' BMI model and a containerized eWaterCycle model.
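A rough sketch of how that distinction could look, assuming a shared abstract base class. Everything here is hypothetical: `Model`, `_make_bmi`, and `ToyBmi` are illustrative names, and `ToyBmi` is only a stand-in for a real BMI implementation.

```python
from abc import ABC, abstractmethod


class ToyBmi:
    """Stand-in for a real BMI object; it only tracks a time counter."""

    def __init__(self):
        self.time = 0.0
        self.config_file = None

    def initialize(self, config_file):
        self.config_file = config_file

    def update(self):
        self.time += 1.0


class Model(ABC):
    """Hypothetical common interface: setup() attaches a BMI object."""

    def __init__(self):
        self.bmi = None

    @abstractmethod
    def _make_bmi(self):
        """Return an object that implements BMI."""

    def setup(self, config_file="config.yaml"):
        self.bmi = self._make_bmi()
        self.bmi.initialize(config_file)
        return config_file


class LocalModel(Model):
    """Runs the BMI implementation directly in the current Python process."""

    def __init__(self, bmi_class):
        super().__init__()
        self.bmi_class = bmi_class

    def _make_bmi(self):
        return self.bmi_class()


class ContainerizedModel(Model):
    """Would start a container and return a client for the BMI server in it."""

    def __init__(self, image):
        super().__init__()
        self.image = image

    def _make_bmi(self):
        # Starting a container is out of scope for this sketch.
        raise NotImplementedError(f"would start a container for {self.image}")


model = LocalModel(ToyBmi)
model.setup("config.yaml")
model.bmi.update()
```

Because both classes share `setup`, code written against a LocalModel during development would not need to change when the model is later containerized.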

@Peter9192
Collaborator Author

Different constructors.

Currently, our models are initialized with a version tag. While this is good for checking compatibility between model, forcing, and parameter set, it would also be very useful to start a model directly from a container URI or image filename. One way to approach this is to have multiple constructors:

ContainerizedModel.from_version(version="2020.10", ...)

# One option is separate constructors for docker/apptainer
ContainerizedModel.from_docker(docker_uri="ewatercycle/wflow-grpc4bmi:2020.10", ...)
ContainerizedModel.from_apptainer(sif_file="wflow_grpc4bmi_2020-10.sif") 

# Or a single constructor that derives the engine-specific image from the Docker URI:
ContainerizedModel.from_image(image="ewatercycle/wflow-grpc4bmi:2020.10")
# if config.container_engine == "singularity": derive the image filename from the Docker URI
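The constructors above could be implemented as class methods. The sketch below is only an illustration under stated assumptions: the `KNOWN_VERSIONS` table, the `engine` argument, and the naming rule in `sif_filename` are all hypothetical. The naming rule was chosen solely so that it reproduces the example filename `wflow_grpc4bmi_2020-10.sif` used above.

```python
from dataclasses import dataclass

# Hypothetical lookup table from model version to Docker image.
KNOWN_VERSIONS = {"2020.10": "ewatercycle/wflow-grpc4bmi:2020.10"}


def sif_filename(image):
    """Derive an image filename from a Docker URI (assumed naming rule).

    Assumes the URI has the form "org/name:tag".
    """
    name, _, tag = image.rpartition(":")
    name = name.split("/")[-1].replace("-", "_")
    return f"{name}_{tag.replace('.', '-')}.sif"


@dataclass
class ContainerizedModel:
    image: str
    engine: str = "docker"

    @classmethod
    def from_version(cls, version, engine="docker"):
        """Look up the image that belongs to a released model version."""
        return cls.from_image(KNOWN_VERSIONS[version], engine=engine)

    @classmethod
    def from_image(cls, image, engine="docker"):
        """Start from a Docker URI; switch to a .sif file for apptainer."""
        if engine == "apptainer":
            return cls(image=sif_filename(image), engine=engine)
        return cls(image=image, engine=engine)


by_version = ContainerizedModel.from_version("2020.10")
by_image = ContainerizedModel.from_image(
    "ewatercycle/wflow-grpc4bmi:2020.10", engine="apptainer"
)
```

Keeping `from_version` as a thin wrapper around `from_image` means the version-based compatibility check and the direct-image path share one code path for starting the container.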

@Peter9192
Collaborator Author

Peter9192 commented Mar 20, 2023

Interesting to check out for forcing: https://nasaaccess.readthedocs.io/en/latest/index.html

@Peter9192 Peter9192 mentioned this issue Apr 5, 2023
2 tasks
@sverhoeven
Member

sverhoeven commented Apr 6, 2023

The public API consists of

  • Generate forcing for a model
  • Load forcing for a model
  • Load parameter set for a model
  • Run a model with parameter set and forcing
  • List available models
  • List available parameter sets
  • Download example parameter set for a model
  • Non-model-specific public API:
    • Download observation data from GRDC or USGS
    • Configuration for container engine, root dir for parameter sets

By moving to a plugin architecture, the public API can be refactored.
Some choices:

  1. Each model has its own public API
    • ewatercycle public API does not know about models
    • ewatercycle public API is used to construct model
  2. A model-specific thing can be made available via the ewatercycle public API
  3. Each model has its own public API and is re-exported in the ewatercycle public API
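To make the trade-off more concrete, here is a minimal sketch of the variant in which models are made available through the core API. It uses a plain in-process registry; `register_model`, `available_models`, `create_model`, and `ToyModel` are hypothetical names, not existing eWaterCycle functions.

```python
# Registry kept by the (hypothetical) core package: model name -> class.
_MODELS = {}


def register_model(name):
    """Class decorator a plugin could use to expose its model via the core API."""

    def decorator(cls):
        _MODELS[name] = cls
        return cls

    return decorator


def available_models():
    """Core API: list the names of all registered models."""
    return sorted(_MODELS)


def create_model(name, **kwargs):
    """Core API: construct a registered model by name."""
    return _MODELS[name](**kwargs)


# This part would live in a plugin package, not in the core package.
@register_model("toy")
class ToyModel:
    def __init__(self, parameter_set=None, forcing=None):
        self.parameter_set = parameter_set
        self.forcing = forcing


model = create_model("toy", forcing="my_forcing")
```

In the first variant, users would instead import `ToyModel` directly from the plugin package, and the core package would need neither the registry nor `create_model`.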

@Peter9192
Collaborator Author

@BSchilperoort
Member

All tasks have been completed. Models are much simpler to add now.

3 participants