Developers Guide

G. Dylan Dickerson edited this page May 2, 2024 · 9 revisions

EarthWorks Developers' Guide

Repositories and externals

The "main model repository" is EarthWorksOrg/EarthWorks. This repo is used to collect all of the related top-level configurations, certain meta documents (e.g. LICENSE) for the project, and to provide a list of externals with a mechanism to fetch them.

Externals

An "external" refers to a repo that isn't integrated with the one being considered (i.e. the code is in another repo). For EarthWorks, examples of externals include ccs_configs, CAM, CAM's externals, and MOSART. Like CESM, we use the manage_externals/checkout_externals tool to fetch these dependencies after cloning a fresh copy of the EarthWorks model. In later parts of this guide, any repo that contains an external is an "over-repo" of that external.

The externals EarthWorks uses are defined in the Externals.cfg file. A specific version of the external is fetched according to the URL and hash, branch, or tag provided.
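As a rough illustration of how an entry pins a version, the sketch below parses a made-up Externals.cfg fragment with Python's configparser and reports which kind of reference (tag, branch, or hash) each external uses. The section name, URL, and local path in the sample are placeholders rather than the real EarthWorks entries, and the key names are assumed to follow the usual manage_externals layout; check them against the actual file.

```python
import configparser

# Hypothetical fragment: the section name, URL, and path are placeholders,
# not the real EarthWorks entries.
SAMPLE_CFG = """
[cam]
tag = cam-ew2.1.002
protocol = git
repo_url = https://example.invalid/EarthWorksOrg/CAM
local_path = components/cam
required = True

[externals_description]
schema_version = 1.0.0
"""

REF_KEYS = ("tag", "branch", "hash")


def pinned_refs(cfg_text):
    """Map each external to the (kind, value) reference it is pinned to."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    refs = {}
    for name in parser.sections():
        if name == "externals_description":
            continue  # metadata section, not an external
        found = [(key, parser[name][key]) for key in REF_KEYS if key in parser[name]]
        if len(found) != 1:
            raise ValueError(f"{name}: expected exactly one of {REF_KEYS}")
        refs[name] = found[0]
    return refs


print(pinned_refs(SAMPLE_CFG))
```

Requiring exactly one of the three keys is a choice made for this sketch, so that an entry naming both a tag and a branch is flagged instead of silently resolved.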

EarthWorks externals

Some externals have been forked into EarthWorksOrg (EWOrg) versions - we call these "EarthWorks externals" or "EW externals." These forks allow the EarthWorks team to pursue our own development that may not be desired in the upstreams.

There are also three EW externals that are unique: EWOrg/mpas-framework, EWOrg/mpas-ocean, and EWOrg/mpas-seaice. These EW externals were based on work in the E3SM v2.0 code and have no traditional upstream repos.

Upstream synching

The upstream repos have their own development teams and priorities that may or may not be helpful to EarthWorks. To leverage that upstream work, we try to keep our forks as close to their upstreams as is "reasonably possible." When we synch is loosely defined on purpose, because we may not always want the upstream changes.

We typically synch our EW externals with their upstreams whenever a new ESCOMP/CESM beta tag is released.

Branches

All EWOrg repos have main and development branches. The branch names may be slightly different in each repo (depending on whether the repo is an external or the main model), but they serve two distinct purposes. The development branch exists for collaboration between the developers. It is where new features are merged and testing is done to ensure the desired features are maintained. The main branch exists for users and is only updated by releases to ensure only (somewhat) stable code is fetched by default.

In the EarthWorks repo, the names are obvious: main is the main branch and develop is the development branch.

In EW externals, though, the main branch is always ew-main and the development branch is always ew-develop. This is a practice in EarthWorks to help ensure that we don't easily confuse our work and development with that going on in the upstream repos.

Tags

While the checkout_externals script can work with a branch, a tag, or a hash, a tag is preferred for use in the various EarthWorks repos' main and development branches. A branch can change without warning, particularly if it is from someone's personal fork. A hash is somewhat esoteric (not human readable) but is very stable. An annotated tag gives the best of both (a human-readable, semi-stable reference point) and contains some of its own documentation, too.

Tags in EWOrg repos are in an adapted Semantic Versioning format. They are generally of the form {pre}-ew{X}.{Y}.{Z}, where:

  • {pre} stands for some repo- or tag-specific prefix.
  • -ew is used to ensure tags in externals aren't confused with upstream tags.
  • {X} is the major version number.
  • {Y} is the minor version number.
  • {Z} is the patch version number. So far, the patch number has been zero-padded to three digits.

Example tag: cam-ew2.1.002.
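To make the format concrete, here is a minimal sketch that splits a tag of the form {pre}-ew{X}.{Y}.{Z} into its parts and rebuilds it. The regular expression, the EwTag type, and the function names are illustrative assumptions, not an official EarthWorks tool. Parsing accepts any number of patch digits, while formatting applies the three-digit zero padding used so far.

```python
import re
from typing import NamedTuple

# Assumed pattern for {pre}-ew{X}.{Y}.{Z}; the prefix may itself contain hyphens.
TAG_RE = re.compile(
    r"^(?P<pre>[A-Za-z0-9_-]+?)-ew(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)$"
)


class EwTag(NamedTuple):
    pre: str
    major: int
    minor: int
    patch: int


def parse_ew_tag(tag):
    """Split a development tag into prefix and version numbers."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"not an EW-style tag: {tag!r}")
    return EwTag(
        match["pre"], int(match["major"]), int(match["minor"]), int(match["patch"])
    )


def format_ew_tag(tag):
    """Rebuild the tag name, zero-padding the patch number to three digits."""
    return f"{tag.pre}-ew{tag.major}.{tag.minor}.{tag.patch:03d}"


parsed = parse_ew_tag("cam-ew2.1.002")
print(parsed)                 # EwTag(pre='cam', major=2, minor=1, patch=2)
print(format_ew_tag(parsed))  # cam-ew2.1.002
```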

EarthWorks also makes a distinction between two kinds of tags. Development tags almost always occur on the development branches. Release tags only occur on the main branches, and typically fit the format release-{pre}-ew{X}.{Y}. The patch number is dropped since most releases should only be for major and minor versions. In exceptional circumstances a "bug fix" or "patch release" may occur and would fit the format release-{pre}-ew{X}.{Y}.{Z}.

Development tags in EWOrg/EW

Since the main model repo doesn't have an upstream to confuse its tags with, EWOrg/EW development tags look like ewm-{X}.{Y}.{Z}. Here ewm stands for "EarthWorks Model." It is typically sufficient to provide only a short (even one-line) description of the changes being tagged in EWOrg/EW.

Example:

tag ewm-2.1.002
Tagger: G. Dylan Dickerson <[email protected]>
Date:   Tue Apr 16 11:01:09 2024 -0600

Fix GPU builds for compsets with multiple MPAS cores (PR #39)

commit 4b974c57bdb6544acddf0647f1598d6fefe0feb8 (tag: ewm-2.1.002)
Merge: 258a4d4 24948ad
Author: G. Dylan Dickerson <[email protected]>
Date:   Tue Apr 2 14:53:20 2024 -0600

    Merge branch 'framework-ext-refs' into develop (PR #39)

    Add changes in mpas-framework, mpas-seaice, and mpas-ocean to ensure
    unique module names across the different copies of the MPAS
    infrastructure (framework, operators, etc) that are compiled for each
    MPAS core.

Development tags in EW Externals

Development tags in EW externals have prefixes according to their repo name and the tag's description follows a specific format.

Besides the summary and description, the message of any tag in an EW external must end with a line that describes the last point in common with the upstream repo. This last line has the format Last changes from upstream '{UPSTREAM_OWNER/UPSTREAM_NAME}' tag:'{UPSTREAM_TAG}'. It can be carried over from the previous tag, unless an upstream update is being done.

Example:

tag cam-ew2.1.002
Tagger: G. Dylan Dickerson <[email protected]>
Date:   Wed Mar 13 16:01:59 2024 -0600

Fix intel-oneapi builds

Last changes from upstream 'ESCOMP/CAM' tag:'cam6_3_148'

commit d5e3529e2ee4e61375d00aa5485a14fe8934ab8e (HEAD, tag: cam-ew2.1.002, origin/ew-develop)
Merge: f09ed7cd 3e9870d9
Author: G. Dylan Dickerson <[email protected]>
Date:   Wed Mar 13 16:01:46 2024 -0600

    Merge branch 'gdicker1/fix/perllogic_intel-oneapi' into ew-develop (PR #15)

    Fix ability to build CAM with intel-oneapi compilers.
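The required last line of an EW external tag message lends itself to a mechanical check. The sketch below pulls the upstream repo and tag out of a message like the one in the example above. The regular expression and the function name are assumptions made for illustration, not part of any existing EarthWorks script.

```python
import re

# Assumed pattern for the required trailing line of an EW external tag message.
UPSTREAM_LINE_RE = re.compile(
    r"^Last changes from upstream '(?P<repo>[^'/]+/[^'/]+)' tag:'(?P<tag>[^']+)'$"
)


def upstream_ref(message):
    """Return (upstream_repo, upstream_tag) from the last non-blank line."""
    lines = [line.strip() for line in message.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty tag message")
    match = UPSTREAM_LINE_RE.match(lines[-1])
    if match is None:
        raise ValueError("tag message does not end with an upstream line")
    return match["repo"], match["tag"]


MESSAGE = """Fix intel-oneapi builds

Last changes from upstream 'ESCOMP/CAM' tag:'cam6_3_148'
"""

print(upstream_ref(MESSAGE))  # ('ESCOMP/CAM', 'cam6_3_148')
```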

Release tags

These are special cases of the above tagging schemes. Release tags are intended for general users and non-developers and have this format to make them recognizable (e.g. in error logs).

For the EarthWorks model repo, release tags typically look like: release-ew{X}.{Y}.

An example of a release tag from an EW external looks like: release-cam-ew2.1. Please remember that all tags in an EW external must describe the last tag merged from the upstream repo.
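The release tag formats described here can be summarized in a small helper. This is only a sketch: the function name and its arguments are hypothetical, and zero-padding the patch number of a "patch release" to three digits is an assumption carried over from the development tags.

```python
def release_tag(major, minor, pre=None, patch=None):
    """Build a release tag name.

    pre is the repo-specific prefix of an EW external (e.g. "cam");
    leave it as None for the main model repo.
    """
    prefix = f"{pre}-" if pre else ""
    name = f"release-{prefix}ew{major}.{minor}"
    if patch is not None:
        # Exceptional "patch release"; padding width is an assumption.
        name += f".{patch:03d}"
    return name


print(release_tag(2, 1, pre="cam"))  # release-cam-ew2.1
print(release_tag(2, 1))             # release-ew2.1
```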

Testing requirements

As you develop, you should have been testing your code changes along the way. This is the necessary first step for any change to be submitted.

Minimal testing

Then, most PRs will require the "Minimal Testing" which includes successful builds and runs of:

  • GPU run of FullyCoupled model on 120km grid for 1 day (NVIDIA compiler only).
  • (If GPU not possible) CPU run of FullyCoupled model on 120km (All supported compilers?).
  • Any other tests the developer mentions.

Verification testing

A release or some other changes may require "Verification Testing" (at discretion of Reviewer(s) and Leadership).

More details on this coming soon...

Workflows

Submit an Issue

If you notice a problem in any EWOrg repo, please open an Issue in the main model repo only: create a new Issue in EWOrg/EW. Issues in EW externals may not be seen because only the main model repo is checked frequently.

We hope that a good Issue contains:

  • A summary (or title) line describing the problem and context in brief
  • A description that tries to answer the questions:
    • What did you observe? What did you expect instead?
    • What version(s) of EarthWorks (and/or externals) were you using? Were there any local modifications?
    • What is the exact text of any error messages that seem important?
    • Does this cause an issue on any supported system or within any supported compset?

This should trigger a discussion amongst the EarthWorks team of the problem which results in a decision of whether to address the problem (and hopefully how to as well). A member of the EarthWorks team will then take on the Issue and will be responsible for working on changes and submitting them by the PR process (see Changes to submit).

Changes to submit

If you have some changes to submit, the zeroth step is figuring out which repo (main model or which external) your changes go into. This can seem simple but if you are unsure, start talking with other EarthWorks Developers to determine which repo your changes belong in. The rest of this section assumes your changes will go into an EWOrg fork of some repo, and refers to this repo as {R}.

In the rare case that your changes are specific only to the main model repo, you can just open a PR in EWOrg/EW. Examples of such changes include editing the meta documents like README.md or changing an Externals.cfg entry to use a new version (especially for non-EW externals).

Consider this diagram (and/or the following explanation):

Diagram of PR process, see the following instructions

Otherwise there are two other questions that help guide the process:

  • Is this change EarthWorks-specific?
    • I.e. is this a change that the upstream repo would likely reject but is necessary for EarthWorks? For example, EarthWorks requires high-resolution MPAS grids but those aren't needed in CAM yet; adding such grids would be an EarthWorks-specific change.
  • If no, what is the urgency of this change?

If the answer to the first question is "yes," then the process involves:

  1. basing your branch off of the development branch (ew-develop in EW externals) in the EWOrg/{R} repo
  2. creating a PR in EWOrg/{R}
  3. (recommended) creating branches and PRs in the over-repos that use these changes
    • NOTE: this step can help ensure that the Maintainer and others can easily fetch a sandbox with the changes you wish to have tested. Since there's no tag for these changes yet, you can use your development branch.
  4. working through the PR process with other EarthWorks Contributors
  5. once approved, working with the Maintainer to ensure your changes "run up the chain" of repos.

However, if the answer to the first question is "no," then the process includes steps to work with the upstream repo. This involves:

  1. basing your branch off of the last "upstream ref" on the develop branch in EWOrg/{R}
    • NOTE: this information should be in the most recent EarthWorks tag for your branch and will involve adding another remote to your local repo
  2. creating a PR in EWOrg/{R}
  3. (recommended) creating branches and PRs in the EWOrg over-repos that use these changes
  4. creating a PR (with the same branch as step 2) in the upstream repo of {R} and following their PR process
  5. making a decision based on the answer to the second question above
  6. if there's no urgency, following the upstream's PR process to completion (else skip to next step)
  7. synching changes from the upstream PR process to the EWOrg/{R} PR
  8. finalizing the PR in EWOrg and getting approval(s) for the PR
  9. once approved, working with the Maintainer to ensure your changes "run up the chain" of repos.
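The two guiding questions can be condensed into a small decision helper. The sketch below is only a summary aid for the steps above; the PrPlan type, its field names, and the function name are hypothetical and do not correspond to any EarthWorks tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrPlan:
    base: str                # where to branch from in EWOrg/{R}
    open_upstream_pr: bool   # also open a PR in the upstream repo of {R}?
    wait_for_upstream: bool  # finish the upstream PR process before finalizing?


def plan_pr(earthworks_specific, urgent=False):
    """Summarize the workflow implied by the two guiding questions."""
    if earthworks_specific:
        # Upstream would likely reject the change, so it stays within EWOrg.
        return PrPlan("development branch", False, False)
    # Otherwise start from the last upstream ref so the same branch can be
    # offered to the upstream repo; urgency decides whether to wait for it.
    return PrPlan("last upstream ref", True, not urgent)


print(plan_pr(earthworks_specific=True))
print(plan_pr(earthworks_specific=False, urgent=True))
```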

Run changes up the chain

Whenever an external has a code change (and because we use tags in Externals.cfg), we must also update any over-repos that use the external (and recursively for the over-...-over-repos). The process of ensuring that changes make their way from an external all the way up to the EarthWorks model can be referred to as "running the change up the chain of repos."

This diagram below shows the process that this section explains:

Diagram of "run up chain" process, see the following instructions

Once a PR in an EarthWorks external has been accepted, there should be other related PRs in the over-repo(s) that use the external. Those PRs contain some entry in Externals.cfg that has so far pointed to a branch in the developer's fork of the external. It's the maintainer's responsibility to:

  1. Ensure changes are merged to the correct branch in the external (and pushed to GitHub). This is typically the development branch of the external.
  2. Ensure a tag is made (and pushed) within the external with an appropriate tag name and body.
  3. Find an over-repo PR that uses the changes to the external.
    1. Edit the over-repo Externals.cfg to use the new external tag (from the EWOrg fork).
    2. Approve the over-repo PR and ensure these steps repeat until the same process is done for the main model repo.
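The "chain" in the steps above can be pictured as a walk from an external to the main model repo. The sketch below uses the over-repo examples given in this guide. The mapping is deliberately simplified: it assumes each external has a single over-repo, and both the dictionary and the function name are illustrative.

```python
# Simplified mapping of external -> over-repo, taken from examples in this
# guide. Assumes one over-repo per external, which may not hold in general.
OVER_REPO = {
    "CLUBB": "CAM",
    "MPAS-A": "CAM",
    "CAM": "EarthWorks",
    "ccs_configs": "EarthWorks",
    "cime": "EarthWorks",
}


def chain_to_model(external, over_repo=OVER_REPO):
    """List the over-repos that need a new tag, in the order to update them."""
    chain = []
    seen = {external}
    current = external
    while current in over_repo:
        current = over_repo[current]
        if current in seen:
            raise ValueError(f"cycle in over-repo mapping at {current!r}")
        seen.add(current)
        chain.append(current)
    return chain


print(chain_to_model("CLUBB"))  # ['CAM', 'EarthWorks']
```

Each repo in the returned list would get an updated Externals.cfg entry and a new tag before the next one is touched.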

Release

As mentioned in Branches, releases are how we update our main branches in our EWOrg repos. Currently this process involves several people interacting to ensure the required steps are done.

This diagram attempts to show the process:

Diagram of Release process, see the following instructions

The release process begins when EarthWorks Leadership identify that a release is needed. This could be for various reasons, but is at least when "enough" development has happened and the EarthWorks team is eager to make the new features accessible to users.

Leadership will then call or create the Release Team who will be responsible for handling the release. This team should include multiple people, but will at least include the GitHub Maintainer.

After forming the release team, it is up to the Maintainer to ready the EWOrg repos for release. (This is where release candidate branches would be made and PR(s) to track the release.)

Then as the release begins, the Release Team will perform a few tasks in parallel:

  • Perform "Release Testing" and iterate on problems found to ensure quality
  • Update documentation (wiki, presentations, etc)
  • With Leadership, draft and revise Release Notes to communicate about the release

Then after sufficient testing and documenting, Leadership and Release Team will approve the release. At this point the Maintainer will go through the EWOrg repos merging the release onto the main branch, tagging the merge with a release tag, pushing the main branch and tag to GitHub, and ensuring that over-repos use the release tags (run release tags "up the chain").

At this point Leadership and the Release Team can post the Release Notes and make some sort of announcement.

The Maintainer will also need to "start a new dev cycle." Since the release should have incremented the Major or Minor version number (unless this is a "patch release"), new tags are needed on the develop branch. At this point, we also haven't ensured that any changes made by the Release Team have made their way onto the develop branch. After the release tags are made, the Maintainer should (in each repo) merge the main branch into the develop branch, create the new {pre}{X}.{Y}.000 tag, run these new tags "up the chain" of repos, and push the develop branch and tag to GitHub.

Other concepts

Glossary-ish

EW - Shorthand for EarthWorks.

EWOrg - Shorthand for EarthWorksOrg, the GitHub organization used to manage the code of the EarthWorks project.

main model repository - The EarthWorks source code at the top-level. Specifically this refers to the GitHub repo EarthWorksOrg/EarthWorks. Something is typically considered "merged into the main model repo" if it is in at least the develop branch of this repo. Sometimes also called "EWOrg/EW."

over-repo - From the perspective of an external, an over-repo is the source code that uses the external. E.g. the EarthWorks repo is the over-repo of ccs_configs and cime (and others...); the over-repo of CLUBB and MPAS-A is CAM.

upstream ref - Last reference from the upstream of an EW external. This is almost always the last tag from upstream that was merged into the EW external.

GitHub labels

Coming soon...

Roles and responsibilities

Contributor

Adding value to EarthWorks, but not currently acting in another role. Every role is at least a Contributor and must fulfill these responsibilities as well. Responsible for upholding Code of Conduct.

Developer

Has changes to submit to EarthWorks either directly to the model or to an external that eventually makes its way to the model. They are responsible for their changes throughout the PR process. It is their duty to follow workflows and recommendations from other collaborators until the PR is merged into the main model repo.

Reviewer

Has been engaged in the PR process, typically by the developer. They are concerned with the quality of the changes as it concerns the project. A reviewer is responsible for timely responses, change evaluation (to accept, reject, or modify), and providing constructive feedback.

GitHub Maintainer

May also be a Reviewer of a PR. They are concerned with the Git and GitHub "presence" of EarthWorks repos. It is their responsibility to maintain a helpful Git history of all repos (including branch structure, tags, descriptions, etc.), maintain a productive and sensible GitHub environment (branches allowed in main repo, Issues triage), collaborate on and create "support documents" (Code_of_Conduct, Guides, workflows), and provide some user support (if it occurs). Within the context of a PR, the maintainer is responsible for adding the approved changes to the develop branch(es) of the involved repo(s).

Leadership

Guides the overall EarthWorks project. They are responsible for forward looking objectives, supporting development, fostering a community (eventually), and communication about the project. They are also responsible for mediating any conflict and having "final say" when a decision is needed.