Feat/automated staging publish and configure on PR #165
base: main
Changes from all commits
e479004
a6b78a9
02d0d4e
307780d
e0623de
8167761
dedc8b6
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,282 @@ | ||
# This GitHub Actions workflow automates the process of | ||
# publishing dataset collections to a staging environment | ||
# and creating a pull request (PR) in the veda-config repository | ||
# with the dataset configuration. | ||
# It is triggered by a pull request to the main branch | ||
# that modifies any files within the ingestion-data/dataset-config/ directory. | ||
# The workflow includes steps to | ||
# - publish the datasets, | ||
# - create a PR in the veda-config repository, | ||
# - keep the workflow status up to date in the PR comment. | ||
|
||
name: Publish collection to staging and create dataset config PR | ||
|
||
on: | ||
pull_request: | ||
branches: | ||
- main | ||
paths: | ||
# Run the workflow only if files inside this path are updated | ||
- ingestion-data/dataset-config/* | ||
|
||
jobs: | ||
dataset-publication-and-configuration: | ||
permissions: | ||
pull-requests: write | ||
contents: read | ||
runs-on: ubuntu-latest | ||
|
||
steps: | ||
- uses: actions/checkout@v4 | ||
|
||
# Initializes the PR comment | ||
# Edits existing or creates new comment | ||
# Why? - Cleanliness! | ||
- name: Initialize PR comment with workflow start | ||
id: init-comment | ||
env: | ||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
run: | | ||
WORKFLOW_URL="${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}" | ||
# Double quotes so that $WORKFLOW_URL is expanded in the comment body | ||
body="### Workflow Status | ||
**Starting workflow...** [View action run]($WORKFLOW_URL) | ||
" | ||
|
||
# Get the PR number | ||
PR_NUMBER=${{ github.event.pull_request.number }} | ||
|
||
# Fetch existing comments | ||
COMMENTS=$(gh api repos/${{ github.repository }}/issues/${PR_NUMBER}/comments --jq '.[] | select(.body | contains("### Workflow Status")) | {id: .id, body: .body}') | ||
|
||
# Check if a comment already exists | ||
COMMENT_ID=$(echo "$COMMENTS" | jq -r '.id' | head -n 1) | ||
|
||
if [ -z "$COMMENT_ID" ]; then | ||
# No existing comment, create a new one | ||
COMMENT_ID=$(gh api repos/${{ github.repository }}/issues/${PR_NUMBER}/comments -f body="$body" --jq '.id') | ||
else | ||
# Comment exists, overwrite the existing comment | ||
gh api repos/${{ github.repository }}/issues/comments/$COMMENT_ID -X PATCH -f body="$body" | ||
fi | ||
|
||
echo "COMMENT_ID=$COMMENT_ID" >> $GITHUB_OUTPUT | ||
|
||
# Find only the updated files (files that differ from base) | ||
# Only .json files | ||
# The files are outputted to GITHUB_OUTPUT, which can be used in subsequent steps | ||
- name: Get updated files | ||
id: changed-files | ||
uses: tj-actions/changed-files@v44 | ||
with: | ||
files: | | ||
**.json | ||
|
||
# Uses service client creds to get token | ||
# No username/password needed | ||
- name: Get auth token | ||
id: get-token | ||
run: | | ||
response=$(curl -X POST \ | ||
${{ vars.STAGING_COGNITO_DOMAIN }}/oauth2/token \ | ||
-H "Content-Type: application/x-www-form-urlencoded" \ | ||
-d "grant_type=client_credentials" \ | ||
-d "client_id=${{ vars.STAGING_CLIENT_ID }}" \ | ||
-d "client_secret=${{ secrets.STAGING_CLIENT_SECRET }}" | ||
) | ||
|
||
access_token=$(echo "$response" | jq -r '.access_token') | ||
echo "ACCESS_TOKEN=$access_token" >> $GITHUB_OUTPUT | ||
|
||
# Makes request to /dataset/publish endpoint | ||
# Outputs only files that were successfully published | ||
# Used by other steps | ||
# If none of the requests are successful, workflow fails | ||
# Updates the PR comment with status of collection publication | ||
- name: Publish all updated collections | ||
id: publish-collections | ||
env: | ||
ALL_CHANGED_FILES: ${{ steps.changed-files.outputs.all_changed_files }} | ||
WORKFLOWS_URL: ${{ vars.STAGING_WORKFLOWS_URL }} | ||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
AUTH_TOKEN: ${{ steps.get-token.outputs.ACCESS_TOKEN }} | ||
COMMENT_ID: ${{ steps.init-comment.outputs.COMMENT_ID }} | ||
run: | | ||
if [ -z "$WORKFLOWS_URL" ]; then | ||
echo "WORKFLOWS_URL is not set" | ||
exit 1 | ||
fi | ||
|
||
if [ -z "$AUTH_TOKEN" ]; then | ||
echo "AUTH_TOKEN is not set" | ||
exit 1 | ||
fi | ||
|
||
publish_url="${WORKFLOWS_URL%/}/dataset/publish" | ||
bearer_token=$AUTH_TOKEN | ||
|
||
# Track successful publications | ||
all_failed=true | ||
success_collections=() | ||
status_message='### Collection Publication Status | ||
' | ||
|
||
# all_changed_files is a space-separated string, not an array, so rely on word splitting | ||
for file in ${ALL_CHANGED_FILES}; do | ||
echo "$file" | ||
if [ -f "$file" ]; then | ||
dataset_config=$(jq '.' "$file") | ||
collection_id=$(jq -r '.collection' "$file") | ||
|
||
response=$(curl -s -w "%{http_code}" -o response.txt -X POST "$publish_url" \ | ||
-H "Content-Type: application/json" \ | ||
-H "Authorization: Bearer $AUTH_TOKEN" \ | ||
-d "$dataset_config" | ||
) | ||
|
||
status_code=$(tail -n1 <<< "$response") | ||
|
||
# Update status message based on response code | ||
if [ "$status_code" -eq 200 ] || [ "$status_code" -eq 201 ]; then | ||
echo "$collection_id successfully published ✅" | ||
Review comment: ❤️ |
||
status_message+="- **$collection_id**: Successfully published ✅ | ||
" | ||
success_collections+=("$file") | ||
all_failed=false | ||
else | ||
echo "$collection_id failed to publish ❌" | ||
status_message+="- **$collection_id**: Failed to publish ❌ | ||
" | ||
fi | ||
else | ||
echo "File $file does not exist" | ||
exit 1 | ||
fi | ||
done | ||
|
||
# Exit workflow if all the requests fail | ||
if [ "$all_failed" = true ]; then | ||
echo "All collections failed to publish." | ||
exit 1 | ||
fi | ||
|
||
# Output only successful collections to be used in subsequent steps | ||
echo "success_collections=$(IFS=','; echo "${success_collections[*]}")" >> $GITHUB_OUTPUT | ||
Review comment: Worth also publishing the failed ones? Saves someone having to find those manually? |
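One way to act on this suggestion, sketched under assumptions: the publish loop could keep a second array for failures and emit it as another step output. The names `failed_collections`, `join_by_comma`, and `fake_status_for`, along with the sample file names and status codes, are hypothetical and not part of this PR; the publish request is replaced by a lookup so the snippet runs on its own.

```shell
#!/usr/bin/env bash
# Sketch only: track failed publications next to successful ones.
# `join_by_comma`, `failed_collections`, and the sample data are hypothetical.

join_by_comma() {
  # "$*" joins the arguments with the first character of IFS.
  local IFS=','
  echo "$*"
}

# Stand-in for the HTTP status returned by the publish request.
fake_status_for() {
  case "$1" in
    a.json|c.json) echo 200 ;;
    *) echo 500 ;;
  esac
}

success_collections=()
failed_collections=()

for file in a.json b.json c.json; do
  status_code=$(fake_status_for "$file")
  if [ "$status_code" -eq 200 ] || [ "$status_code" -eq 201 ]; then
    success_collections+=("$file")
  else
    failed_collections+=("$file")
  fi
done

# In the workflow these lines would be appended to $GITHUB_OUTPUT instead.
success_output="success_collections=$(join_by_comma "${success_collections[@]}")"
failed_output="failed_collections=$(join_by_comma "${failed_collections[@]}")"
echo "$success_output"
echo "$failed_output"
```

A later step could then read the failed list the same way it reads `success_collections` and include it in the PR comment.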
||
|
||
# Update PR comment | ||
CURRENT_BODY=$(gh api -H "Authorization: token $GITHUB_TOKEN" /repos/${{ github.repository }}/issues/comments/$COMMENT_ID --jq '.body') | ||
UPDATED_BODY="$CURRENT_BODY | ||
|
||
$status_message" | ||
gh api -X PATCH -H "Authorization: token $GITHUB_TOKEN" /repos/${{ github.repository }}/issues/comments/$COMMENT_ID -f body="$UPDATED_BODY" | ||
Review comment on lines +103 to +169: General question: do we have a preference for scripting and what language we use? Would we prefer this in Python, for example? It's pretty involved. |
||
|
||
# Update PR comment | ||
- name: Update PR comment for PR creation | ||
if: success() | ||
env: | ||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
COMMENT_ID: ${{ steps.init-comment.outputs.COMMENT_ID }} | ||
run: | | ||
CURRENT_BODY=$(gh api -H "Authorization: token $GITHUB_TOKEN" /repos/${{ github.repository }}/issues/comments/$COMMENT_ID --jq '.body') | ||
UPDATED_BODY="$CURRENT_BODY | ||
|
||
**Creating a PR in veda-config...**" | ||
gh api -X PATCH -H "Authorization: token $GITHUB_TOKEN" /repos/${{ github.repository }}/issues/comments/$COMMENT_ID -f body="$UPDATED_BODY" | ||
Review comment on lines +172 to +182: I might be wrong, but if this step only appends a note to the comment from the step above saying that the PR is being created, why make it a separate step? It seems these 4 extra run lines could go in that step; I'm not sure a separate step adds much value. |
||
|
||
- name: Set up Python | ||
uses: actions/setup-python@v5 | ||
with: | ||
python-version: '3.9' | ||
cache: 'pip' | ||
|
||
# Creates a slim dataset mdx file for each collection based on the dataset config json | ||
- name: Create dataset mdx for given collections | ||
env: | ||
PUBLISHED_COLLECTION_FILES: ${{ steps.publish-collections.outputs.success_collections }} | ||
run: | | ||
pip install -r scripts/requirements.txt | ||
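# success_collections is a comma-separated string; split it into an array first | ||
IFS=',' read -r -a files <<< "$PUBLISHED_COLLECTION_FILES" | ||
for file in "${files[@]}" | ||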
do | ||
python3 scripts/mdx.py "$file" | ||
Review comment: Could the script accept a list of files and do the iteration in that? |
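A rough sketch of what the calling side might look like if the script took every file in one invocation. It assumes the step output is a comma-separated string of paths; the sample paths are made up, and `process_files` is a hypothetical stand-in for a version of `scripts/mdx.py` that accepts several paths, which this PR does not provide.

```shell
#!/usr/bin/env bash
# Sketch only: split the comma-separated step output once, then hand the
# whole list to a single command. `process_files` is a hypothetical stand-in.

# Hypothetical sample value for the step output.
PUBLISHED_COLLECTION_FILES="ingestion-data/dataset-config/a.json,ingestion-data/dataset-config/b.json"

# Split on commas into a real bash array.
IFS=',' read -r -a files <<< "$PUBLISHED_COLLECTION_FILES"

process_files() {
  # Reports how many paths it received, then lists them one per line.
  echo "received $# file(s)"
  printf '%s\n' "$@"
}

result=$(process_files "${files[@]}")
echo "$result"
```

With a script that iterates internally, the `for` loop in the step would collapse to a single call along the lines of `python3 scripts/mdx.py "${files[@]}"`, again assuming the script were changed to accept that.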
||
done | ||
|
||
- name: Set up Git | ||
run: | | ||
git config --global user.name "github-actions[bot]" | ||
git config --global user.email "github-actions[bot]@users.noreply.github.com" | ||
|
||
- name: Clone `veda-config` | ||
env: | ||
VEDA_CONFIG_GH_TOKEN: ${{ secrets.VEDA_CONFIG_GH_TOKEN }} | ||
run: git clone https://${{ env.VEDA_CONFIG_GH_TOKEN }}@github.com/${{ vars.VEDA_CONFIG_REPO_ORG }}/${{ vars.VEDA_CONFIG_REPO_NAME }}.git | ||
|
||
# Creates a PR in veda-config with the following changes: | ||
# 1. the mdx files for all published collections | ||
# 2. updates the stac/raster urls in .env file | ||
# This step needs a GH_TOKEN that has permissions to create a PR in veda-config | ||
- name: Create PR with changes | ||
id: create-pr | ||
env: | ||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
VEDA_CONFIG_GH_TOKEN: ${{ secrets.VEDA_CONFIG_GH_TOKEN }} | ||
COMMENT_ID: ${{ steps.init-comment.outputs.COMMENT_ID }} | ||
PUBLISHED_COLLECTION_FILES: ${{ steps.publish-collections.outputs.success_collections }} | ||
run: | | ||
files_string=$(IFS=$'\n'; echo "${PUBLISHED_COLLECTION_FILES[*]}") | ||
hash=$(echo -n "$files_string" | md5sum | cut -d ' ' -f 1) | ||
NEW_BRANCH="add-dataset-$hash" | ||
cd ${{ vars.VEDA_CONFIG_REPO_NAME }} | ||
git fetch origin | ||
if git ls-remote --exit-code --heads origin $NEW_BRANCH; then | ||
git push origin --delete $NEW_BRANCH | ||
fi | ||
git checkout -b $NEW_BRANCH | ||
|
||
# Update the env vars to staging based on env vars | ||
sed -i "s|${{ vars.ENV_FROM }}|${{ vars.ENV_TO }}|g" .env | ||
cp -r ../datasets/* datasets/ | ||
git add . | ||
git commit -m "Add dataset(s)" | ||
git push origin $NEW_BRANCH | ||
PR_URL=$(GITHUB_TOKEN=$VEDA_CONFIG_GH_TOKEN gh pr create -H $NEW_BRANCH -B develop --title 'Add dataset [Automated workflow]' --body-file <(echo "Add datasets (Automatically created by Github action)")) | ||
|
||
echo "PR_URL=$PR_URL" >> $GITHUB_OUTPUT | ||
echo "PR creation succeeded" | ||
|
||
# Updates the comment with a link to the above PR | ||
- name: Update PR comment with PR creation result | ||
if: success() | ||
env: | ||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
COMMENT_ID: ${{ steps.init-comment.outputs.COMMENT_ID }} | ||
run: | | ||
PR_URL=${{ steps.create-pr.outputs.PR_URL }} | ||
CURRENT_BODY=$(gh api -H "Authorization: token $GITHUB_TOKEN" /repos/${{ github.repository }}/issues/comments/$COMMENT_ID --jq '.body') | ||
UPDATED_BODY="$CURRENT_BODY | ||
|
||
**A PR has been created with the dataset configuration: 🗺️ [PR link]($PR_URL)**" | ||
gh api -X PATCH -H "Authorization: token $GITHUB_TOKEN" /repos/${{ github.repository }}/issues/comments/$COMMENT_ID -f body="$UPDATED_BODY" | ||
|
||
- name: Update PR comment on PR creation failure | ||
if: failure() && steps.create-pr.outcome == 'failure' | ||
env: | ||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
COMMENT_ID: ${{ steps.init-comment.outputs.COMMENT_ID }} | ||
run: | | ||
CURRENT_BODY=$(gh api -H "Authorization: token $GITHUB_TOKEN" /repos/${{ github.repository }}/issues/comments/$COMMENT_ID --jq '.body') | ||
UPDATED_BODY="$CURRENT_BODY | ||
|
||
**Failed ❌ to create a PR with the dataset configuration. 😔**" | ||
gh api -X PATCH -H "Authorization: token $GITHUB_TOKEN" /repos/${{ github.repository }}/issues/comments/$COMMENT_ID -f body="$UPDATED_BODY" | ||
|
||
# If the workflow fails at any point, the PR comment will be updated | ||
- name: Update PR comment on overall workflow failure | ||
if: failure() && steps.create-pr.outcome != 'failure' | ||
env: | ||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
COMMENT_ID: ${{ steps.init-comment.outputs.COMMENT_ID }} | ||
run: | | ||
WORKFLOW_URL="${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}" | ||
CURRENT_BODY=$(gh api -H "Authorization: token $GITHUB_TOKEN" /repos/${{ github.repository }}/issues/comments/$COMMENT_ID --jq '.body') | ||
UPDATED_BODY="$CURRENT_BODY | ||
|
||
**❌ The workflow run failed. [See logs here]($WORKFLOW_URL)**" | ||
gh api -X PATCH -H "Authorization: token $GITHUB_TOKEN" /repos/${{ github.repository }}/issues/comments/$COMMENT_ID -f body="$UPDATED_BODY" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
<Block> | ||
Review comment: Could you explain what the purpose of this file is? From what I can tell, we'll generate a file in the form:
Does this filler text get changed elsewhere? |
||
<Prose> | ||
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. | ||
</Prose> | ||
</Block> |
Review comment: I've seen a few of our other workflows define `shell: bash`. Worth doing for these `run`s?
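For context on what this would change, as far as I know: on Linux runners a `run` step without an explicit `shell` is executed with `bash -e {0}`, while `shell: bash` uses `bash --noprofile --norc -eo pipefail {0}`. The practical difference is `pipefail`. The sketch below shows the effect in plain bash; the pipeline is a made-up example, not taken from the workflow.

```shell
#!/usr/bin/env bash
# Sketch only: how `pipefail` changes the exit status of a pipeline.

# Without pipefail, the pipeline's status is that of the last command.
set +o pipefail
false | cat
without_pipefail=$?

# With pipefail, a failure anywhere in the pipeline is reported.
set -o pipefail
false | cat
with_pipefail=$?
set +o pipefail

echo "without pipefail: $without_pipefail"
echo "with pipefail: $with_pipefail"
```

This matters for steps that pipe one command into another, such as `echo "$COMMENTS" | jq -r '.id' | head -n 1`: with `pipefail`, a failing `jq` would fail the step instead of passing silently.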