Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config: move harbor backup schedule and make it configurable #2310

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

viktor-f
Copy link
Contributor

Warning

This is a public repository, ensure not to disclose:

  • personal data beyond what is necessary for interacting with this pull request, nor
  • business confidential information, such as customer names.

What kind of PR is this?

Required: Mark one of the following that is applicable:

  • kind/feature
  • kind/improvement
  • kind/deprecation
  • kind/documentation
  • kind/clean-up
  • kind/bug
  • kind/other

Optional: Mark one or more of the following that are applicable:

Important

Breaking changes should be marked kind/admin-change or kind/dev-change depending on type
Critical security fixes should be marked with kind/security

  • kind/admin-change
  • kind/dev-change
  • kind/security
  • kind/adr

What does this PR do / why do we need this PR?

Default harbor backup schedule moved to after defaut velero schedule. This should ensure that images needed for resources in velero backup are present in harbor backup.

Also made the schedule configurable.

Information to reviewers

Checklist

  • Proper commit message prefix on all commits
  • Change checks:
    • The change is transparent
    • The change is disruptive
    • The change requires no migration steps
    • The change requires migration steps
    • The change upgrades CRDs
    • The change updates the config and the schema
  • Documentation checks:
  • Metrics checks:
    • The metrics are still exposed and present in Grafana after the change
    • The metrics names didn't change (Grafana dashboards and Prometheus alerts are not affected)
    • The metrics names did change (Grafana dashboards and Prometheus alerts were fixed)
  • Logs checks:
    • The logs do not show any errors after the change
  • Pod Security Policy checks:
    • Any changed pod is covered by Pod Security Admission
    • Any changed pod is covered by Gatekeeper Pod Security Policies
    • The change does not cause any pods to be blocked by Pod Security Admission or Policies
  • Network Policy checks:
    • Any changed pod is covered by Network Policies
    • The change does not cause any dropped packets in the NetworkPolicy Dashboard
  • Audit checks:
    • The change does not cause any unnecessary Kubernetes audit events
    • The change requires changes to Kubernetes audit policy
  • Falco checks:
    • The change does not cause any alerts to be generated by Falco
  • Bug checks:
    • The bug fix is covered by regression tests

@viktor-f viktor-f added the kind/improvement Improvement of existing features, e.g. code cleanup or optimizations. label Oct 14, 2024
@viktor-f viktor-f self-assigned this Oct 14, 2024
Comment on lines +1699 to +1700
`schedule` defines when the backup job for harbor will run.
This should be set to run shortly after velero backups in wc, in order to ensure that images needed for velero backups are backed up in harbor.
Copy link
Contributor

@simonklb simonklb Oct 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`schedule` defines when the backup job for harbor will run.
This should be set to run shortly after velero backups in wc, in order to ensure that images needed for velero backups are backed up in harbor.
`schedule` defines when the backup job for Harbor will run.
This should be set to run shortly after Velero backups in the workload cluster, in order to ensure that images needed for Velero backups are backed up in Harbor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this not also depend on how long time the Velero backup takes to complete?

It would be nice if they were triggered in sequence rather than on two different schedules. Issue worthy?

Copy link
Contributor Author

@viktor-f viktor-f Oct 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it does definitely depend on that. This mostly reflects that most velero backups should be done within 30 mins, it seemed like a sane default.
It would be very nice if that was triggered in sequence, but I'm not sure that it's worth making some controller that would be able to fix that. Especially since this is spread out across two different kubernetes clusters, harbor backups in sc and velero backups in wc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are so hesitant to create new issues to prevent flooding the backlog (which I totally understand) could we find some way of still keep "nice to haves" around? I think even if this isn't something we want to add to the current backlog this is something we want to solve in the future, i.e. that when you create a backup of your cluster all images that are currently in use should also be backed up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i can bring that up to see what we can do 👍

description: |-
`schedule` defines when the backup job for harbor will run.
This should be set to run shortly after velero backups in wc, in order to ensure that images needed for velero backups are backed up in harbor.
default: "30 0 * * *"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be nice to have a pattern for this in the schema

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did manage to find this pattern that seems to work in all general cases, but there might be some edge cases where it fails. But it is very complicated.

^((?<![\d\-\*])((\*\/)?([0-5]?[0-9])((\,|\-|\/)([0-5]?[0-9]))*|\*)[^\S\r\n]+((\*\/)?((2[0-3]|1[0-9]|[0-9]|00))((\,|\-|\/)(2[0-3]|1[0-9]|[0-9]|00))*|\*)[^\S\r\n]+((\*\/)?([1-9]|[12][0-9]|3[01])((\,|\-|\/)([1-9]|[12][0-9]|3[01]))*|\*)[^\S\r\n]+((\*\/)?([1-9]|1[0-2])((\,|\-|\/)([1-9]|1[0-2]))*|\*|(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec))[^\S\r\n]+((\*\/)?[0-6]((\,|\-|\/)[0-6])*|\*|00|(sun|mon|tue|wed|thu|fri|sat)))$|^@(annually|yearly|monthly|weekly|daily|hourly|reboot)$

Is this something we want, or is it just confusing to add this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Kubernetes spec just has this:

        "schedule": {
          "description": "The schedule in Cron format, see https://en.wikipedia.org/wiki/Cron.",
          "type": "string"
        },

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we really want something this seems less complicated:
/(@(annually|yearly|monthly|weekly|daily|hourly|reboot))|(@every (\d+(ns|us|µs|ms|s|m|h))+)|((((\d+,)+\d+|(\d+(\/|-)\d+)|\d+|\*) ?){5,7})/
Source: https://stackoverflow.com/questions/14203122/create-a-regular-expression-for-cron-statement

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm it does indeed look a bit confusing, maybe just adding a link on formatting in the description is good enough

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a link now to all the config options I could find that uses a cron format.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going with link only for now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I was thinking. But I'm open to the option of using the complicated regex as well.

Copy link
Contributor

@OlleLarsson OlleLarsson Oct 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would like to have a regex to get some config validation before the k8s deploy stage. I don't really have a strong opinion on this though!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have now added a regex pattern. I had to exclude "(?<![\d-*])" from the pattern I mentioned above because negative lookahead is not supported in the schema. But I think that is fine.

Here is some validation of the regex pattern that I used: https://regex101.com/r/l56Too/2

helmfile.d/values/harbor/harbor-backup.yaml.gotmpl Outdated Show resolved Hide resolved
description: |-
`schedule` defines when the backup job for harbor will run.
This should be set to run shortly after velero backups in wc, in order to ensure that images needed for velero backups are backed up in harbor.
default: "30 0 * * *"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going with link only for now?

Default harbor backup schedule moved to after defaut velero schedule.
This should ensure that images needed for resources in velero backup are
present in harbor backup.
@viktor-f viktor-f requested a review from a team as a code owner November 6, 2024 07:56
Copy link
Contributor

@aarnq aarnq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one thing else it looks good to me, also do update the capitalisation that Simon noted.

Uses the Cron format, see https://en.wikipedia.org/wiki/Cron.
default: "30 0 * * *"
type: string
pattern: ^(((\*\/)?([0-5]?[0-9])((\,|\-|\/)([0-5]?[0-9]))*|\*)[^\S\r\n]+((\*\/)?((2[0-3]|1[0-9]|[0-9]|00))((\,|\-|\/)(2[0-3]|1[0-9]|[0-9]|00))*|\*)[^\S\r\n]+((\*\/)?([1-9]|[12][0-9]|3[01])((\,|\-|\/)([1-9]|[12][0-9]|3[01]))*|\*)[^\S\r\n]+((\*\/)?([1-9]|1[0-2])((\,|\-|\/)([1-9]|1[0-2]))*|\*|(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec))[^\S\r\n]+((\*\/)?[0-6]((\,|\-|\/)[0-6])*|\*|00|(sun|mon|tue|wed|thu|fri|sat)))$|^@(annually|yearly|monthly|weekly|daily|hourly|reboot)$
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be added as a definition in the root and reused.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/improvement Improvement of existing features, e.g. code cleanup or optimizations.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants