Add logic to fetch prometheus rules from clusters #290

Open: wants to merge 1 commit into base branch main

mlguerrero12

Prometheus rules are fetched from managed clusters and subsequently used to fill out the alarm dictionaries.

openshift-ci bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) on Oct 31, 2024

openshift-ci bot commented Oct 31, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from mlguerrero12. For more information see the Kubernetes Code Review Process.


mlguerrero12 force-pushed the filloutdict branch 2 times, most recently from 8971136 to 5637e64, on November 1, 2024 08:56
mlguerrero12 changed the title from "[WIP] Add logic to fetch prometheus rules from clusters" to "Add logic to fetch prometheus rules from clusters" on Nov 1, 2024
openshift-ci bot removed the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) on Nov 1, 2024
Collaborator

@pixelsoccupied pixelsoccupied left a comment

Good stuff! Thanks!

internal/service/alarms/internal/dictionary/dictionary.go (outdated)
internal/service/alarms/internal/dictionary/dictionary.go (outdated)
internal/service/alarms/internal/dictionary/dictionary.go (outdated)
internal/service/alarms/internal/dictionary/dictionary.go (outdated)
internal/service/alarms/internal/dictionary/dictionary.go (outdated)
internal/service/alarms/internal/dictionary/suite_test.go (outdated)
internal/clients/clients.go
Prometheus rules are obtained from managed clusters to
subsequently fill out the alarm dictionaries

Signed-off-by: Marcelo Guerrero <[email protected]>
Comment on lines +27 to +28
secretTypeLabel = "hive.openshift.io/secret-type"
secretTypeLabelValue = "kubeconfig"
Collaborator

Could you add a comment explaining where these values come from, to make it clear for future readers?

Also, are the timeouts related to these two constants? If not, maybe move them into a new const block (nit).

managedClusterVersionLabel = "openshiftVersion-major-minor"
localClusterLabel = "local-cluster"

resourceTypeCluster = "Cluster"
Collaborator

nit: maybe put this in a separate block, since it is unrelated to the ACM values (this is O-RAN-specific)? Also, maybe add a comment explaining what this is and noting that we will have more types in the future.

}
}

return rules, nil
Collaborator

Maybe add a log line here that says which cluster, the version, and the rule count?
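
For illustration, a small sketch of such a log line using `log/slog` key/value pairs, in the same style as the `slog.Error` call elsewhere in this PR. The helper name `rulesLogArgs`, the message text, and the sample values are assumptions, not code from the PR.

```go
package main

import (
	"log/slog"
	"os"
)

// rulesLogArgs builds the key/value pairs for one "rules fetched" record.
// Hypothetical helper; in the PR this could simply be inlined before the return.
func rulesLogArgs(cluster, version string, count int) []any {
	return []any{"cluster", cluster, "version", version, "count", count}
}

func main() {
	logger := slog.New(slog.NewTextHandler(os.Stdout, nil))

	// Sample values only; the real ones would come from the managed cluster
	// and from len(rules).
	logger.Info("fetched prometheus rules", rulesLogArgs("cluster-a", "4.16", 42)...)
}
```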

Comment on lines +121 to +122
versionSelector, _ := labels.NewRequirement(managedClusterVersionLabel, selection.Equals, []string{version})
localClusterRequirement, _ := labels.NewRequirement(localClusterLabel, selection.NotEquals, []string{"true"})
Collaborator

Can we return these errors instead of discarding them?

var rules []monitoringv1.Rule
for _, promRule := range promRules {
	for _, group := range promRule.Spec.Groups {
		rules = append(rules, group.Rules...)
Collaborator

Might be helpful to print out the alert names?
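
One way this could look, sketched under assumptions: `rule` is a trimmed-down, hypothetical stand-in for `monitoringv1.Rule` that keeps only the `Alert` and `Record` fields, and `alertNames` is an illustrative helper. Recording rules have an empty `Alert`, so they are skipped.

```go
package main

import "fmt"

// rule is a hypothetical stand-in for monitoringv1.Rule, reduced to the
// two name fields that matter for this sketch.
type rule struct {
	Alert  string
	Record string
}

// alertNames returns the names of alerting rules, skipping recording rules
// (which carry a Record name and an empty Alert).
func alertNames(rules []rule) []string {
	names := make([]string, 0, len(rules))
	for _, r := range rules {
		if r.Alert == "" {
			continue
		}
		names = append(names, r.Alert)
	}
	return names
}

func main() {
	// Sample data for illustration only.
	rules := []rule{
		{Alert: "KubeNodeNotReady"},
		{Record: "instance:cpu:rate5m"},
		{Alert: "etcdNoLeader"},
	}
	fmt.Println(alertNames(rules))
}
```

The resulting slice could be attached to a debug-level log record so that normal runs stay quiet.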

Comment on lines +51 to +58
hubClient, err := clients.NewClientForHub()
if err != nil {
	slog.Error("error creating client for hub", "error", err)
	os.Exit(1)
}

alarmsDict := dictionary.New(hubClient)
alarmsDict.Load(ctx)
Collaborator

Is there any cleanup we need to do after this, such as closing the client?

3 participants