Reevaluate metrics and alerts exposed by the operator #956

Open
akrejcir opened this issue Apr 12, 2024 · 5 comments
Comments

@akrejcir
Collaborator

Let's reevaluate whether the metrics and alerts exposed by the operator and template-validator are useful.

We have these metrics:

  • kubevirt_ssp_operator_reconcile_succeeded
  • kubevirt_ssp_vm_rbd_block_volume_without_rxbounce
  • kubevirt_ssp_common_templates_restored_total
  • kubevirt_ssp_template_validator_rejected_total

And these alerts:

  • SSPDown
  • SSPTemplateValidatorDown
  • SSPFailingToReconcile
  • SSPHighRateRejectedVms
  • SSPCommonTemplatesModificationReverted
  • VMStorageClassWarning
@kubevirt-bot
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kubevirt-bot added the lifecycle/stale label on Jul 11, 2024
@akrejcir
Collaborator Author

/remove-lifecycle stale

kubevirt-bot removed the lifecycle/stale label on Jul 15, 2024
@akrejcir
Collaborator Author

@machadovilaca, some time ago we were wondering whether all the alerts and metrics in SSP are useful.
Do you know if some of these can be removed? Or is there a place that describes what each one is useful for?

I'm unsure where to start the investigation.
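
One possible starting point, sketched under assumptions: cross-reference the metric names listed above against the expressions of the alerting rules, so that metrics no alert depends on stand out as the first candidates to review. The alert names and PromQL expressions below are hypothetical placeholders, not copied from the SSP rules; they would need to be replaced with the real expressions from the operator's rule definitions. The helper names (`metrics_used`, `unreferenced_metrics`) are also made up for this sketch.

```python
import re

# Metric names as listed in this issue.
METRICS = [
    "kubevirt_ssp_operator_reconcile_succeeded",
    "kubevirt_ssp_vm_rbd_block_volume_without_rxbounce",
    "kubevirt_ssp_common_templates_restored_total",
    "kubevirt_ssp_template_validator_rejected_total",
]

# HYPOTHETICAL alert names and expressions, for illustration only.
# Replace with the real alert expressions before drawing conclusions.
ALERT_EXPRS = {
    "HypotheticalAlertA": "sum(kubevirt_ssp_operator_reconcile_succeeded) == 0",
    "HypotheticalAlertB": "increase(kubevirt_ssp_template_validator_rejected_total[1h]) > 5",
}


def metrics_used(expr, metrics):
    """Return the subset of `metrics` that appear as whole identifiers in `expr`."""
    used = set()
    for name in metrics:
        # Reject matches that are only a fragment of a longer identifier.
        pattern = r"(?<![A-Za-z0-9_:])" + re.escape(name) + r"(?![A-Za-z0-9_:])"
        if re.search(pattern, expr):
            used.add(name)
    return used


def unreferenced_metrics(metrics, alert_exprs):
    """Return metrics (in their original order) that no alert expression mentions."""
    used = set()
    for expr in alert_exprs.values():
        used |= metrics_used(expr, metrics)
    return [name for name in metrics if name not in used]


if __name__ == "__main__":
    for name in unreferenced_metrics(METRICS, ALERT_EXPRS):
        print("not referenced by any alert:", name)
```

A metric that no alert references is not necessarily useless, since it may still feed dashboards or recording rules, so the output is only a list of candidates to look at first.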

@kubevirt-bot
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kubevirt-bot added the lifecycle/stale label on Nov 18, 2024
@akrejcir
Collaborator Author

/remove-lifecycle stale

kubevirt-bot removed the lifecycle/stale label on Nov 18, 2024