
Snapshot (management) metrics #165

Open
ginkel opened this issue Mar 30, 2023 · 6 comments


@ginkel

ginkel commented Mar 30, 2023

Hi there,

We were wondering whether it would make sense to extend prometheus-exporter-plugin-for-opensearch so that it exports additional metrics about snapshots, such as which snapshots have been created and when the last one was created. The main use case would be to monitor whether backups are created regularly (using Snapshot Management), so that alerts can detect disruptions to snapshot creation early.

Do you think that would make a worthwhile addition to the plugin?

Thanks,
Thilo

@lukas-vlcek
Collaborator

lukas-vlcek commented Mar 30, 2023

Is this metric exposed by OpenSearch itself? If so, adding it to the Prometheus exporter would be an easy task. If not, are there at least some relevant metrics already exposed by OpenSearch?

@ginkel
Author

ginkel commented Apr 3, 2023

One could retrieve the registered repositories using a GetRepositoriesRequest and then obtain details about each snapshot using a GetSnapshotsRequest. Exposing a time series per snapshot could be tricky (metric inflation), so one could limit the observed snapshots to the n latest. If a snapshot was created by a Snapshot Management policy, this is indicated by the sm_policy metadata attribute, which the metrics could be grouped by (so that only the metrics of the last snapshot created by the policy are exposed).

In the REST API this maps to:

GET _snapshot

GET _snapshot/<repo_name>/_all
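To make the grouping idea concrete, here is a minimal Python sketch of the two REST calls above and of the reduction to "latest snapshot per policy". It is only an illustration, not how the plugin does or would do it. The helper names (`fetch_json`, `latest_per_policy`, `n_latest`, `collect`) are hypothetical, the cluster is assumed to be reachable without authentication, and the `start_time_in_millis` field is an assumption about the shape of the snapshot response; only the `sm_policy` metadata attribute comes from the description above.

```python
from __future__ import annotations

import json
import urllib.request
from typing import Any


def latest_per_policy(snapshots: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Keep only the most recently started snapshot for each sm_policy value."""
    latest: dict[str, dict[str, Any]] = {}
    for snap in snapshots:
        policy = (snap.get("metadata") or {}).get("sm_policy")
        if policy is None:
            continue  # not created by a Snapshot Management policy
        best = latest.get(policy)
        # Assumption: every snapshot entry carries start_time_in_millis.
        if best is None or snap["start_time_in_millis"] > best["start_time_in_millis"]:
            latest[policy] = snap
    return latest


def n_latest(snapshots: list[dict[str, Any]], n: int) -> list[dict[str, Any]]:
    """Alternative cap on cardinality: only the n most recently started snapshots."""
    ordered = sorted(snapshots, key=lambda s: s["start_time_in_millis"], reverse=True)
    return ordered[:n]


def fetch_json(base_url: str, path: str) -> Any:
    """GET base_url/path and decode the JSON body (no auth, no TLS options)."""
    with urllib.request.urlopen(f"{base_url}/{path}") as resp:
        return json.load(resp)


def collect(base_url: str) -> dict[str, dict[str, dict[str, Any]]]:
    """Map repository name -> policy name -> latest snapshot for that policy."""
    result: dict[str, dict[str, dict[str, Any]]] = {}
    # GET _snapshot is assumed to return an object keyed by repository name.
    for repo in fetch_json(base_url, "_snapshot"):
        body = fetch_json(base_url, f"_snapshot/{repo}/_all")
        result[repo] = latest_per_policy(body.get("snapshots", []))
    return result
```

Calling `collect("http://localhost:9200")` (a placeholder URL) would then yield at most one entry per repository and policy, which keeps the number of exported time series bounded by the number of policies rather than the number of snapshots.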

@sandervandegeijn

Agreed, would be nice to have :)

@patelsmit32123

@lukas-vlcek I would like to take this up. We are implementing something similar in our forked repo, so we can contribute it back. Please let me know if adding snapshot-related metrics is still planned.

@lukas-vlcek
Collaborator

@patelsmit32123 I would love to take a look at any PR :-)

@patelsmit32123

patelsmit32123 commented Aug 9, 2024

@lukas-vlcek PTAL at #295. I have tested the changes on our staging environment and they seem to be working fine.
