pageserver: reduce per-timeline histogram metrics #8223

Closed
jcsp opened this issue Jul 1, 2024 · 0 comments · Fixed by #8245
Labels: a/tech_debt (Area: related to tech debt), c/storage/pageserver (Component: storage: pageserver)

jcsp commented Jul 1, 2024

The per-timeline, per-op-type histograms of page_service latencies make up the vast majority of the metrics output from pageservers, and are very rarely used. We already have node-wide versions of these stats.

Let's keep a per-timeline getpage latency, and drop the rest.
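
For a rough sense of why these histograms dominate the scrape, here is a back-of-the-envelope sketch. It relies only on the fact that a Prometheus histogram exposes one series per finite bucket, plus `+Inf`, `_sum`, and `_count`. The timeline count, op-type count, and bucket count below are made-up illustrative numbers, not measurements from a real pageserver.

```python
def histogram_series(num_buckets: int) -> int:
    """Series exposed by one Prometheus histogram: finite buckets, +Inf, _sum, _count."""
    return num_buckets + 3


def per_timeline_series(timelines: int, op_types: int, num_buckets: int) -> int:
    """Total series when every (timeline, op type) pair gets its own histogram."""
    return timelines * op_types * histogram_series(num_buckets)


# Hypothetical numbers, for illustration only.
TIMELINES = 10_000
OP_TYPES = 5
BUCKETS = 20

before = per_timeline_series(TIMELINES, OP_TYPES, BUCKETS)
after = per_timeline_series(TIMELINES, 1, BUCKETS)  # per-timeline getpage only

print(f"before: {before} series, after: {after} series")
```

Under these assumptions the per-timeline series count shrinks by a factor equal to the number of op types, since only one op type keeps a per-timeline histogram.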

jcsp added the c/storage/pageserver and a/tech_debt labels Jul 1, 2024
jcsp self-assigned this Jul 1, 2024
jcsp added a commit that referenced this issue Jul 3, 2024
## Problem

We record detailed histograms for all page_service op types, which
mostly aren't very interesting, but make our prometheus scrapes huge.

Closes: #8223 

## Summary of changes

- Only track GetPageAtLsn histograms at per-timeline granularity. For
all other operation types, rely on existing node-wide histograms.
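
The change described above might be sketched roughly as follows. This is not the pageserver's actual code: the class, the `get_page_at_lsn` and `get_rel_size` label strings, and the use of plain lists as stand-ins for histograms are all hypothetical, chosen only to show the routing decision.

```python
from collections import defaultdict

# Hypothetical label name; the only op type kept per-timeline is GetPageAtLsn.
PER_TIMELINE_OPS = {"get_page_at_lsn"}


class LatencyRecorder:
    """Toy recorder: lists of samples stand in for real histograms."""

    def __init__(self):
        self.node_wide = defaultdict(list)     # keyed by op type
        self.per_timeline = defaultdict(list)  # keyed by (timeline_id, op type)

    def observe(self, timeline_id: str, op: str, seconds: float) -> None:
        # Every op is always recorded in the node-wide histogram.
        self.node_wide[op].append(seconds)
        # Only the allow-listed op gets a per-timeline histogram.
        if op in PER_TIMELINE_OPS:
            self.per_timeline[(timeline_id, op)].append(seconds)


recorder = LatencyRecorder()
recorder.observe("timeline-a", "get_page_at_lsn", 0.002)
recorder.observe("timeline-a", "get_rel_size", 0.001)  # hypothetical op name
recorder.observe("timeline-b", "get_rel_size", 0.003)
```

Here the number of per-timeline series grows only with the number of timelines, while all other op types contribute a fixed number of node-wide series.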
jcsp closed this as completed in #8245 Jul 3, 2024