pageserver: add time based image layer creation check #8247

Merged
merged 3 commits into from
Jul 5, 2024

Conversation

VladLazar
Contributor

Problem

Assume a timeline with the following workload: very slow ingest of updates to a small number of keys
that fit within the same partition (as decided by KeySpace::partition). Such tenants will create small
L0 layers due to time-based rolling, and, consequently, the L1 layers will also be small.

Currently, by default, we need to ingest 512 MiB of WAL before checking if an image layer is required.
This scheme works fine under the assumption that L1s are roughly of checkpoint distance size, but
as the first paragraph explained, that's not the case for all workloads.

Summary of changes

Check if new image layers are required at least once every checkpoint timeout interval.
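
To illustrate the idea, here is a minimal sketch of a combined size-or-time trigger. It is not the code from this PR: the type names, field names, and the `should_check` / `mark_checked` helpers are hypothetical, and LSNs are modelled as plain byte offsets. The only value taken from the description above is the 512 MiB default WAL threshold; the timeout is whatever the caller passes in.

```rust
use std::time::{Duration, Instant};

/// 512 MiB of WAL, the default size threshold mentioned in the PR description.
const DEFAULT_WAL_THRESHOLD_BYTES: u64 = 512 * 1024 * 1024;

/// Hypothetical configuration for the check (names are assumptions).
struct CheckConfig {
    /// How much WAL must be ingested before the size-based trigger fires.
    wal_threshold_bytes: u64,
    /// Upper bound on the time between two image layer checks.
    checkpoint_timeout: Duration,
}

/// Hypothetical per-timeline bookkeeping for the last image layer check.
struct ImageLayerCheckState {
    last_check_lsn: u64,
    last_check_at: Instant,
}

impl ImageLayerCheckState {
    fn new(lsn: u64, now: Instant) -> Self {
        Self { last_check_lsn: lsn, last_check_at: now }
    }

    /// Returns true if either enough WAL was ingested since the last check
    /// or at least one checkpoint timeout interval has elapsed.
    fn should_check(&self, cfg: &CheckConfig, current_lsn: u64, now: Instant) -> bool {
        let wal_ingested = current_lsn.saturating_sub(self.last_check_lsn);
        let size_due = wal_ingested >= cfg.wal_threshold_bytes;
        let time_due =
            now.saturating_duration_since(self.last_check_at) >= cfg.checkpoint_timeout;
        size_due || time_due
    }

    /// Records that a check was performed at the given LSN and time.
    fn mark_checked(&mut self, lsn: u64, now: Instant) {
        self.last_check_lsn = lsn;
        self.last_check_at = now;
    }
}
```

With a size-only trigger, a timeline that ingests a few kilobytes per hour would never reach the threshold; with the extra time condition, `should_check` returns true once the timeout has elapsed regardless of how little WAL arrived.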

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

@VladLazar VladLazar requested a review from a team as a code owner July 3, 2024 14:28
@VladLazar VladLazar requested review from petuhovskiy, koivunej and skyzh and removed request for petuhovskiy July 3, 2024 14:28
@VladLazar VladLazar added the run-benchmarks label (Indicates to the CI that benchmarks should be run for PR marked with this label) Jul 3, 2024
koivunej
koivunej previously approved these changes Jul 3, 2024
Contributor

@koivunej koivunej left a comment

This should work. Sadly I cannot see a way to test this, except in staging.

@VladLazar
Contributor Author

> Sadly I cannot see a way to test this, except in staging.

That was my plan: get it onto staging ASAP and validate that the metric goes down.

Member

@skyzh skyzh left a comment

LGTM and I probably also need to modify the metadata image layer generation trigger.

github-actions bot commented Jul 3, 2024

3111 tests run: 2984 passed, 0 failed, 127 skipped (full report)


Flaky tests (1)

Postgres 14

  • test_basebackup_with_high_slru_count[github-actions-selfhosted-sequential-10-13-30]: release

Code coverage* (full report)

  • functions: 32.6% (6931 of 21275 functions)
  • lines: 50.0% (54495 of 108968 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
5c3cdff at 2024-07-05T12:28:31.364Z :recycle:

@VladLazar VladLazar requested review from jcsp, skyzh and koivunej July 4, 2024 09:44
@koivunej koivunej dismissed their stale review July 4, 2024 10:20

Dismissing because I am unsure about the time+size based

@VladLazar VladLazar changed the title pageserver: add time based imake layer creation check pageserver: add time based image layer creation check Jul 4, 2024
Contributor

@jcsp jcsp left a comment

The major change here is that tiny tenants can generate image layers up to every 48 hours, whereas previously they'd never generate any at all: that's okay, but let's monitor it carefully to see how much extra work we're doing.

@VladLazar
Contributor Author

> The major change here is that tiny tenants can generate image layers up to every 48 hours, whereas previously they'd never generate any at all: that's okay, but let's monitor it carefully to see how much extra work we're doing.

Will keep an eye on it

@VladLazar VladLazar force-pushed the vlad/time-based-img-layer-check branch from 2c1884d to 5c3cdff Compare July 5, 2024 10:22
@VladLazar VladLazar merged commit 7dd2e44 into main Jul 5, 2024
69 checks passed
@VladLazar VladLazar deleted the vlad/time-based-img-layer-check branch July 5, 2024 13:02
Labels
run-benchmarks Indicates to the CI that benchmarks should be run for PR marked with this label
4 participants