
Only support compressed reads if the compression setting is present #8238

Merged 2 commits into main on Jul 3, 2024

Conversation

@arpad-m (Member) commented Jul 2, 2024:

PR #8106 was created under the assumption that no blob is larger than 256 MiB. Since #7852 we check writes of blobs larger than that limit, but we didn't check reads of such large blobs: in theory, we could be reading such blobs every day while simply never happening to write any new ones.

Therefore, we now add a warning for reads of such large blobs as well.

To make deploying compression less dangerous, we only assume a blob is compressed if the compression setting is present in the config. This also means that we can't back out of compression once we've enabled it.

Part of #5431
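
A hedged sketch of the gating described above; the bit values follow the header discussion later in this thread, and the function name and signature are illustrative assumptions, not actual pageserver code:

const BYTE_ZSTD: u8 = 0x90; // 4-byte length header, zstd-compressed payload

fn treat_as_compressed(first_len_byte: u8, compression_configured: bool) -> bool {
    // Without the config setting, never interpret the high header bits
    // as a compression marker: a pre-compression blob of 256 MiB or more
    // would otherwise be misread as compressed data.
    compression_configured && (first_len_byte & 0xf0) == BYTE_ZSTD
}
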

@arpad-m requested a review from a team as a code owner July 2, 2024 17:06
@arpad-m requested a review from problame July 2, 2024 17:06
@arpad-m (Member Author) commented Jul 2, 2024:

I'd like most of the changes in this PR to be temporary: that is, eventually the compression setting should disappear and we will assume any blob larger than 256 MiB must be a compressed blob. However, that change can happen months in the future.

github-actions bot commented Jul 2, 2024:

3000 tests run: 2885 passed, 0 failed, 115 skipped (full report)


Code coverage* (full report)

  • functions: 32.7% (6934 of 21215 functions)
  • lines: 50.0% (54354 of 108614 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
4220e2f at 2024-07-02T17:57:34.054Z :recycle:

@arpad-m requested a review from koivunej July 3, 2024 09:10
Comment on lines +89 to +91
if compression_bits > BYTE_UNCOMPRESSED {
    warn!("reading key above future limit ({len} bytes)");
}
@koivunej (Contributor) commented:

Doesn't this mean that we are reading a previously compressed blob as uncompressed?

Shouldn't we also or alternatively check if the read bytes start with the zstd fourcc/magic?

@koivunej (Contributor) commented Jul 3, 2024:

Anyway, I am having a hard time parsing this compression_bits. I get that here it can be the u32 after ANDing it with 0x0f or 0x7f, which means -- aha, it is never masked.

Related note: On line L332R341 the LEN_COMPRESSION_BIT_MASK is used as literal 0xf0:

-assert_eq!(len_buf[0] & 0xf0, 0);
+assert_eq!(len_buf[0] & LEN_COMPRESSION_BIT_MASK, 0);

Ok... So perhaps I now understand. Possible compression_bits values are:

match compression_bits >> 4 {
  0 => /* image layer written before the compression support or small value? */,
  1..8 => /* reserved */,
  8 => /* uncompressed blob */,
  9 => /* zstd */,
  10..=15 => /* undefined or written before compression support and too large, which we warn here? */
  _ => unreachable!("u4"),
}

If this is correctly understood, then okay, maybe... The compression_bits > BYTE_UNCOMPRESSED just looks so off; in my mind, a bitfield doesn't support ordered comparison. It'd be nice to have enums and matches for these. Err, nope, that cannot be correct.

Did you test that this warning is produced with some hand-crafted image file?

EDIT: rust snippet had 1..8 and 10..=15 wrong way around, possibly.

@arpad-m (Member Author) commented:

> Doesn't this mean that we are reading a previously compressed blob as uncompressed?

If the compression setting is disabled, yes. This has the consequence that we can't turn off compression easily any more, but I think it's okay to have it for a while, after which point we'll (mostly) revert this PR.

> It'd be nice to have enums and matches for these

match is not good for that, as FOOBAR => is equivalent to a variable capture.
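
For reference, a small self-contained illustration of that pitfall, with hypothetical names: in a match pattern, a bare identifier is a fresh binding that matches anything; only a const (or enum variant) actually compares.

const BYTE_ZSTD: u8 = 0x90;

fn classify(bits: u8) -> &'static str {
    match bits {
        // Compares only because BYTE_ZSTD is a const; a non-const name
        // in this position would silently capture every value.
        BYTE_ZSTD => "zstd",
        other => {
            // `other` is a capture: it binds all remaining values.
            let _ = other;
            "not zstd"
        }
    }
}
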

@koivunej (Contributor) commented:

Is my match correct? Did you test if this is hit? Will that be used as a success criteria for the compression support? If so, what is the plan to read all image layers?

Other questions remain: magic/fourcc instead of reserving more bits?

@arpad-m (Member Author) commented:

> Is my match correct?

Almost correct; the 1..8 range doesn't have the highest bit set, so it's an indicator for small uncompressed values.

match compression_bits >> 4 {
  0..8 => /* small, uncompressed value below 128 bytes */,
  8 => /* uncompressed blob */,
  9 => /* zstd */,
  10..=15 => /* reserved or written before compression support and too large, which we warn here */
  _ => unreachable!("u4"),
}

> Will that be used as a success criteria for the compression support?

What do you mean, can you expand?

> If so, what is the plan to read all image layers?

??

> magic/fourcc instead of reserving more bits?

What do you mean by that? I do not think we should autodetect zstd magic here.
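
Putting the thread's description together, a minimal decoding sketch; the constant names follow the diff quoted above, their exact values are assumptions, and this is not the actual reader code:

const LEN_COMPRESSION_BIT_MASK: u8 = 0xf0;
const BYTE_UNCOMPRESSED: u8 = 0x80;
const BYTE_ZSTD: u8 = 0x90;

fn describe_first_header_byte(b: u8) -> &'static str {
    if b & 0x80 == 0 {
        // Highest bit clear: 1-byte header, small value below 128 bytes.
        return "small uncompressed value";
    }
    match b & LEN_COMPRESSION_BIT_MASK {
        BYTE_UNCOMPRESSED => "uncompressed blob with a 4-byte length header",
        BYTE_ZSTD => "zstd-compressed blob",
        // Reserved, or a pre-compression blob that exceeded 256 MiB;
        // this is the case the new warning covers.
        _ => "reserved / overlong pre-compression blob",
    }
}
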

@arpad-m (Member Author) commented:

Christian wants this to be merged as-is, doing that now.

@koivunej (Contributor) commented:

> success criteria
> plan to read all image layers

Christian mentioned these in #8238 (comment). I was wondering similarly what the next steps are.

> magic/fourcc

> What do you mean by that? I do not think we should autodetect zstd magic here.

zstd always starts the compressed bytes with the same 4 magic bytes. I was wondering whether we should use that knowledge instead of awkwardly reserving bits, as I had no idea what the plan for the next step was... But yeah, it seems there is a plan after all.

@problame (Contributor) commented:

For future reference, the slack thread where I proposed to merge asap is https://neondb.slack.com/archives/C074JQU7NER/p1720021276202079

@problame (Contributor) left a comment:

Side question: why is PageServerConf::image_compression an Option? Like with EvictionPolicy, a simple ImageCompressionAlgorithm::Disabled could have been designed as the #[default].


> To make deploying compression less dangerous, we only assume a blob is compressed if the compression setting is present in the config.
> This also means that we can't back out of compression once we've enabled it.

I'm confused by this.

IMO once we have written the first compressed blob to S3, we'll have to support decompressing layers with compressed blobs into perpetuity. The PageServerConf::image_compression config flag should only control whether we write new blobs with compression or not, but not affect reading of existing blobs.

So, I guess I'm rejecting the fundamental idea behind this PR. Or I'm missing something.

@arpad-m (Member Author) commented Jul 3, 2024:

> why is PageServerConf::image_compression an Option?

That's what the Option type is for, to indicate None. But of course it can also be inlined.

> we'll have to support decompressing layers with compressed blobs into perpetuity.

Yes, this PR is only meant for a temporary period until compression is rolled out everywhere. At that point, I'd like to revert most of it. Of course, for the transition period, and maybe also beyond, it might make sense to have the ability to easily revert the config setting.

The risk is that we might be reading 256 MiB or larger blobs in production right now, and on deploying current main we would yield corrupted data. This PR is just to avoid that situation.

Maybe both of your suggestions could be combined, and one could change ImageCompressionAlgorithm to:

enum ImageCompressionAlgorithm {
    DisabledNoDecompress,
    Disabled,
    Zstd { level: Option<i8> },
}

ImageCompressionAlgorithm::DisabledNoDecompress would interpret any blobs larger than 256 MiB as uncompressed, while ImageCompressionAlgorithm::Disabled would still decompress them. DisabledNoDecompress would be the default at the start, for a couple of weeks until compression is rolled out. Then, we can switch to Zstd or Disabled as the default, and remove the DisabledNoDecompress variant.

How does that sound?
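
For illustration, the read path could then branch on the proposed variants roughly like this; a sketch of the suggestion above using the enum as defined in this comment, not merged code:

fn may_decompress(algo: &ImageCompressionAlgorithm) -> bool {
    match algo {
        // Pre-rollout default: treat high length bits as plain length.
        ImageCompressionAlgorithm::DisabledNoDecompress => false,
        // Write uncompressed, but still honor compression bits on read.
        ImageCompressionAlgorithm::Disabled => true,
        ImageCompressionAlgorithm::Zstd { .. } => true,
    }
}
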

@problame (Contributor) commented Jul 3, 2024:

Ok, thanks for clarifying.

I think it has to go like so:

  1. roll out software that guarantees we don't write new blobs >= 256 MiB
  2. then do a full scrubber run, ensuring that all blobs are < 256MiB
  3. invariant now: all length fields have the LEN_COMPRESSION_BIT_MASK bits zeroed, so, we can use them for compression
  4. start shipping/enabling code that interprets the LEN_COMPRESSION_BIT_MASK

But let's take that discussion into Slack.


Regarding this PR, I now understand its importance and we should definitely merge it before the code from #8106 hits prod, because that would mean doing (4) before invariant (3) is established.

@problame (Contributor) left a comment:

Approving this; the decompression code needs to be dead at runtime until we've established invariant (3), see my previous post.

@arpad-m merged commit e0891ec into main on Jul 3, 2024 (64 checks passed)
@arpad-m deleted the arpad/compression_2 branch on July 3, 2024 16:02
arpad-m added a commit that referenced this pull request Jul 4, 2024
As per @koivunej's request in #8238 (comment), use a runtime param instead of monomorphizing the function based on the value.

Part of #5431
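
The change that commit message describes can be pictured as follows; the names are hypothetical stand-ins, not the actual blob writer functions:

// Before: one copy of the function is compiled per const value.
fn write_mono<const COMPRESSED: bool>(data: &[u8]) -> usize {
    if COMPRESSED { compress_len(data) } else { data.len() }
}

// After: a single copy that branches on a runtime parameter.
fn write(data: &[u8], compressed: bool) -> usize {
    if compressed { compress_len(data) } else { data.len() }
}

fn compress_len(data: &[u8]) -> usize {
    // Stand-in for real compression; illustration only.
    data.len() / 2
}
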
arpad-m added a commit that referenced this pull request Jul 4, 2024
This flattens the compression algorithm setting, removing the
`Option<_>` wrapping layer and making handling of the setting easier.

It also adds a specific setting for *disabled* compression with the
continued ability to read compressed data, giving us the option to
more easily back out of a compression rollout, should the need arise,
which was one of the limitations of #8238.

Implements my suggestion from #8238 (comment), inspired by Christian's review in #8238 (review).

Part of #5431