Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🦬 Big file fixes #90

Merged
merged 3 commits into from
Sep 26, 2023
Merged

🦬 Big file fixes #90

merged 3 commits into from
Sep 26, 2023

Conversation

rowanseymour
Copy link
Member

Apologies for the delay getting to this, but I think this finally addresses #45

  • Fix erroring when creating archives bigger than 5GB
  • Fix checking S3 uploads so that we always check size, but only check hash for files uploaded as single part

@tybritten, @mzealey please take a look

@codecov
Copy link

codecov bot commented Sep 26, 2023

Codecov Report

Merging #90 (7bbf668) into main (13c73b1) will decrease coverage by 0.12%.
The diff coverage is 50.00%.

@@            Coverage Diff             @@
##             main      #90      +/-   ##
==========================================
- Coverage   68.00%   67.88%   -0.12%     
==========================================
  Files           6        6              
  Lines        1222     1227       +5     
==========================================
+ Hits          831      833       +2     
- Misses        286      288       +2     
- Partials      105      106       +1     
Files Coverage Δ
archives/archives.go 72.59% <ø> (+0.32%) ⬆️
archives/config.go 100.00% <100.00%> (ø)
archives/messages.go 62.71% <28.57%> (-1.09%) ⬇️
archives/runs.go 50.27% <28.57%> (-0.86%) ⬇️
archives/s3.go 64.70% <58.33%> (+0.26%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@mzealey
Copy link

mzealey commented Sep 26, 2023

So I'm not convinced that this really addresses the issue we see in #87 / #88 - because it seems like if your S3 buckets are encrypted then ETag != MD5. So I think a bypass flag is also needed, although I agree that the size check are a good failsafe to have.

@tybritten
Copy link
Contributor

Any reason not to calculate the Etag locally before uploading to retain the same functionality >5GB?

@rowanseymour
Copy link
Member Author

rowanseymour commented Sep 26, 2023

@mzealey ahh so I guess it's working for us because we're using SSE-S3 encryption but you're using SSE-KMS? I'm not against adding that flag but obviously not checking any hashes is a little risky.

@tybritten we could lean on https://github.com/peak/s3hash/blob/master/s3hash.go to calculate the ETag.. but will that even be correct if the user is using a different encryption type?

@mzealey
Copy link

mzealey commented Sep 26, 2023

Yes apparently we use KMS. Given that according to the doc referenced in #87, 2 of the 3 options (KMS, and >5gb) DON'T do MD5 (and I can't see any docs to detail what they do use) it seems that a bypass (by default??) would be useful.

@rowanseymour
Copy link
Member Author

Yes apparently we use KMS. Given that according to the doc referenced in #87, 2 of the 3 options (KMS, and >5gb) DON'T do MD5 (and I can't see any docs to detail what they do use) it seems that a bypass (by default??) would be useful.

Well this PR already addresses the > 5GB case. I don't think we want no-checks to be the default because that is inherently riskier but we should add something to the docs to recommend people use SSE-S3 encryption - and I'll make the checking optional too.

@tybritten
Copy link
Contributor

That makes sense, IMHO

@rowanseymour rowanseymour merged commit 410c77e into main Sep 26, 2023
5 of 7 checks passed
@rowanseymour rowanseymour deleted the big_file_fix branch September 26, 2023 16:59
@github-actions github-actions bot locked and limited conversation to collaborators Sep 26, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants