
Properly fix doc generation issue #3137

Open
finestructure opened this issue Jun 14, 2024 · 16 comments · Fixed by SwiftPackageIndex/DocUploader#27

@finestructure (Member)

Follow-up to #3134

We fixed it by rolling back to DocUploader 1.6.3, but that likely re-introduced the swift-metrics issues (#3069) the change was meant to address. Figure out how to use DocUploader 1.7.3 (or a modified version) in the builder with a zipping approach that doesn't create empty archives.

finestructure self-assigned this Jun 14, 2024
@finestructure (Member, Author)

FWIW, recently swift-metrics doc builds and uploads have succeeded:

[screenshot: successful swift-metrics doc builds and uploads]

but the doc upload record is still stuck in the uploading state:

2024-06-22 02:04:00.255569+00 | https://github.com/apple/swift-metrics.git | 578 | 4 | default_branch | main | uploading |   |   | https://gitlab.com/finestructure/swiftpackageindex-builder/-/jobs/7161559981 | a43bfe21-d63d-4069-8a19-da7ac00aaa92 | macos-spm | {"major": 5, "minor": 10, "patch": 0} | 0d9af2fc-20ea-4a11-979d-6ef68bf9d888

[screenshot: doc upload record stuck in the uploading state]

One caveat: while 2.5.0 clearly must have uploaded, the same can't necessarily be concluded for main.

@finestructure (Member, Author)

Ok, main docs are actually a 404:

[screenshot: 404 page for the main docs]

because there's no doc archive in S3:

[screenshot: no doc archive present in S3]

@finestructure (Member, Author)

What's truly bizarre is that main points at the exact same git commit (e0165b53d49b413dd987526b641e05e246782685) as the tag 2.5.0. One works fine, the other doesn't?

@finestructure (Member, Author)

My fix of falling back to OS-level unzip when the library unzip fails doesn't work. While the image we use to build the uploader has unzip, the runtime where it's actually executed doesn't.
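The mismatch above (binary present at build time, missing at run time) could be surfaced early with a simple guard before attempting the fallback. A minimal Foundation-only sketch; the function name is illustrative, not part of DocUploader:

```swift
import Foundation

// Sketch (illustrative name): check whether a fallback binary such as
// /usr/bin/unzip actually exists and is executable in the *runtime*
// environment, rather than assuming it because the build image has it.
func executableExists(atPath path: String) -> Bool {
    FileManager.default.isExecutableFile(atPath: path)
}
```

In the Lambda runtime this check would have returned false for `/usr/bin/unzip`, making the failure mode obvious before shipping the fallback.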

@finestructure (Member, Author)

finestructure commented Jul 2, 2024

Current status:

  • marmelroy/Zip fails to unzip its own product (test_issue_3069)
  • weichsel/ZIPFoundation fails to recursively zip directories (test_zip_roundtrip + test_issue_3069)
  • OS-level zip can't be used, because it's not available in the Lambda runtime
  • SwiftZip/SwiftZip needs explicit file iteration on unzip (doable, but not great), is only at 0.0.5 and seems unmaintained, and fails to create archives in some tests (test_zip_roundtrip + test_issue_3069)

There isn't really a good choice here. Options are:

  1. try and figure out why marmelroy/Zip can't round-trip the metrics archive. I'll give this a timeboxed try to see if I can figure out what exactly it is in the archive that trips this up
  2. use marmelroy/Zip for zipping and weichsel/ZIPFoundation for unzipping. Meh, but by far the easiest option
  3. use weichsel/ZIPFoundation and deal with the recursive directory zipping ourselves. Doable, but I don't love having to mess with lots of path building in a critical piece
  4. we don't have unzip in the Lambda env, but we do have control over zip/unzip in the builder. I.e. we could either generally zip with the OS-level zip (assuming, to be tested, that an OS-level zip of metrics can be unzipped by marmelroy/Zip) or try round-tripping the archive before uploading and then re-zip with /usr/bin/zip if that fails. The latter is probably more complicated than it's worth.

NB: marmelroy/Zip also has a path traversal CVE against it. It doesn't affect us, because we only unzip our own archives, but it's still a warning we could do without. It's been there for weeks, so it's unlikely to be addressed. (It also doesn't look too complicated to provide a patch for, but there's been no activity on the repo, so chances are it wouldn't be merged and we'd end up maintaining yet another fork.)

Going to give 1 a brief try; failing that, 4 seems like our best option.
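The round-trip check from option 4 could look roughly like this. A minimal Foundation-only sketch with hypothetical names, not the actual DocUploader code; the zip/unzip operations are injected as closures so the check stays library-agnostic (in production they might wrap marmelroy/Zip, with the caller re-zipping via /usr/bin/zip on failure):

```swift
import Foundation

// Sketch: round-trip an archive (zip, then unzip) in a temp directory
// before uploading, so a "broken" archive is caught in the builder
// rather than failing later in the Lambda. Returns false if either
// injected operation throws.
func roundTripSucceeds(zip: (URL, URL) throws -> Void,
                       unzip: (URL, URL) throws -> Void,
                       source: URL) -> Bool {
    let fm = FileManager.default
    let tmp = fm.temporaryDirectory.appendingPathComponent(UUID().uuidString)
    defer { try? fm.removeItem(at: tmp) }
    do {
        try fm.createDirectory(at: tmp, withIntermediateDirectories: true)
        let archive = tmp.appendingPathComponent("archive.zip")
        let extracted = tmp.appendingPathComponent("extracted")
        try zip(source, archive)       // e.g. the library's zip call
        try unzip(archive, extracted)  // e.g. the library's unzip call
        return true
    } catch {
        return false
    }
}
```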

@daveverwer (Member)

> There isn't really a good choice here. Options are

  5. Explore other archive options, like tar archives (with zipping, of course), that might open up the packages we could use. For example, from a quick look, https://swiftpackageindex.com/tsolomko/SWCompression looks like it might be a mature yet still maintained package. It supports plain zip files too.

@daveverwer (Member)

For extraction, tar may be on the Lambda image, too.

@finestructure (Member, Author)

tar has the same problem as unzip, unfortunately - it's not available.

I'd also rather not change the format, because it's being used in two different components, making it trickier to deploy. However, SWCompression seems to support zip, so that's definitely another candidate to try with the problematic metrics zip file.

@finestructure (Member, Author)

Unfortunately, SWCompression doesn't seem to have a file/URL API, which is the same problem that SwiftZip/SwiftZip has. The nice thing about marmelroy/Zip is that you can give it a list of file and folder URLs and it handles discovery and archiving of all content, pretty much exactly like the equivalent zip/unzip commands would.

@finestructure (Member, Author)

Ok, the problem is apparently the zero-byte theme-settings.json in that archive. Checking a few other archives, that seems to be unusual. We can certainly make sure we don't leave those in the archive in the builder (although it's odd that the archiver should trip over it, but 🤷‍♂️).

@finestructure (Member, Author)

finestructure commented Jul 2, 2024

Odd, simply moving that file out of and back into the folder makes it round-trippable. This should be easy enough to detect in the builder and report back.
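A builder-side check for suspicious zero-byte files could be sketched like this. Foundation-only, with an illustrative function name; whether the builder should report or repair such files is a separate decision:

```swift
import Foundation

// Sketch: recursively scan a generated doc archive for zero-byte regular
// files (like the empty theme-settings.json suspected here), so the
// builder can detect and report them before zipping.
func emptyFiles(in directory: URL) -> [URL] {
    let keys: Set<URLResourceKey> = [.isRegularFileKey, .fileSizeKey]
    guard let enumerator = FileManager.default.enumerator(
        at: directory, includingPropertiesForKeys: Array(keys)
    ) else { return [] }
    var result: [URL] = []
    for case let url as URL in enumerator {
        guard let values = try? url.resourceValues(forKeys: keys) else { continue }
        if values.isRegularFile == true, values.fileSize == 0 {
            result.append(url)
        }
    }
    return result
}
```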

@finestructure (Member, Author)

I can't reproduce the unzip failure with a newly created doc archive for the same revision in a builder test. Either the env is slightly different or something else has changed since I pulled the "broken" archive from the S3 inbox. FWIW, it didn't seem to be an isolated "broken" archive - the main revision docs failed to upload in multiple attempts. So whatever caused the "broken" archive did so consistently.

I'll redeploy an updated builder with all the latest changes when Xcode 16b2 processing is done to see if we still produce these "broken" archives.

@finestructure (Member, Author)

> Ok, the problem is apparently the zero-byte theme-settings.json in that archive. Checking a few other archives, that seems to be unusual.

We're actually explicitly touching theme-settings.json in order to avoid 404s in the console when viewing DocC docs (#2707). So if this was truly the problem it'd be more widespread. It must have been something else about the archive and it's probably just the fact that I nudged the metadata when moving files that "fixed" the archive.
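For context, the #2707 workaround mentioned above could be sketched like this. A Foundation-only sketch with an illustrative function name, not the actual builder code; writing an empty file mirrors a plain `touch`:

```swift
import Foundation

// Sketch: ensure a theme-settings.json exists in the docs output so the
// DocC renderer's request for it doesn't log a 404 in the browser
// console (#2707). If the file is missing, create it empty.
func ensureThemeSettings(in docsDirectory: URL) throws {
    let url = docsDirectory.appendingPathComponent("theme-settings.json")
    if !FileManager.default.fileExists(atPath: url.path) {
        try Data().write(to: url)
    }
}
```

Since every archive gets this treatment, a zero-byte theme-settings.json alone can't be what breaks the occasional archive, which matches the conclusion above.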

@finestructure (Member, Author)

This is fascinating. I can finally reproduce the round-trip unzipFail exception with swift-metrics. The reason I couldn't initially is that I'd pinned the test to main's reference e0165b53d49b413dd987526b641e05e246782685 so it wouldn't drift with changes to swift-metrics' main. However, that actually makes the round-trip work. It only raises the exception if I check out and run doc gen with the reference main instead of the SHA.

I'll see if the error persists in a fork of swift-metrics so I can still pin it in the test.

@finestructure (Member, Author)

So I've reproduced the zip round-trip issue, fixed it via DocUploader 1.9.0 and builder 4.45.1, and confirmed it's re-zipping. Yet when re-processing swift-metrics with both these fixed versions we get

2024-07-05T15:23:33+0000 warning Lambda : lifecycleIteration=0 [AWSLambdaRuntimeCore] lambda handler returned an error: unzipFail

https://us-east-2.console.aws.amazon.com/cloudwatch/home?region=us-east-2#logsV2:log-groups/log-group/$252Faws$252Flambda$252FDocUploaderLambda-Prod-UploadFunction-GGu55JqjJ0Lz/log-events/2024$252F07$252F05$252F$255B$2524LATEST$255Daf20f64812644dc6ba942134b612469d

It's maddening.

@finestructure (Member, Author)

Ok, it's clear what's going wrong here: it's a path issue when re-zipping the input with zip.
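One way such a path issue can arise is zipping with an absolute staging path, so that entries end up nested under the full path instead of just the archive name. A hedged sketch of how the `/usr/bin/zip` invocation could be constructed to avoid that (hypothetical names, not the actual fix): run zip from the directory containing the doc archive and pass a relative path.

```swift
import Foundation

// Sketch: describe a /usr/bin/zip invocation that stores entries as
// "Foo.doccarchive/..." rather than "/tmp/stage/Foo.doccarchive/...".
// zip stores paths exactly as given on the command line, so running from
// the parent directory with a relative argument keeps entries relative.
struct ZipInvocation: Equatable {
    var currentDirectory: URL  // where the zip process should run
    var arguments: [String]    // arguments to /usr/bin/zip
}

func zipInvocation(archiveDirectory: URL, output: URL) -> ZipInvocation {
    ZipInvocation(
        currentDirectory: archiveDirectory.deletingLastPathComponent(),
        arguments: ["-r", "-q", output.path, archiveDirectory.lastPathComponent]
    )
}
```

A `Process` launching `/usr/bin/zip` would then set `currentDirectoryURL` and `arguments` from this value before running.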
