Use streaming compression for files. #47

Open

jpeach wants to merge 1 commit into main

Conversation

@jpeach (Contributor) commented Jun 20, 2021

Rather than read large files into memory and process them multiple
times (for hashing and for compression), use streaming compression
for files so that only the compressed output needs to be fully in
memory.

This has the side-effect of the object ID being generated by hashing
the compressed text (which is better because there's less to hash after
compression). This means that the function image needs to be regenerated
to match.

This updates #45.
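In sketch form, the change looks roughly like this (illustrative only; the package name, helper name, and the SHA-256 choice are assumptions, not necessarily what llama uses):

```go
package store

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"io"

	"github.com/klauspost/compress/zstd"
)

// compressAndHash streams obj through a zstd encoder so that only the
// compressed output is held fully in memory, then derives the object ID
// by hashing that compressed output.
func compressAndHash(obj io.Reader) (id string, compressed []byte, err error) {
	var buf bytes.Buffer
	encoder, err := zstd.NewWriter(&buf)
	if err != nil {
		return "", nil, err
	}
	if _, err := io.Copy(encoder, obj); err != nil {
		encoder.Close()
		return "", nil, err
	}
	// Close flushes the final zstd frame into buf.
	if err := encoder.Close(); err != nil {
		return "", nil, err
	}
	sum := sha256.Sum256(buf.Bytes())
	return hex.EncodeToString(sum[:]), buf.Bytes(), nil
}
```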

@nelhage (Owner) commented Jun 21, 2021

Seems like there's a test failure -- can you take a look?

This is also somehow causing OOMs for my LLVM build (again using remote preprocessing); I haven't figured out why yet.

@nelhage (Owner) left a comment

Left a few line-by-line comments. Still investigating the memory issue.

```go
	}

	if _, err := io.Copy(encoder, obj); err != nil {
		encoder.Close()
```
@nelhage (Owner):

use defer encoder.Close() above to ensure this is closed on every exit path
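For example, a sketch of that restructuring (the `closed` guard avoids a double Close while still flushing before the compressed bytes are read):

```go
	encoder, err := zstd.NewWriter(&buf)
	if err != nil {
		return "", err
	}
	closed := false
	defer func() {
		// Runs on every exit path; releases the encoder on early returns.
		if !closed {
			encoder.Close()
		}
	}()

	if _, err := io.Copy(encoder, obj); err != nil {
		return "", err
	}
	// Flush the final frame before the compressed output is used.
	if err := encoder.Close(); err != nil {
		return "", err
	}
	closed = true
```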

```
@@ -30,7 +31,9 @@ type GetRequest struct {
 var ErrNotExists = errors.New("Requested object does not exist")

 type Store interface {
 	Store(ctx context.Context, obj []byte) (string, error)
 	StoreBytes(ctx context.Context, obj []byte) (string, error)
```
@nelhage (Owner):
By default I would only have Store, and would make a StoreBytes helper somewhere that wraps the bytes in a bytes.Buffer.

If an implementation wants to do something more efficient when it knows it has a byte buffer, it can do a type assertion to *bytes.Buffer to check for that.
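Something like this sketch (the free-function shape is the suggestion above; the `io.Reader` signature for Store is inferred from the bytes.Buffer wrapping, and none of this is code from the PR):

```go
package store

import (
	"bytes"
	"context"
	"io"
)

type Store interface {
	// Store reads the object from r and returns its ID.
	Store(ctx context.Context, r io.Reader) (string, error)
}

// StoreBytes is a plain helper rather than a second interface method.
func StoreBytes(ctx context.Context, s Store, obj []byte) (string, error) {
	return s.Store(ctx, bytes.NewBuffer(obj))
}

// Inside an implementation's Store, a whole-buffer fast path can be
// recovered with a type assertion:
//
//	if buf, ok := r.(*bytes.Buffer); ok {
//		// operate on buf.Bytes() directly
//	}
```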

```go
	if file.Local.Path != "" {
		pfile, err := files.ReadFile(ctx, store, file.Local.Path)
		switch err {
		case nil:
```
@nelhage (Owner):
Just use an if test for a nilness check.
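For example (a sketch of the same branch with an if test):

```go
	if file.Local.Path != "" {
		pfile, err := files.ReadFile(ctx, store, file.Local.Path)
		if err != nil {
			return err // stand-in; keep whatever handling the switch cases did
		}
		// ... use pfile ...
	}
```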

@jpeach (Contributor, Author) commented Jun 21, 2021

> Seems like there's a test failure -- can you take a look?

yeah, I broke some internal contract ... will take a look next weekend :)

> This is also somehow causing OOMs for my LLVM build (again using remote preprocessing); I haven't figured out why yet.

That's weird! For me with (partial) local preprocessing this makes memory usage nice and stable.

@nelhage (Owner) commented Jun 21, 2021

Hm, I think I understand the OOM now. For remote preprocessing, we see the same header files many, many, many times; with this change, we compress each time before we hash and look at the upload cache, which means we actually generate many times more garbage in that case than we did previously, and I think we end up in a similar situation where the GC fails to keep up.

I'm also not sure if zstd is deterministic – is there a risk that we end up uploading multiple versions of the same file if they get compressed differently?

@nelhage (Owner) commented Jun 22, 2021

Looking a bit more, it looks like zstd has a massive per-encoder memory footprint -- at least a few MiB. For remote preprocessing, early builds upload hundreds of header files in one go, which results in us trying to create a new encoder for each one concurrently. It might make sense to rate-limit compression to one job per core or something, anyways, which would help with that…
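Such a limit could be as simple as a counting semaphore sized to the core count (a sketch; the names are illustrative):

```go
package store

import (
	"bytes"
	"io"
	"runtime"

	"github.com/klauspost/compress/zstd"
)

// At most one in-flight encoder per core bounds peak encoder memory.
var compressSlots = make(chan struct{}, runtime.GOMAXPROCS(0))

func compress(obj io.Reader) ([]byte, error) {
	compressSlots <- struct{}{}        // acquire a slot
	defer func() { <-compressSlots }() // release on every exit path

	var buf bytes.Buffer
	encoder, err := zstd.NewWriter(&buf)
	if err != nil {
		return nil, err
	}
	if _, err := io.Copy(encoder, obj); err != nil {
		encoder.Close()
		return nil, err
	}
	if err := encoder.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}
```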

@jpeach (Contributor, Author) commented Jun 22, 2021

> Looking a bit more, it looks like zstd has a massive per-encoder memory footprint -- at least a few MiB. For remote preprocessing, early builds upload hundreds of header files in one go, which results in us trying to create a new encoder for each one concurrently. It might make sense to rate-limit compression to one job per core or something, anyways, which would help with that…

Maybe using a sync.Pool would help with that, but based on your comments about the remote compile use case, I probably want to revisit this PR. It's starting to feel like it causes more problems than it solves :)
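Roughly this shape, leaning on the klauspost encoder's support for Reset (a sketch, not tested against this branch):

```go
import (
	"io"
	"sync"

	"github.com/klauspost/compress/zstd"
)

var encoders = sync.Pool{
	New: func() interface{} {
		// A nil writer is allowed; Reset points the encoder at a real
		// destination before each use.
		enc, _ := zstd.NewWriter(nil)
		return enc
	},
}

func compressPooled(dst io.Writer, obj io.Reader) error {
	enc := encoders.Get().(*zstd.Encoder)
	defer encoders.Put(enc)

	enc.Reset(dst)
	if _, err := io.Copy(enc, obj); err != nil {
		return err
	}
	return enc.Close()
}
```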

I'm wondering whether just using mmap (for larger files) might be a better way to deal with memory costs of reading the inputs.
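For reference, the mmap idea might look like this via golang.org/x/sys/unix (hypothetical, Unix-only, and not part of this PR):

```go
package store

import (
	"os"

	"golang.org/x/sys/unix"
)

// mapFile maps a file read-only so the kernel pages it in on demand
// instead of the process holding a full copy on the Go heap. The
// returned cleanup must be called once the data is no longer needed.
// (Zero-length files would need a special case; Mmap rejects length 0.)
func mapFile(path string) (data []byte, cleanup func() error, err error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, nil, err
	}
	// The mapping stays valid after the descriptor is closed.
	defer f.Close()

	st, err := f.Stat()
	if err != nil {
		return nil, nil, err
	}
	data, err = unix.Mmap(int(f.Fd()), 0, int(st.Size()),
		unix.PROT_READ, unix.MAP_PRIVATE)
	if err != nil {
		return nil, nil, err
	}
	return data, func() error { return unix.Munmap(data) }, nil
}
```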

@aidansteele commented

@nelhage Are you able to share your setup for compiling LLVM using llama? I have a few ideas regarding perf improvements that I'd like to try out and it would be good to have a similar baseline to what you currently see.

@nelhage (Owner) commented Aug 17, 2021

Sure! The blog post should have most of it (https://blog.nelhage.com/post/building-llvm-in-90s/); I'm building on an AMD Ryzen 9 3900X 12-core / 24-thread processor, on Sonic fiber internet (but on wifi on my desktop). Client is Ubuntu 20.04 Focal. What else would be helpful for you?
