Multithreaded column read from GCS bug #236

Open
willcrichton opened this issue Dec 11, 2018 · 1 comment

Comments

@willcrichton
Member

While working on @Haotianz94's transcript aligner, I found that multithreaded column reads within a single file nondeterministically return incorrect bytes. Sometimes the buffer is too short; sometimes the bytes are corrupted (pickle raises an error).

@willcrichton
Member Author

From the email thread:

I've been debugging this for a few hours. My experiments suggest that there is a race condition in the S3 API. When multiple metadata files are read from S3 in parallel, in separate threads with separate clients, a read of one file will nondeterministically return the bytes of a different file being read at the same time. Literally, executing read(LENGTH); seek(0); read(LENGTH) can return different bytes from the two reads (also nondeterministically).
I'm at a complete loss as to how this is possible. This seems to only happen with metadata files (small, 48 bytes), not with data files. My guess is that the bug is related to reading sufficiently small files. We've never seen this bug before because we've never run jobs with as few outputs per file as Haotian's (I/O packet size = 4).
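For anyone trying to reproduce this, the read(LENGTH); seek(0); read(LENGTH) check can be sketched roughly as below. This is only an illustration under assumptions: it uses local temporary files as stand-ins for the S3 objects, and `read_twice`, `find_mismatches`, and `make_fixture` are hypothetical helper names, not functions from this project. Against local disk it should find no mismatches; the point is the shape of the check, which could be pointed at the real storage backend.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

LENGTH = 48  # size of the small metadata files mentioned above


def read_twice(path, length=LENGTH):
    """Perform read(length); seek(0); read(length) and return both results."""
    with open(path, "rb") as f:  # stand-in for opening a remote object
        first = f.read(length)
        f.seek(0)
        second = f.read(length)
    return first, second


def find_mismatches(paths, expected, workers=8):
    """Read every path from parallel threads and return the paths whose
    bytes differ between the two reads or from the known contents."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(read_twice, paths))
    bad = []
    for path, (first, second) in zip(paths, results):
        if first != second or first != expected[path]:
            bad.append(path)
    return bad


def make_fixture(n=16):
    """Write n small files with distinct contents, so that bytes leaking
    from one file into another read would be detectable."""
    tmpdir = tempfile.mkdtemp()
    expected = {}
    for i in range(n):
        path = os.path.join(tmpdir, f"meta_{i}.bin")
        payload = bytes([i]) * LENGTH
        with open(path, "wb") as f:
            f.write(payload)
        expected[path] = payload
    return expected


if __name__ == "__main__":
    expected = make_fixture()
    print(find_mismatches(list(expected), expected))
```

Giving each file distinct contents matters here: it lets the check tell "bytes from a different file" apart from a merely truncated read.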

Just pushed a workaround that uses multiprocessing instead of multithreading (c275d03). Still need to figure out what the core issue is.
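To illustrate the general idea of swapping threads for processes (this is not the code in c275d03, whose contents are not shown here), a minimal sketch might look like the following. It assumes each worker opens its own handle inside the worker, so no client or file state is shared across workers; `read_file` and `read_all` are hypothetical names, and local temporary files stand in for the remote objects.

```python
import os
import tempfile
from concurrent.futures import ProcessPoolExecutor


def read_file(path):
    """Open and read one file entirely inside the worker, so the handle
    is never shared with any other worker."""
    with open(path, "rb") as f:
        return f.read()


def read_all(paths, executor_cls=ProcessPoolExecutor, workers=4):
    """Read all paths in parallel. Defaults to separate processes; passing
    ThreadPoolExecutor instead gives the multithreaded variant."""
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(read_file, paths))


if __name__ == "__main__":
    tmpdir = tempfile.mkdtemp()
    paths = []
    for i in range(4):
        path = os.path.join(tmpdir, f"meta_{i}.bin")
        with open(path, "wb") as f:
            f.write(bytes([i]) * 48)
        paths.append(path)
    for path, data in zip(paths, read_all(paths)):
        print(os.path.basename(path), len(data))
```

Processes do not share memory by default, so any state that a client library keeps per interpreter is isolated per worker. That sidesteps the symptom but, as noted above, does not explain the underlying cause.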

@willcrichton willcrichton added this to the Strata milestone Feb 21, 2019