[auto-ranged PUT]: CompleteMultipartUpload called while 1 multipart upload is missing #269
Comments
Hi, thank you for reporting the issue. Could you please attach the full CRT logs and provide some reproduction steps? Thank you!
@waahm7 - the problem is not easily reproducible; it seems to be timing-related. So far, after tens of thousands of batch jobs, we have data only for this one instance. (It is possible that this has happened more frequently in production, since batch jobs are retried on failure.) With regard to the CRT logs, we log at
[ERROR] 2023-02-15 16:39:26.623 S3MetaRequest [140468667021056] id=0x7fc125951d00 Meta request cannot recover from error 14343 (Invalid response status from request). (request=0x7fc15e00a600, response status=400)
terminate called after throwing an instance of 'av::CheckException'
what(): Check failure at perception/s2a/dataset_extraction/data_extraction_module.cc:574:
Expected: 'x is ok', with x := 'output_blob_->close()' [av::status::Status]
x = PutObject() failed
where: cloud/aws/s3/s3_streambuf.cc:93
extra: s3://perception-prod-training-data/opt/a831200c/s2a/2023-02-15-bless-collect_dking_updateOverlapFeb10_latestIssues/test/36c6f8923fe514d6b5a28ac5dbdea034.rats: HTTP response code: 400
Resolved remote host IP address:
Request ID: 2BAQ8WRZGGH3PPG1
[... rest as above] The following is the summary of a multi-week-long effort to narrow down the cause of the problem:
Since the
|
Thank you for the details. Is the request getting paused and resumed later?
No, it does not use |
@grrtrr there have been some updates since this issue was opened. Are you still running into this with the latest version of this repo?
@jmklix what are the updates and how do they fix the issue described here? In particular, have test cases been added to ensure the condition does not happen?
CompleteMultipartUpload in auto-ranged-PUT failed due to a missing second (and final) UploadPart.
We ran into this problem with aws-c-s3 0.1.51 and aws-sdk-cpp 1.10.54 on Linux.
Problem description
Our API issued an S3CrtClient->PutObject, and it resulted in the following error:
The PutObject in frame 10 invokes the S3CrtClient->PutObject call.
Further investigation showed that the first UploadPart succeeded (visible in list-parts), but there was no evidence (neither list-parts nor API logs) that the second UploadPart completed.
In our logs, there was no further aws-c-s3 error indicating a failed operation.
Open Question
The CompleteMultipartUpload request uses the ETags of the completed requests, so how could the CompleteMultipartUpload have been sent with the ETag of the second UploadPart?
Perhaps it was sent with only 1 ETag (that of the first, successfully completed UploadPart).
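To make the open question concrete, here is a minimal sketch of the kind of pre-flight check that would catch a CompleteMultipartUpload built from an incomplete ETag list. It is not taken from aws-c-s3 or aws-sdk-cpp: the `CompletedPart` struct and the `ExpectedPartCount` and `MissingParts` helpers are hypothetical names, and the part size is assumed to be known to the caller.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record of one finished UploadPart (not an aws-c-s3 type).
struct CompletedPart {
    int part_number;   // 1-based part number, as used by the S3 multipart API
    std::string etag;  // ETag returned in the UploadPart response
};

// Number of parts needed to cover object_size bytes with a fixed part_size.
// Assumption: every part except the last has exactly part_size bytes.
std::uint64_t ExpectedPartCount(std::uint64_t object_size, std::uint64_t part_size) {
    if (part_size == 0) {
        return 0;  // no meaningful answer; the caller should treat this as an error
    }
    return object_size / part_size + (object_size % part_size != 0 ? 1 : 0);
}

// Returns the part numbers in [1, expected_parts] for which no non-empty ETag
// was recorded. An empty result means the ETag list is complete.
std::vector<int> MissingParts(const std::vector<CompletedPart>& parts, int expected_parts) {
    if (expected_parts <= 0) {
        return {};
    }
    std::vector<bool> seen(static_cast<std::size_t>(expected_parts) + 1, false);
    for (const auto& part : parts) {
        const bool in_range = part.part_number >= 1 && part.part_number <= expected_parts;
        if (in_range && !part.etag.empty()) {
            seen[static_cast<std::size_t>(part.part_number)] = true;
        }
    }
    std::vector<int> missing;
    for (int n = 1; n <= expected_parts; ++n) {
        if (!seen[static_cast<std::size_t>(n)]) {
            missing.push_back(n);
        }
    }
    return missing;
}
```

For the case described above (two expected parts, only the first ETag recorded), `MissingParts` would report part 2, so the request could be failed locally with a clear error instead of being sent and rejected with HTTP 400.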