Running the following checkpoint code produces the failure shown below:
import torch
import torchvision
import s3torchconnector

# Model checkpointing (REGION, config, and CHECKPOINT_URI are defined elsewhere in the script)
checkpoint = s3torchconnector.S3Checkpoint(region=REGION, s3client_config=config)
model = torchvision.models.resnet18()
# Save to MinIO
with checkpoint.writer(CHECKPOINT_URI + "epoch0.ckpt") as writer:
    torch.save(model.state_dict(), writer)
# Load from MinIO
with checkpoint.reader(CHECKPOINT_URI + "epoch0.ckpt") as reader:
    state_dict = torch.load(reader)
model.load_state_dict(state_dict)
thread '<unnamed>' panicked at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/mountpoint-s3-client-0.11.0/src/s3_crt_client.rs:295:89:
called `Result::unwrap()` on an `Err` value: Error(46, "aws-c-common: AWS_ERROR_SYS_CALL_FAILURE, System call failure.")
stack backtrace:
0: rust_begin_unwind
1: core::panicking::panic_fmt
2: core::result::unwrap_failed
3: mountpoint_s3_client::s3_crt_client::S3CrtClient::new
4: _mountpoint_s3_client::mountpoint_s3_client::MountpointS3Client::new_s3_client
5: _mountpoint_s3_client::mountpoint_s3_client::_::<impl pyo3::impl_::pyclass::PyMethods<_mountpoint_s3_client::mountpoint_s3_client::MountpointS3Client> for pyo3::impl_::pyclass::PyClassImplCollector<_mountpoint_s3_client::mountpoint_s3_client::MountpointS3Client>>::py_methods::ITEMS::trampoline
6: <unknown>
7: _PyObject_MakeTpCall
8: _PyEval_EvalFrameDefault
9: PyObject_CallOneArg
10: _PyObject_GenericGetAttrWithDict
11: PyObject_GetAttr
12: _PyEval_EvalFrameDefault
13: PyEval_EvalCode
14: <unknown>
15: <unknown>
16: _PyRun_SimpleFileObject
17: _PyRun_AnyFileObject
18: Py_RunMain
19: Py_BytesMain
20: __libc_start_call_main
at ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
21: __libc_start_main_impl
at ./csu/../csu/libc-start.c:360:3
22: _start
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Traceback (most recent call last):
File "/home/minio/s3pytorch.py", line 75, in <module>
with checkpoint.writer(CHECKPOINT_URI + "epoch0.ckpt") as writer:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/minio/pytorchtest/lib/python3.12/site-packages/s3torchconnector/s3checkpoint.py", line 60, in writer
return self._client.put_object(bucket, key)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/minio/pytorchtest/lib/python3.12/site-packages/s3torchconnector/_s3client/_s3client.py", line 116, in put_object
return S3Writer(self._client.put_object(bucket, key, storage_class))
^^^^^^^^^^^^
File "/home/minio/pytorchtest/lib/python3.12/site-packages/s3torchconnector/_s3client/_s3client.py", line 65, in _client
self._real_client = self._client_builder()
^^^^^^^^^^^^^^^^^^^^^^
File "/home/minio/pytorchtest/lib/python3.12/site-packages/s3torchconnector/_s3client/_s3client.py", line 83, in _client_builder
return MountpointS3Client(
^^^^^^^^^^^^^^^^^^^
pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: Error(46, "aws-c-common: AWS_ERROR_SYS_CALL_FAILURE, System call failure.")
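For triage, it can help to pull the numeric code and symbolic name out of the panic message so that reports can be grouped or searched. The following is a minimal sketch that uses only the Python standard library; the regular expression and the `parse_crt_error` helper are illustrative assumptions based on the single message format shown above, not part of s3torchconnector or the AWS CRT.

```python
import re

# Pattern written against the one payload format seen in the panic above:
#   Error(46, "aws-c-common: AWS_ERROR_SYS_CALL_FAILURE, System call failure.")
# Other CRT errors may be formatted differently; this is an assumption.
CRT_ERROR_RE = re.compile(
    r'Error\((?P<code>\d+),\s*'
    r'"(?P<library>[\w-]+):\s*'
    r'(?P<name>[A-Z0-9_]+),\s*'
    r'(?P<detail>[^"]*)"\)'
)


def parse_crt_error(message):
    """Return the error fields found in a panic message, or None if absent."""
    match = CRT_ERROR_RE.search(message)
    if match is None:
        return None
    fields = match.groupdict()
    fields["code"] = int(fields["code"])  # numeric code as an int
    return fields


panic_message = (
    "called `Result::unwrap()` on an `Err` value: "
    'Error(46, "aws-c-common: AWS_ERROR_SYS_CALL_FAILURE, System call failure.")'
)
parsed = parse_crt_error(panic_message)
print(parsed)
```

The parsed fields only restate what the log already says. They do not identify which system call failed, so a run with `RUST_BACKTRACE=full`, as the note in the trace suggests, is still the better source of detail.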
Relevant log output
No response
Code of Conduct
I agree to follow this project's Code of Conduct
Thank you for reaching out and providing the detailed information about the issue you are facing. We appreciate your interest in using the S3 Connector for PyTorch.
The error you encountered seems to be related to the interaction between our library and MinIO, an Amazon S3-compatible object storage server. Our project's primary goal is to provide optimized access to Amazon S3, and we do not actively maintain compatibility with other S3-compatible storage systems.
Could you please try to run your example in your environment against Amazon S3 to understand if the issue is a regression in our underlying libraries or a compatibility issue specific to MinIO? This would help us narrow down the root cause of the problem.
While we strive for compatibility with other S3-compatible systems whenever possible, we do not dedicate resources specifically for this purpose. If our library happens to work with a compatible system, that's great, but we cannot guarantee support or prioritize addressing issues related to non-Amazon S3 storage providers.
We appreciate your understanding that our focus is solely on Amazon S3, and we may not be able to assist with issues related to other storage systems. However, if you discover a regression or bug in our library when used with Amazon S3, we will be happy to investigate and address it.
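The comparison requested above (same script, different storage backend) could be organized with a small harness like the sketch below. One detail matters here: `pyo3_runtime.PanicException`, the exception type in the traceback, derives from `BaseException`, so a plain `except Exception` would not catch it. Everything in this sketch is hypothetical scaffolding: `FakePanic` stands in for the real panic type, `fake_round_trip` stands in for the actual writer/reader calls, and the backend names and settings are placeholders.

```python
class FakePanic(BaseException):
    """Hypothetical stand-in for pyo3_runtime.PanicException."""


def compare_backends(attempt, backends):
    """Run `attempt(**settings)` per backend and record the outcome of each."""
    results = {}
    for name, settings in backends.items():
        try:
            attempt(**settings)
        except (KeyboardInterrupt, SystemExit):
            raise  # never swallow interpreter-level exits
        except BaseException as exc:  # panics derive from BaseException
            results[name] = f"{type(exc).__name__}: {exc}"
        else:
            results[name] = "ok"
    return results


def fake_round_trip(simulate_panic):
    # Hypothetical stub: replace with the real checkpoint save/load calls.
    if simulate_panic:
        raise FakePanic("simulated panic")


# Placeholder backends; the flags only drive the stub above and say
# nothing about how any real storage system behaves.
backends = {
    "backend-a": {"simulate_panic": False},
    "backend-b": {"simulate_panic": True},
}
results = compare_backends(fake_round_trip, backends)
print(results)
```

In a real run, `attempt` would build the checkpoint object and perform the save and load against each endpoint, so that one invocation reports which backends complete the round trip.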
s3torchconnector version
s3torchconnector-1.2.7
s3torchconnectorclient version
s3torchconnectorclient-1.2.7
AWS Region
us-east-1
Describe the running environment
Not on EC2