Deprecation fix: mktemp is deprecated #789
base: master
Conversation
Codecov Report
```
@@           Coverage Diff           @@
##           master     #789   +/-   ##
=======================================
  Coverage   77.90%   77.90%
=======================================
  Files          24       24
  Lines        5826     5826
=======================================
  Hits         4539     4539
  Misses       1287     1287
```
Continue to review full report at Codecov.
Reproduced the Windows test failure locally on macOS; it seems all the other CI jobs were simply skipping the test, as they run without zarr. `mkstemp` is apparently not quite a drop-in replacement for `mktemp`, in particular because it specifically protects its file against concurrent access. But what `to_zarr` really needs is a directory to put its data and metadata in, so `TemporaryDirectory` should do the job here – it only remains to be clarified just how temporary it is, i.e. when the cleanup can safely be done.
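For context, here is a minimal standalone sketch (plain `tempfile` usage, not spectral-cube code) of how the three APIs discussed here differ:

```python
import os
import tempfile

# `mktemp` (deprecated) only returns a candidate path; nothing is created
# on disk, which is why it is race-prone and flagged for removal.
name = tempfile.mktemp()
assert not os.path.exists(name)

# `mkstemp` creates and opens an actual *file* (safe against concurrent
# access), so it is not a drop-in replacement when a directory is wanted.
fd, path = tempfile.mkstemp()
assert os.path.isfile(path)
os.close(fd)
os.remove(path)

# `TemporaryDirectory` yields a real directory that zarr data can be
# written into, plus a `cleanup()` method for explicit removal.
tmpdir = tempfile.TemporaryDirectory()
assert os.path.isdir(tmpdir.name)
tmpdir.cleanup()
assert not os.path.exists(tmpdir.name)
```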
```python
filenum, filename = tempfile.mkstemp()
with dask.config.set(**cube._scheduler_kwargs):
    cube._data.to_zarr(filename)
    cube._data = da.from_zarr(filename)
```
```diff
-filenum, filename = tempfile.mkstemp()
-with dask.config.set(**cube._scheduler_kwargs):
-    cube._data.to_zarr(filename)
-    cube._data = da.from_zarr(filename)
+tmpdir = tempfile.TemporaryDirectory()
+with dask.config.set(**cube._scheduler_kwargs):
+    cube._data.to_zarr(tmpdir.name)
+    cube._data = da.from_zarr(tmpdir.name)
+tmpdir.cleanup()
```
Without the `.cleanup()` the test raises a warning over the leftover directory; it's not quite clear to me if the saved data in `tmp_dir` are never to be accessed again...
Marked as 'draft' and may be a 'do-not-merge' b/c it appears the deprecated
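The warning behaviour can be reproduced in isolation; this standalone sketch (CPython refcounting assumed, nothing here is spectral-cube code) shows that dropping a `TemporaryDirectory` without `.cleanup()` triggers the implicit-cleanup `ResourceWarning`, while explicit cleanup stays silent:

```python
import gc
import os
import tempfile
import warnings

# Dropping a TemporaryDirectory without calling .cleanup() still removes
# the directory via its finalizer, but emits the "Implicitly cleaning up"
# ResourceWarning seen in the test run.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    tmpdir = tempfile.TemporaryDirectory()
    path = tmpdir.name
    del tmpdir
    gc.collect()
assert not os.path.exists(path)
assert any(issubclass(w.category, ResourceWarning) for w in caught)

# Explicit cleanup removes the directory without any warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    tmpdir = tempfile.TemporaryDirectory()
    path = tmpdir.name
    tmpdir.cleanup()
    del tmpdir
    gc.collect()
assert not os.path.exists(path)
assert not any(issubclass(w.category, ResourceWarning) for w in caught)
```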
I am also understanding `mktemp` less the more I look at it.
In the current version it creates a temporary directory which then holds the `0.0.0` and `.zarray` files, but all of them still exist after explicitly deleting `cube`, and even after closing the Python interpreter! Yet the `open_files` test does not complain and there is no warning shown. Looks a bit like black magic to me, but maybe it is simply part of `mktemp`'s implementation not to care about what happens with the directory afterwards, and not to give a warning either...
With just using `tempfile.TemporaryDirectory` the directory stays around after `del cube`, but is removed upon terminating the Python session, and the `open_files` test is passing, but raising a

```
tempfile.py:821: ResourceWarning: Implicitly cleaning up <TemporaryDirectory ...
```

With the latest suggestion and the custom `__del__` method the directory is removed upon `del cube`, and the test passes without warnings.
So this would seem to solve everything, and there are similar use cases in some Astropy classes or NumPy's `DataSource`, but I am a bit concerned about the reservations many people seem to have about using `__del__`, e.g. in https://stackoverflow.com/a/2452895.
Still, if it works (should give this a try on the CI). Alternatively one might simply set the tests to ignore the warning, but properly cleaning up seems the better way, even if we need to resort to `__del__`...
```python
filenum, filename = tempfile.mkstemp()
with dask.config.set(**cube._scheduler_kwargs):
    cube._data.to_zarr(filename)
    cube._data = da.from_zarr(filename)
```
```diff
-filenum, filename = tempfile.mkstemp()
-with dask.config.set(**cube._scheduler_kwargs):
-    cube._data.to_zarr(filename)
-    cube._data = da.from_zarr(filename)
+cube._tmpdir = tempfile.TemporaryDirectory()
+with dask.config.set(**cube._scheduler_kwargs):
+    cube._data.to_zarr(cube._tmpdir.name)
+    cube._data = da.from_zarr(cube._tmpdir.name)
```
A slightly different option: this enables the `TemporaryDirectory` to be cleaned up on deletion of the cube by adding a custom teardown method to `DaskSpectralCubeMixin`:
```python
def __del__(self):
    # Clean up any `tempfile.TemporaryDirectory` created by `save_to_tmp_dir`.
    if hasattr(self, '_tmpdir'):
        self._tmpdir.cleanup()
```
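As a sanity check, the pattern can be exercised with a toy class (`FakeCube` is a hypothetical stand-in, not the real cube class or its API):

```python
import gc
import os
import tempfile

class FakeCube:
    """Illustrative stand-in exercising the suggested __del__ teardown."""

    def save_to_tmp_dir(self):
        # Keep a handle on the instance so __del__ can clean up later.
        self._tmpdir = tempfile.TemporaryDirectory()
        return self._tmpdir.name

    def __del__(self):
        # Clean up any TemporaryDirectory created by save_to_tmp_dir.
        if hasattr(self, '_tmpdir'):
            self._tmpdir.cleanup()

cube = FakeCube()
path = cube.save_to_tmp_dir()
assert os.path.isdir(path)
del cube          # __del__ runs here under CPython refcounting
gc.collect()
assert not os.path.exists(path)
```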
https://docs.python.org/3/library/tempfile.html#deprecated-functions-and-variables