Commit

Merge branch 'main' into improve-safe-chunk-validation
max-sixty committed Sep 30, 2024
2 parents 58f1866 + 7bdc6d4 commit 409cc1c
Showing 32 changed files with 520 additions and 106 deletions.
2 changes: 1 addition & 1 deletion .github/FUNDING.yml
@@ -1,2 +1,2 @@
 github: numfocus
-custom: http://numfocus.org/donate-to-xarray
+custom: https://numfocus.org/donate-to-xarray
3 changes: 3 additions & 0 deletions .github/workflows/ci.yaml
@@ -58,6 +58,9 @@ jobs:
             python-version: "3.10"
             os: ubuntu-latest
           # Latest python version:
+          - env: "all-but-numba"
+            python-version: "3.12"
+            os: ubuntu-latest
           - env: "all-but-dask"
             # Not 3.12 because of pint
             python-version: "3.11"
2 changes: 1 addition & 1 deletion .github/workflows/nightly-wheels.yml
@@ -38,7 +38,7 @@ jobs:
           fi
       - name: Upload wheel
-        uses: scientific-python/upload-nightly-action@b67d7fcc0396e1128a474d1ab2b48aa94680f9fc # 0.5.0
+        uses: scientific-python/upload-nightly-action@ccf29c805b5d0c1dc31fa97fcdb962be074cade3 # 0.6.0
         with:
           anaconda_nightly_upload_token: ${{ secrets.ANACONDA_NIGHTLY }}
           artifacts_path: dist
4 changes: 2 additions & 2 deletions .github/workflows/pypi-release.yaml
@@ -88,7 +88,7 @@ jobs:
           path: dist
       - name: Publish package to TestPyPI
         if: github.event_name == 'push'
-        uses: pypa/gh-action-pypi-publish@v1.10.1
+        uses: pypa/gh-action-pypi-publish@v1.10.2
         with:
           repository_url: https://test.pypi.org/legacy/
           verbose: true
@@ -111,6 +111,6 @@ jobs:
           name: releases
           path: dist
       - name: Publish package to PyPI
-        uses: pypa/gh-action-pypi-publish@v1.10.1
+        uses: pypa/gh-action-pypi-publish@v1.10.2
         with:
           verbose: true
2 changes: 1 addition & 1 deletion README.md
@@ -88,7 +88,7 @@ Xarray is a fiscally sponsored project of
 [NumFOCUS](https://numfocus.org), a nonprofit dedicated to supporting
 the open source scientific computing community. If you like Xarray and
 want to support our mission, please consider making a
-[donation](https://numfocus.salsalabs.org/donate-to-xarray/) to support
+[donation](https://numfocus.org/donate-to-xarray) to support
 our efforts.

 ## History
54 changes: 54 additions & 0 deletions ci/requirements/all-but-numba.yml
@@ -0,0 +1,54 @@
+name: xarray-tests
+channels:
+  - conda-forge
+  - nodefaults
+dependencies:
+  # Pin a "very new numpy" (updated Sept 24, 2024)
+  - numpy>=2.1.1
+  - aiobotocore
+  - array-api-strict
+  - boto3
+  - bottleneck
+  - cartopy
+  - cftime
+  - dask-core
+  - dask-expr # dask raises a deprecation warning without this, breaking doctests
+  - distributed
+  - flox
+  - fsspec
+  - h5netcdf
+  - h5py
+  - hdf5
+  - hypothesis
+  - iris
+  - lxml # Optional dep of pydap
+  - matplotlib-base
+  - nc-time-axis
+  - netcdf4
+  # numba, sparse, numbagg, numexpr often conflicts with newer versions of numpy.
+  # This environment helps us test xarray with the latest versions
+  # of numpy
+  # - numba
+  # - numbagg
+  # - numexpr
+  # - sparse
+  - opt_einsum
+  - packaging
+  - pandas
+  # - pint>=0.22
+  - pip
+  - pooch
+  - pre-commit
+  - pyarrow # pandas raises a deprecation warning without this, breaking doctests
+  - pydap
+  - pytest
+  - pytest-cov
+  - pytest-env
+  - pytest-xdist
+  - pytest-timeout
+  - rasterio
+  - scipy
+  - seaborn
+  - toolz
+  - typing_extensions
+  - zarr
4 changes: 2 additions & 2 deletions doc/user-guide/pandas.rst
@@ -103,7 +103,7 @@ DataFrames:
     xr.DataArray.from_series(s)
 Both the ``from_series`` and ``from_dataframe`` methods use reindexing, so they
-work even if not the hierarchical index is not a full tensor product:
+work even if the hierarchical index is not a full tensor product:

 .. ipython:: python
@@ -126,7 +126,7 @@ Particularly after a roundtrip, the following deviations are noted:
 To avoid these problems, the third-party `ntv-pandas <https://github.com/loco-philippe/ntv-pandas>`__ library offers lossless and reversible conversions between
 ``Dataset``/ ``DataArray`` and pandas ``DataFrame`` objects.

-This solution is particularly interesting for converting any ``DataFrame`` into a ``Dataset`` (the converter find the multidimensional structure hidden by the tabular structure).
+This solution is particularly interesting for converting any ``DataFrame`` into a ``Dataset`` (the converter finds the multidimensional structure hidden by the tabular structure).

 The `ntv-pandas examples <https://github.com/loco-philippe/ntv-pandas/tree/main/example>`__ show how to improve the conversion for the previous ``Dataset`` example and for more complex examples.

3 changes: 3 additions & 0 deletions doc/whats-new.rst
@@ -32,6 +32,9 @@ New Features
   `Tom Nicholas <https://github.com/TomNicholas>`_.
 - Added zarr backends for :py:func:`open_groups` (:issue:`9430`, :pull:`9469`).
   By `Eni Awowale <https://github.com/eni-awowale>`_.
+- Added support for vectorized interpolation using additional interpolators
+  from the ``scipy.interpolate`` module (:issue:`9049`, :pull:`9526`).
+  By `Holly Mandel <https://github.com/hollymandel>`_.

 Breaking changes
 ~~~~~~~~~~~~~~~~
1 change: 0 additions & 1 deletion pyproject.toml
@@ -323,7 +323,6 @@ filterwarnings = [
   "default:Using a non-tuple sequence for multidimensional indexing is deprecated:FutureWarning",
   "default:Duplicate dimension names present:UserWarning:xarray.namedarray.core",
   "default:::xarray.tests.test_strategies", # TODO: remove once we know how to deal with a changed signature in protocols
-  "ignore:__array__ implementation doesn't accept a copy keyword, so passing copy=False failed.",
 ]

 log_cli_level = "INFO"
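
The removed entry was an `ignore` filter keyed on the warning message. As a rough illustration of what such a filter does, here is a sketch using only the standard `warnings` module; it is an analogue of the pytest setting under stated assumptions, not how pytest applies it internally, and `count_reported` is a hypothetical helper:

```python
import warnings

MESSAGE = (
    "__array__ implementation doesn't accept a copy keyword, "
    "so passing copy=False failed."
)


def count_reported(ignore: bool) -> int:
    """Emit the warning once and return how many warnings were reported."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        if ignore:
            # Rough analogue of the removed "ignore:<message>" entry; the
            # message is matched as a regex against the start of the text.
            warnings.filterwarnings(
                "ignore", message="__array__ implementation doesn't accept"
            )
        warnings.warn(MESSAGE, DeprecationWarning)
    return len(caught)


print(count_reported(ignore=True), count_reported(ignore=False))
```

With the filter gone, the warning is reported again, which only stays quiet if every `__array__` implementation accepts `copy`.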
2 changes: 1 addition & 1 deletion xarray/core/common.py
@@ -163,7 +163,7 @@ def __complex__(self: Any) -> complex:
         return complex(self.values)

     def __array__(
-        self: Any, dtype: DTypeLike | None = None, copy: bool | None = None
+        self: Any, dtype: np.typing.DTypeLike = None, /, *, copy: bool | None = None
     ) -> np.ndarray:
         if not copy:
             if np.lib.NumpyVersion(np.__version__) >= "2.0.0":
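
The new signature makes `dtype` positional-only (the `/`) and `copy` keyword-only (the `*`). A minimal sketch of how Python binds arguments for a signature of this shape; `array_hook` and `rejects` are hypothetical stand-ins that echo their arguments rather than building an array:

```python
def array_hook(dtype=None, /, *, copy=None):
    """Hypothetical stand-in for ``__array__``; echoes how arguments were bound."""
    return (dtype, copy)


def rejects(*args, **kwargs):
    """Return True if the signature rejects the call with a TypeError."""
    try:
        array_hook(*args, **kwargs)
    except TypeError:
        return True
    return False


print(array_hook("float64"))              # dtype passed positionally
print(array_hook("float64", copy=False))  # copy passed by keyword
print(rejects(dtype="float64"))           # dtype cannot be passed by keyword
print(rejects("float64", False))          # copy cannot be passed positionally
```

Restricting both parameters this way leaves exactly one calling convention, so implementations across classes cannot drift apart on argument order or names.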
10 changes: 5 additions & 5 deletions xarray/core/dataarray.py
@@ -2224,12 +2224,12 @@ def interp(
         Performs univariate or multivariate interpolation of a DataArray onto
         new coordinates using scipy's interpolation routines. If interpolating
-        along an existing dimension, :py:class:`scipy.interpolate.interp1d` is
-        called. When interpolating along multiple existing dimensions, an
+        along an existing dimension, either :py:class:`scipy.interpolate.interp1d`
+        or a 1-dimensional scipy interpolator (e.g. :py:class:`scipy.interpolate.KroghInterpolator`)
+        is called. When interpolating along multiple existing dimensions, an
         attempt is made to decompose the interpolation into multiple
-        1-dimensional interpolations. If this is possible,
-        :py:class:`scipy.interpolate.interp1d` is called. Otherwise,
-        :py:func:`scipy.interpolate.interpn` is called.
+        1-dimensional interpolations. If this is possible, the 1-dimensional interpolator is called.
+        Otherwise, :py:func:`scipy.interpolate.interpn` is called.
         Parameters
         ----------
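
The decomposition this docstring describes can be illustrated without scipy. Below is a minimal pure-Python sketch, assuming a rectilinear grid and linear interpolation; `interp1d_linear` and `interp2d_separable` are hypothetical helpers, not xarray or scipy functions, and xarray's actual dispatch logic is not shown here:

```python
from bisect import bisect_right


def interp1d_linear(xs, ys, x_new):
    """Linearly interpolate the samples (xs, ys) at a single point x_new."""
    # Locate the interval [xs[i], xs[i + 1]] containing x_new, clamped to the grid.
    i = min(max(bisect_right(xs, x_new) - 1, 0), len(xs) - 2)
    t = (x_new - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


def interp2d_separable(xs, ys, grid, x_new, y_new):
    """Interpolate grid[i][j] = f(xs[i], ys[j]) at (x_new, y_new) in two 1-D passes."""
    # Pass 1: interpolate along y within every row, giving one value per x.
    along_y = [interp1d_linear(ys, row, y_new) for row in grid]
    # Pass 2: interpolate those intermediate values along x.
    return interp1d_linear(xs, along_y, x_new)


xs = [0.0, 1.0, 2.0]
ys = [0.0, 2.0]
grid = [[x + 2 * y for y in ys] for x in xs]  # samples of f(x, y) = x + 2y
result = interp2d_separable(xs, ys, grid, 0.5, 1.0)
print(result)
```

Each pass only ever sees one dimension, which is why any 1-dimensional interpolator can be slotted in when the target coordinates allow this kind of separation.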
16 changes: 8 additions & 8 deletions xarray/core/dataset.py
@@ -3885,12 +3885,12 @@ def interp(
         Performs univariate or multivariate interpolation of a Dataset onto
         new coordinates using scipy's interpolation routines. If interpolating
-        along an existing dimension, :py:class:`scipy.interpolate.interp1d` is
-        called. When interpolating along multiple existing dimensions, an
+        along an existing dimension, either :py:class:`scipy.interpolate.interp1d`
+        or a 1-dimensional scipy interpolator (e.g. :py:class:`scipy.interpolate.KroghInterpolator`)
+        is called. When interpolating along multiple existing dimensions, an
         attempt is made to decompose the interpolation into multiple
-        1-dimensional interpolations. If this is possible,
-        :py:class:`scipy.interpolate.interp1d` is called. Otherwise,
-        :py:func:`scipy.interpolate.interpn` is called.
+        1-dimensional interpolations. If this is possible, the 1-dimensional interpolator
+        is called. Otherwise, :py:func:`scipy.interpolate.interpn` is called.
         Parameters
         ----------
@@ -10621,7 +10621,7 @@ def rolling(
         --------
         Dataset.cumulative
         DataArray.rolling
-        core.rolling.DatasetRolling
+        DataArray.rolling_exp
         """
         from xarray.core.rolling import DatasetRolling

@@ -10651,9 +10651,9 @@ def cumulative(
         See Also
         --------
-        Dataset.rolling
         DataArray.cumulative
-        core.rolling.DatasetRolling
+        Dataset.rolling
+        Dataset.rolling_exp
         """
         from xarray.core.rolling import DatasetRolling

38 changes: 35 additions & 3 deletions xarray/core/datatree.py
@@ -55,6 +55,7 @@
     from xarray.core.dataset import calculate_dimensions

 if TYPE_CHECKING:
+    import numpy as np
     import pandas as pd

     from xarray.core.datatree_io import T_DataTreeNetcdfEngine, T_DataTreeNetcdfTypes
@@ -156,6 +157,34 @@ def check_alignment(
         check_alignment(child_path, child_ds, base_ds, child.children)


+def _deduplicate_inherited_coordinates(child: DataTree, parent: DataTree) -> None:
+    # This method removes repeated indexes (and corresponding coordinates)
+    # that are repeated between a DataTree and its parents.
+    #
+    # TODO(shoyer): Decide how to handle repeated coordinates *without* an
+    # index. Should these be allowed, in which case we probably want to
+    # exclude them from inheritance, or should they be automatically
+    # dropped?
+    # https://github.com/pydata/xarray/issues/9475#issuecomment-2357004264
+    removed_something = False
+    for name in parent._indexes:
+        if name in child._node_indexes:
+            # Indexes on a Dataset always have a corresponding coordinate.
+            # We already verified that these coordinates match in the
+            # check_alignment() call from _pre_attach().
+            del child._node_indexes[name]
+            del child._node_coord_variables[name]
+            removed_something = True
+
+    if removed_something:
+        child._node_dims = calculate_dimensions(
+            child._data_variables | child._node_coord_variables
+        )
+
+    for grandchild in child._children.values():
+        _deduplicate_inherited_coordinates(grandchild, child)
+
+
 def _check_for_slashes_in_names(variables: Iterable[Hashable]) -> None:
     offending_variable_names = [
         name for name in variables if isinstance(name, str) and "/" in name
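
The recursion in `_deduplicate_inherited_coordinates` can be sketched on a toy tree. This is a simplified illustration, not the xarray implementation: `Node` and `deduplicate` are hypothetical, indexes are plain dict entries, and (as in the real code) duplicated entries are assumed to have already been checked for equality:

```python
class Node:
    """Hypothetical stand-in for a tree node holding named indexes."""

    def __init__(self, indexes=None, children=None):
        self.indexes = dict(indexes or {})
        self.children = dict(children or {})


def deduplicate(child, inherited):
    """Drop indexes from ``child`` that an ancestor already provides, then recurse."""
    for name in inherited & set(child.indexes):
        del child.indexes[name]
    # Descendants see both the inherited names and what remains on this node.
    visible = inherited | set(child.indexes)
    for grandchild in child.children.values():
        deduplicate(grandchild, visible)


leaf = Node(indexes={"x": [0, 1], "z": [5]})
mid = Node(indexes={"x": [0, 1], "y": [2, 3]}, children={"leaf": leaf})
root = Node(indexes={"x": [0, 1]}, children={"mid": mid})

deduplicate(mid, set(root.indexes))
print(sorted(mid.indexes), sorted(leaf.indexes))
```

After attaching, each index name lives on exactly one node along any root-to-leaf path, and descendants pick it up through inheritance.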
@@ -374,7 +403,7 @@ def map( # type: ignore[override]


 class DataTree(
-    NamedNode,
+    NamedNode["DataTree"],
     MappedDatasetMethodsMixin,
     MappedDataWithCoords,
     DataTreeArithmeticMixin,
@@ -485,6 +514,7 @@ def _pre_attach(self: DataTree, parent: DataTree, name: str) -> None:
         node_ds = self.to_dataset(inherited=False)
         parent_ds = parent._to_dataset_view(rebuild_dims=False, inherited=True)
         check_alignment(path, node_ds, parent_ds, self.children)
+        _deduplicate_inherited_coordinates(self, parent)

     @property
     def _coord_variables(self) -> ChainMap[Hashable, Variable]:
@@ -737,7 +767,9 @@ def __bool__(self) -> bool:
     def __iter__(self) -> Iterator[str]:
         return itertools.chain(self._data_variables, self._children) # type: ignore[arg-type]

-    def __array__(self, dtype=None, copy=None):
+    def __array__(
+        self, dtype: np.typing.DTypeLike = None, /, *, copy: bool | None = None
+    ) -> np.ndarray:
         raise TypeError(
             "cannot directly convert a DataTree into a "
             "numpy array. Instead, create an xarray.DataArray "
@@ -1350,7 +1382,7 @@ def map_over_subtree(
         func: Callable,
         *args: Iterable[Any],
         **kwargs: Any,
-    ) -> DataTree | tuple[DataTree]:
+    ) -> DataTree | tuple[DataTree, ...]:
         """
         Apply a function to every dataset in this subtree, returning a new tree which stores the results.
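
The return-annotation change matters to type checkers: `tuple[X]` describes a tuple of exactly one element, while `tuple[X, ...]` describes a tuple of any length. A small sketch, using `int` in place of `DataTree` purely for illustration:

```python
from typing import get_args

one = tuple[int]        # exactly one element
many = tuple[int, ...]  # any number of elements, all of the same type

# Inspect what each annotation actually encodes.
print(get_args(one))
print(get_args(many))
```

A function that may return several trees therefore needs the `...` form, otherwise a checker would flag any result with more than one element.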
5 changes: 3 additions & 2 deletions xarray/core/formatting.py
@@ -303,7 +303,7 @@ def inline_variable_array_repr(var, max_width):
     """Build a one-line summary of a variable's data."""
     if hasattr(var._data, "_repr_inline_"):
         return var._data._repr_inline_(max_width)
-    if var._in_memory:
+    if getattr(var, "_in_memory", False):
         return format_array_flat(var, max_width)
     dask_array_type = array_type("dask")
     if isinstance(var._data, dask_array_type):
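
Switching from `var._in_memory` to `getattr(var, "_in_memory", False)` lets the function accept objects that lack the attribute. A minimal sketch with two hypothetical classes standing in for variable-like objects:

```python
class InMemoryVar:
    """Hypothetical variable-like object that exposes ``_in_memory``."""

    _in_memory = True


class BareVar:
    """Hypothetical object that lacks the attribute entirely."""


def is_in_memory(var) -> bool:
    # Falls back to False instead of raising AttributeError.
    return getattr(var, "_in_memory", False)


print(is_in_memory(InMemoryVar()), is_in_memory(BareVar()))
```

Objects without the attribute simply skip the in-memory branch and fall through to the later checks.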
@@ -1102,7 +1102,8 @@ def _datatree_node_repr(node: DataTree, show_inherited: bool) -> str:
         summary.append(f"{dims_start}({dims_values})")

     if node._node_coord_variables:
-        summary.append(coords_repr(node.coords, col_width=col_width, max_rows=max_rows))
+        node_coords = node.to_dataset(inherited=False).coords
+        summary.append(coords_repr(node_coords, col_width=col_width, max_rows=max_rows))

     if show_inherited and inherited_coords:
         summary.append(
6 changes: 5 additions & 1 deletion xarray/core/groupby.py
@@ -193,7 +193,11 @@ def values(self) -> range:
     def data(self) -> range:
         return range(self.size)

-    def __array__(self) -> np.ndarray:
+    def __array__(
+        self, dtype: np.typing.DTypeLike = None, /, *, copy: bool | None = None
+    ) -> np.ndarray:
+        if copy is False:
+            raise NotImplementedError(f"An array copy is necessary, got {copy = }.")
         return np.arange(self.size)

     @property