Add GroupBy.shuffle(), Dataset.shuffle_by(), DataArray.shuffle_by() #9320

Open
wants to merge 51 commits into main
Changes from 16 commits

Commits
3bc51bd
Add GroupBy.shuffle()
dcherian Aug 7, 2024
60d7619
Cleanup
dcherian Aug 7, 2024
d1429cd
Cleanup
dcherian Aug 7, 2024
31fc00e
fix
dcherian Aug 7, 2024
4583853
return groupby instance from shuffle
dcherian Aug 13, 2024
abd9dd2
Fix nD by
dcherian Aug 13, 2024
6b820aa
Merge branch 'main' into groupby-shuffle
dcherian Aug 14, 2024
0d70656
Skip if no dask
dcherian Aug 14, 2024
fafb937
fix tests
dcherian Aug 14, 2024
939db9a
Merge branch 'main' into groupby-shuffle
dcherian Aug 14, 2024
a08450e
Add `chunks` to signature
dcherian Aug 14, 2024
d0cd218
FIx self
dcherian Aug 14, 2024
4edc976
Another Self fix
dcherian Aug 14, 2024
0b42be4
Forward chunks too
dcherian Aug 14, 2024
c52734d
[revert]
dcherian Aug 14, 2024
8180625
undo flox limit
dcherian Aug 14, 2024
7897c91
[revert]
dcherian Aug 14, 2024
7773548
fix types
dcherian Aug 14, 2024
51a7723
Add DataArray.shuffle_by, Dataset.shuffle_by
dcherian Aug 15, 2024
cc95513
Add doctest
dcherian Aug 15, 2024
18f4a40
Refactor
dcherian Aug 15, 2024
f489bcf
tweak docstrings
dcherian Aug 15, 2024
ead1bb4
fix typing
dcherian Aug 15, 2024
75115d0
Fix
dcherian Aug 15, 2024
390863a
fix docstring
dcherian Aug 15, 2024
a408cb0
bump min version to dask>=2024.08.1
dcherian Aug 17, 2024
7038f37
Merge branch 'main' into groupby-shuffle
dcherian Aug 17, 2024
05a0fb4
Fix typing
dcherian Aug 17, 2024
b8e7f62
Fix types
dcherian Aug 17, 2024
6d9ed1c
Merge branch 'main' into groupby-shuffle
dcherian Aug 22, 2024
20a8cd9
Merge branch 'main' into groupby-shuffle
dcherian Aug 30, 2024
7a99c8f
remove shuffle_by for now.
dcherian Aug 30, 2024
5e2fdfb
Add tests
dcherian Aug 30, 2024
a22c7ed
Support shuffling with multiple groupers
dcherian Aug 30, 2024
2d48690
Revert "remove shuffle_by for now."
dcherian Sep 11, 2024
0679d2b
Merge branch 'main' into groupby-shuffle
dcherian Sep 12, 2024
63b3e77
Merge branch 'main' into groupby-shuffle
dcherian Sep 17, 2024
7dc5dd1
bad merge
dcherian Sep 17, 2024
bad0744
Merge branch 'main' into groupby-shuffle
dcherian Sep 18, 2024
91e4bd8
Add a test
dcherian Sep 18, 2024
0542944
Merge branch 'main' into groupby-shuffle
dcherian Nov 1, 2024
1e4f805
Add docs
dcherian Nov 1, 2024
ad502aa
bugfix
dcherian Nov 1, 2024
4b0c143
Refactor out Dataset._shuffle
dcherian Nov 2, 2024
2b2c4ab
Merge branch 'main' into groupby-shuffle
dcherian Nov 3, 2024
f624c8f
fix types
dcherian Nov 3, 2024
888e780
fix tests
dcherian Nov 3, 2024
47e5c17
Merge branch 'main' into groupby-shuffle
dcherian Nov 4, 2024
b100fb1
Handle by is chunked
dcherian Nov 4, 2024
978fad9
Merge branch 'main' into groupby-shuffle
dcherian Nov 7, 2024
d1a3fc1
Some refactoring
dcherian Nov 7, 2024
56 changes: 55 additions & 1 deletion xarray/core/groupby.py
@@ -1,6 +1,7 @@
from __future__ import annotations

import copy
import sys
import warnings
from collections.abc import Callable, Hashable, Iterator, Mapping, Sequence
from dataclasses import dataclass, field
@@ -45,17 +46,23 @@
peek_at,
)
from xarray.core.variable import IndexVariable, Variable
from xarray.namedarray.pycompat import is_chunked_array
from xarray.util.deprecation_helpers import _deprecate_positional_args

if TYPE_CHECKING:
from numpy.typing import ArrayLike

from xarray.core.dataarray import DataArray
from xarray.core.dataset import Dataset
from xarray.core.types import GroupIndex, GroupIndices, GroupKey
from xarray.core.types import GroupIndex, GroupIndices, GroupKey, T_Chunks
from xarray.core.utils import Frozen
from xarray.groupers import Grouper

if sys.version_info >= (3, 11):
from typing import Self
else:
from typing_extensions import Self


def check_reduce_dims(reduce_dims, dimensions):
if reduce_dims is not ...:
@@ -517,6 +524,53 @@ def sizes(self) -> Mapping[Hashable, int]:
self._sizes = self._obj.isel({self._group_dim: index}).sizes
return self._sizes

def shuffle(self, chunks: T_Chunks = "auto") -> Self:
"""
Shuffle the underlying object so that all members in a group occur sequentially.

The order of appearance is not guaranteed.

Use this method first if you need to map a function that requires all members of a group
be in a single chunk.
"""
from xarray.core.dataarray import DataArray

(grouper,) = self.groupers
dim = self._group_dim
size = self._obj.sizes[dim]
was_array = isinstance(self._obj, DataArray)
as_dataset = self._obj._to_temp_dataset() if was_array else self._obj
no_slices: list[list[int]] = [
list(range(*idx.indices(size))) if isinstance(idx, slice) else idx
for idx in self._group_indices
]

if grouper.name not in as_dataset._variables:
as_dataset.coords[grouper.name] = grouper.group1d

# Shuffling is only different from `isel` for chunked arrays.
# Extract them out, and treat them specially. The rest, we route through isel.
# This makes it easy to ensure correct handling of indexes.
is_chunked = {
name: var
for name, var in as_dataset._variables.items()
if is_chunked_array(var._data)
}
subset = as_dataset[
[name for name in as_dataset._variables if name not in is_chunked]
]
shuffled = subset.isel({dim: np.concatenate(no_slices)})
for name, var in is_chunked.items():
shuffled[name] = var._shuffle(
indices=list(self._group_indices), dim=dim, chunks=chunks
)
shuffled = self._maybe_unstack(shuffled)
new_obj = self._obj._from_temp_dataset(shuffled) if was_array else shuffled
return new_obj.groupby(
{grouper.name: grouper.grouper.reset()},
restore_coord_dims=self._restore_coord_dims,
)

def map(
self,
func: Callable,
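For orientation, a minimal usage sketch of the new method (hypothetical data; chunks=None is passed explicitly because the dask wrapper further down currently implements only that case, and it requires dask>=2024.08.1):

import numpy as np
import xarray as xr

# Hypothetical data: members of each group are scattered across chunks.
da = xr.DataArray(
    np.arange(10),
    dims="x",
    coords={"label": ("x", list("ababababab"))},
).chunk(x=2)

# shuffle() moves each group's members into contiguous positions and
# returns a new GroupBy over the shuffled object, so map() sees whole groups.
grouped = da.groupby("label").shuffle(chunks=None)
anomaly = grouped.map(lambda g: g - g.mean())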
2 changes: 1 addition & 1 deletion xarray/core/types.py
@@ -298,7 +298,7 @@ def copy(
ZarrWriteModes = Literal["w", "w-", "a", "a-", "r+", "r"]

GroupKey = Any
GroupIndex = Union[int, slice, list[int]]
GroupIndex = Union[slice, list[int]]
GroupIndices = tuple[GroupIndex, ...]
Bins = Union[
int, Sequence[int], Sequence[float], Sequence[pd.Timestamp], np.ndarray, pd.Index
26 changes: 25 additions & 1 deletion xarray/core/variable.py
@@ -44,7 +44,13 @@
maybe_coerce_to_str,
)
from xarray.namedarray.core import NamedArray, _raise_if_any_duplicate_dimensions
from xarray.namedarray.pycompat import integer_types, is_0d_dask_array, to_duck_array
from xarray.namedarray.parallelcompat import get_chunked_array_type
from xarray.namedarray.pycompat import (
integer_types,
is_0d_dask_array,
is_chunked_array,
to_duck_array,
)
from xarray.util.deprecation_helpers import deprecate_dims

NON_NUMPY_SUPPORTED_ARRAY_TYPES = (
@@ -998,6 +1004,24 @@ def compute(self, **kwargs):
new = self.copy(deep=False)
return new.load(**kwargs)

def _shuffle(
self, indices: list[list[int]], dim: Hashable, chunks: T_Chunks
) -> Self:
array = self._data
if is_chunked_array(array):
chunkmanager = get_chunked_array_type(array)
return self._replace(
data=chunkmanager.shuffle(
array,
indexer=indices,
axis=self.get_axis_num(dim),
chunks=chunks,
)
)
else:
assert False, "this should be unreachable"
return self.isel({dim: np.concatenate(indices)})

def isel(
self,
indexers: Mapping[Any, Any] | None = None,
35 changes: 35 additions & 0 deletions xarray/groupers.py
@@ -7,13 +7,19 @@
from __future__ import annotations

import datetime
import sys
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Literal, cast

import numpy as np
import pandas as pd

if sys.version_info >= (3, 11):
from typing import Self
else:
from typing_extensions import Self

from xarray.coding.cftime_offsets import _new_to_legacy_freq
from xarray.core import duck_array_ops
from xarray.core.dataarray import DataArray
@@ -90,6 +96,13 @@ def factorize(self, group: T_Group) -> EncodedGroups:
"""
pass

@abstractmethod
def reset(self) -> Self:
"""
Creates a new version of this Grouper clearing any caches.
"""
pass


class Resampler(Grouper):
"""
@@ -114,6 +127,9 @@ def group_as_index(self) -> pd.Index:
self._group_as_index = self.group.to_index()
return self._group_as_index

def reset(self) -> Self:
return type(self)()

def factorize(self, group1d: T_Group) -> EncodedGroups:
self.group = group1d

@@ -221,6 +237,16 @@ class BinGrouper(Grouper):
include_lowest: bool = False
duplicates: Literal["raise", "drop"] = "raise"

def reset(self) -> Self:
return type(self)(
bins=self.bins,
right=self.right,
labels=self.labels,
precision=self.precision,
include_lowest=self.include_lowest,
duplicates=self.duplicates,
)

def __post_init__(self) -> None:
if duck_array_ops.isnull(self.bins).all():
raise ValueError("All bin edges are NaN.")
@@ -302,6 +328,15 @@ class TimeResampler(Resampler):
index_grouper: CFTimeGrouper | pd.Grouper = field(init=False, repr=False)
group_as_index: pd.Index = field(init=False, repr=False)

def reset(self) -> Self:
return type(self)(
freq=self.freq,
closed=self.closed,
label=self.label,
origin=self.origin,
offset=self.offset,
)

def _init_properties(self, group: T_Group) -> None:
from xarray import CFTimeIndex

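To make the reset() contract concrete, a hypothetical sketch (GroupBy.shuffle() calls reset() to obtain a stateless copy of the grouper before re-grouping the shuffled object):

from xarray.groupers import BinGrouper

grouper = BinGrouper(bins=3)
fresh = grouper.reset()
# Same configuration (dataclass equality) but a new instance, with no
# cached factorization state carried over.
assert fresh == grouper and fresh is not grouper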
13 changes: 13 additions & 0 deletions xarray/namedarray/daskmanager.py
@@ -251,3 +251,16 @@
targets=targets,
**kwargs,
)

def shuffle(
self, x: DaskArray, indexer: list[list[int]], axis: int, chunks: T_Chunks
) -> DaskArray:
import dask.array

if not module_available("dask", minversion="2024.08.1"):
raise ValueError(
"This method is very inefficient on dask<2024.08.1. Please upgrade."
)
if chunks is not None:
raise NotImplementedError
return dask.array.shuffle(x, indexer, axis, chunks=None)
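In isolation, the dask call this wrapper forwards to looks roughly like the sketch below (hypothetical array; the indexer lists which positions should land together, mirroring the call above):

import numpy as np
import dask.array

x = dask.array.from_array(np.arange(10), chunks=2)
# Gather even positions into one group and odd positions into another.
shuffled = dask.array.shuffle(
    x, indexer=[[0, 2, 4, 6, 8], [1, 3, 5, 7, 9]], axis=0, chunks=None
)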
6 changes: 6 additions & 0 deletions xarray/namedarray/parallelcompat.py
@@ -19,6 +19,7 @@

if TYPE_CHECKING:
from xarray.namedarray._typing import (
T_Chunks,
_Chunks,
_DType,
_DType_co,
@@ -356,6 +357,11 @@ def compute(
"""
raise NotImplementedError()

def shuffle(
self, x: T_ChunkedArray, indexer: list[list[int]], axis: int, chunks: T_Chunks
) -> T_ChunkedArray:
raise NotImplementedError()

@property
def array_api(self) -> Any:
"""
1 change: 1 addition & 0 deletions xarray/tests/__init__.py
@@ -106,6 +106,7 @@ def _importorskip(
has_h5netcdf, requires_h5netcdf = _importorskip("h5netcdf")
has_cftime, requires_cftime = _importorskip("cftime")
has_dask, requires_dask = _importorskip("dask")
has_dask_ge_2024_08_0, _ = _importorskip("dask", minversion="2024.08.0")
with warnings.catch_warnings():
warnings.filterwarnings(
"ignore",
71 changes: 56 additions & 15 deletions xarray/tests/test_groupby.py
@@ -2,6 +2,7 @@

import operator
import warnings
from typing import Literal
from unittest import mock

import numpy as np
@@ -21,7 +22,10 @@
assert_identical,
create_test_data,
has_cftime,
has_dask,
has_dask_ge_2024_08_0,
has_flox,
raise_if_dask_computes,
requires_cftime,
requires_dask,
requires_flox,
@@ -582,7 +586,22 @@

@pytest.mark.filterwarnings("ignore:Converting non-nanosecond")
@pytest.mark.filterwarnings("ignore:invalid value encountered in divide:RuntimeWarning")
def test_groupby_drops_nans() -> None:
@pytest.mark.parametrize("shuffle", [True, False])
@pytest.mark.parametrize(
"chunk",
[
pytest.param(
dict(lat=1), marks=pytest.mark.skipif(not has_dask, reason="no dask")
),
pytest.param(
dict(lat=2, lon=2), marks=pytest.mark.skipif(not has_dask, reason="no dask")
),
False,
],
)
def test_groupby_drops_nans(shuffle: bool, chunk: Literal[False] | dict) -> None:
if shuffle and chunk and not has_dask_ge_2024_08_0:
pytest.skip()
# GH2383
# nan in 2D data variable (requires stacking)
ds = xr.Dataset(
@@ -597,18 +616,22 @@
ds["id"].values[3, 0] = np.nan
ds["id"].values[-1, -1] = np.nan

if chunk:
ds = ds.chunk(chunk)
grouped = ds.groupby(ds.id)
if shuffle:
grouped = grouped.shuffle()

[CI annotations on test_groupby.py line 623 — macos-latest and ubuntu-latest, py3.10 and py3.12: test_groupby_drops_nans[chunk0-True] and test_groupby_drops_nans[chunk1-True] fail with ValueError: This method is very inefficient on dask<2024.08.1. Please upgrade.]

# non reduction operation
expected1 = ds.copy()
expected1.variable.values[0, 0, :] = np.nan
expected1.variable.values[-1, -1, :] = np.nan
expected1.variable.values[3, 0, :] = np.nan
expected1.variable.data[0, 0, :] = np.nan
expected1.variable.data[-1, -1, :] = np.nan
expected1.variable.data[3, 0, :] = np.nan
actual1 = grouped.map(lambda x: x).transpose(*ds.variable.dims)
assert_identical(actual1, expected1)

# reduction along grouped dimension
actual2 = grouped.mean()

[CI annotations on test_groupby.py line 634 — macos-latest and ubuntu-latest, py3.10 and py3.12: test_groupby_drops_nans[chunk0-False] and test_groupby_drops_nans[chunk1-False] fail with pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects.]
stacked = ds.stack({"xy": ["lat", "lon"]})
expected2 = (
stacked.variable.where(stacked.id.notnull())
@@ -732,7 +755,6 @@


def test_groupby_getitem(dataset) -> None:

assert_identical(dataset.sel(x=["a"]), dataset.groupby("x")["a"])
assert_identical(dataset.sel(z=[1]), dataset.groupby("z")[1])
assert_identical(dataset.foo.sel(x=["a"]), dataset.foo.groupby("x")["a"])
@@ -1293,11 +1315,27 @@
assert_allclose(expected_sum_axis1, grouped.reduce(np.sum, "y"))
assert_allclose(expected_sum_axis1, grouped.sum("y"))

@pytest.mark.parametrize("use_flox", [True, False])
@pytest.mark.parametrize("shuffle", [True, False])
@pytest.mark.parametrize(
"chunk",
[
pytest.param(
True, marks=pytest.mark.skipif(not has_dask, reason="no dask")
),
False,
],
)
@pytest.mark.parametrize("method", ["sum", "mean", "median"])
def test_groupby_reductions(self, method) -> None:
array = self.da
grouped = array.groupby("abc")
def test_groupby_reductions(
self, use_flox: bool, method: str, shuffle: bool, chunk: bool
) -> None:
if shuffle and chunk and not has_dask_ge_2024_08_0:
pytest.skip()

array = self.da
if chunk:
array.data = array.chunk({"y": 5}).data
reduction = getattr(np, method)
expected = Dataset(
{
Expand All @@ -1315,14 +1353,14 @@
}
)["foo"]

with xr.set_options(use_flox=False):
actual_legacy = getattr(grouped, method)(dim="y")

with xr.set_options(use_flox=True):
actual_npg = getattr(grouped, method)(dim="y")
with raise_if_dask_computes():
grouped = array.groupby("abc")
if shuffle:
grouped = grouped.shuffle()

[CI annotations on test_groupby.py line 1359 — macos-latest and ubuntu-latest, py3.10 and py3.12: TestDataArrayGroupBy.test_groupby_reductions[sum|mean|median-True-True-True] and [sum|mean|median-True-True-False] fail with ValueError: This method is very inefficient on dask<2024.08.1. Please upgrade.]

assert_allclose(expected, actual_legacy)
assert_allclose(expected, actual_npg)
with xr.set_options(use_flox=use_flox):
actual = getattr(grouped, method)(dim="y")
assert_allclose(expected, actual)

def test_groupby_count(self) -> None:
array = DataArray(
@@ -2534,6 +2572,9 @@
codes = group.copy(data=codes_).rename("year")
return EncodedGroups(codes=codes, full_index=pd.Index(uniques))

def reset(self):
return type(self)()

da = xr.DataArray(
dims="time",
data=np.arange(20),