Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: split xarray into multi-ouput #113

Draft
wants to merge 7 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 13 additions & 12 deletions README.md

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 2 additions & 2 deletions build-locally.py

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

122 changes: 78 additions & 44 deletions recipe/meta.yaml
Original file line number Diff line number Diff line change
@@ -1,8 +1,12 @@
{% set name = "xarray" %}
{% set version = "2024.7.0" %}
{% set min_python = "python >=3.9" %}

# handle undefined PYTHON in `noarch: generic` outputs
{% if PYTHON is not defined %}{% set PYTHON = "$PYTHON" %}{% endif %}

package:
name: {{ name|lower }}
name: xarray-suite
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would this mean that we'd get a separate page here: https://anaconda.org/conda-forge/xarray If so, it might be worth figuring out if we can keep this as xarray

version: {{ version }}

source:
Expand All @@ -11,50 +15,80 @@ source:

build:
noarch: python
script: {{ PYTHON }} -m pip install . -vv
number: 0

requirements:
host:
- python >=3.9
- setuptools >=42
- setuptools-scm >=7
- pip
run:
- python >=3.9
- numpy >=1.23
- pandas >=2.0
- packaging >=23.1
run_constrained:
- bottleneck >=1.3
- cartopy >=0.21
- cftime >=1.6
- dask-core >=2023.4
- distributed >=2023.4
- flox >=0.7
- h5netcdf >=1.1
- h5py >=3.8
- hdf5 >=1.12
- iris >=3.4
- matplotlib-base >=3.7
- nc-time-axis >=1.4
- netcdf4 >=1.6.0
- numba >=0.56
- pint >=0.22
- scipy >=1.10
- seaborn >=0.12
- sparse >=0.14
- toolz >=0.12
- zarr >=2.14
number: 1

test:
imports:
- xarray
- xarray.backends
requires:
- pip
commands:
- pip check
outputs:
- name: xarray-core
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A while back we went from a fully loaded xarray to a "core" but the upstream devs decided for a breaking change and keep the xarray name instead of creating xarray-core. In a way, this is another breaking change. While the patter here is common in conda-forge*, I would go for xarray to be the same and avoid another breaking change and add a xarray-extras (or better yet, use the optional names upstream.)

* The main reason for that is b/c we, conda-forge, tried to ship packages with as much extras we could but we did not had the multiple outputs at the time. When multiple outputs landed, we used the -core pattern to avoid breaking changes. Xarray was an exception at the request of the upstream devs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see pydata/xarray#9149 for the reasoning of the change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the issue is that current xarray is just the minimal

    - python >=3.9
    - numpy >=1.23
    - pandas >=2.0
    - packaging >=23.1

while the rest of the packages is just introduced via run_constrained.

I see. So the impact for current users is, that they would get the new xarray-core plus the new

        # accel
        - flox >=0.7
        - opt_einsum
        - numbagg
        # io & cloud
        - fsspec
        - netcdf4 >=1.6.0
        - zarr >=2.14
        # plotting
        - matplotlib-base >=3.7
        # misc
        - pooch

Not sure what to do now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish they were at this position a few years back, when I did not want to make a breaking change. So, here we go, another breaking change! Go for it. At this point I lost the will to try to communicate what the implications of these changes are to the users and the maintainers here and upstream need to figure this out by themselves.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what to do now.

@kmuehlbauer, my suggestion is to avoid another breaking change, do the multiple outputs but instead of creating xarray-core, create xarray-optional-deps-names-here for the optional deps that are in the pyproject.toml. That would make xarray experience much closer to PyPI's.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dcherian, @shoyer, @max-sixty, do you have any thoughts here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I'm concerned I would be good with

LGTM. I would not create the name -minimal to avoid confusion but as long as the xarray one doesn't change and the others reflect what is in the pyproject.toml, I'm OK. That not only avoid another breaking change but also mimics the original intention in the PyPI package.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To the extent that some burden falls on conda, I can see that our choices are annoying. It sounds like this isn't the first our of changes, which should bring us pause.

That said, even taking into account that pause, I'm fairly confident that the default experience when using xarray should be a package with enough dependencies for it to operate well, and that means including the "highly recommended dependencies".

I don't see how we can realistically make something like xarray-plus the default experience, so I don't see any other alternative than making xarray have the "highly recommended dependencies". Do others?

(even if I'm correct, it doesn't mean that we did this well, and so apologies to @ocefpaf...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's ignore what it was but focus on what it can be. My only question is: why not the same as the package published on PyPI to ensure conda-forge and pip users will have a similar experience?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I very much thought this would be the same change for PyPI and conda.

(Unless someone thought differently? Strong agree with @ocefpaf that they should be the same...)

build:
noarch: generic
script: {{ PYTHON }} -m pip install . -vv
requirements:
host:
- {{ min_python }}
- setuptools >=42
- setuptools-scm >=7
- pip
run:
- {{ min_python }}
- numpy >=1.23
- packaging >=23.1
- pandas >=2.0
test:
imports:
- xarray
requires:
- pip
commands:
- pip check
- name: xarray
build:
noarch: generic
requirements:
host:
- {{ min_python }}
- setuptools >=42
- setuptools-scm >=7
- pip
run:
- {{ min_python }}
- {{ pin_subpackage("xarray-core", max_pin="x.x.x") }}
# accel
- flox >=0.7
- opt_einsum
- numbagg
# io & cloud
- fsspec
- netcdf4 >=1.6.0
- zarr >=2.14
# plotting
- matplotlib-base >=3.7
# misc
- pooch
run_constrained:
- bottleneck >=1.3
- cartopy >=0.21
- cftime >=1.6
- dask-core >=2023.4
- distributed >=2023.4
- h5netcdf >=1.1
- h5py >=3.8
- hdf5 >=1.12
- iris >=3.4
- nc-time-axis >=1.4
- numba >=0.56
- pint >=0.22
- scipy >=1.10
- seaborn >=0.12
- sparse >=0.14
- toolz >=0.12
test:
imports:
- xarray
- xarray.backends
requires:
- pip
commands:
- pip check

about:
home: https://github.com/pydata/xarray
Expand Down