(fix): ExtensionArray + DataArray roundtrip #9520

Merged

Conversation

ilan-gold
Contributor

As a drive-by change, I also added `name` to the Series, since it was not set before.

@ilan-gold ilan-gold changed the title from "(fix): fix extension array + dataarray roundtrip" to "(fix): ExtensionArray + DataArray roundtrip" Sep 19, 2024
xarray/core/variable.py (review thread: outdated, resolved)
xarray/core/dataarray.py (review thread: outdated, resolved)
@dcherian dcherian added the plan to merge Final call for comments label Sep 19, 2024
@dcherian dcherian merged commit e649e13 into pydata:main Sep 21, 2024
34 checks passed
@ilan-gold ilan-gold deleted the ig/fix_extension_array_dataarray_roundtrip branch September 23, 2024 10:21
hollymandel pushed a commit to hollymandel/xarray that referenced this pull request Sep 23, 2024
* (fix): fix extension array + dataarray roundtrip

* (fix): satisfy mypy

* (refactor): move check out of `Variable.values`

* (fix): ensure `mypy` is happy with `values` typing

* (fix): setter with `mypy`

* (fix): remove case of `values`
@shoyer
Member

shoyer commented Oct 1, 2024

It appears that this PR may have broken some upstream pandas tests, specifically the ones that test round-trips with various index types:
https://github.com/pandas-dev/pandas/blob/e78ebd3f845c086af1d71c0604701ec49df97228/pandas/tests/generic/test_to_xarray.py#L32

Here's a minimal test case:

import pandas as pd
import numpy as np

cat = pd.Categorical(list("abcd"))
df = pd.DataFrame({"f": cat}, index=cat)
restored = df.to_xarray().to_dataframe()
print(restored.index)  # Index(['a', 'b', 'c', 'd'], dtype='object', name='index')
print(df.index)  # CategoricalIndex(['a', 'b', 'c', 'd'], categories=['a', 'b', 'c', 'd'], ordered=False, dtype='category')

I'm not sure if this is a pandas or xarray issue, but it's one or the other!

(My guess is that most of these tests in pandas should probably live in xarray instead, given that we implement all the conversion logic.)
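If such a check were moved into xarray, it might look roughly like the sketch below. This is only an illustration built on the minimal test case above: the helper name `describe_index_roundtrip` and the keys of the returned dict are hypothetical, not part of pandas or xarray. It reports whether the index type survives the round-trip rather than asserting it, since the result depends on the installed xarray version.

```python
import pandas as pd
import xarray as xr  # DataFrame.to_xarray() needs xarray installed


def describe_index_roundtrip(df: pd.DataFrame) -> dict:
    """Round-trip ``df`` through xarray and summarize what happened to its index.

    Hypothetical helper for illustration only.
    """
    restored = df.to_xarray().to_dataframe()
    return {
        "original_type": type(df.index).__name__,
        "restored_type": type(restored.index).__name__,
        # The labels themselves should survive regardless of the index type.
        "values_equal": list(restored.index) == list(df.index),
        # True only if the round-trip kept the same index class.
        "type_preserved": type(restored.index) is type(df.index),
    }


cat = pd.Categorical(list("abcd"))
df = pd.DataFrame({"f": cat}, index=cat)

report = describe_index_roundtrip(df)
print("xarray version:", xr.__version__)
print(report)
```

A real regression test would presumably assert that `type_preserved` is true for each index type of interest, once it is settled which library owns the behaviour.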

Labels
plan to merge Final call for comments
Development

Successfully merging this pull request may close these issues.

to_pandas on DataArray with extension array data type
3 participants