This issue was moved to a discussion.

Coordination with the gridded project? #36

Closed
ChrisBarker-NOAA opened this issue Apr 20, 2022 · 3 comments
Labels
community support Discussion new feature New feature or request

Comments

@ChrisBarker-NOAA

I just noticed a post on the UGRID GitHub project referring to uxarray, and it reminded me to reach out.

I'm the primary developer behind gridded:

https://github.com/NOAA-ORR-ERD/gridded

In a way, its goals are pretty similar to UXarray, except:

It has a completely different API -- more purpose-specific, and not like the xarray API at all. This is for two reasons:

  1. xarray was immature and not ready for real use when we started gridded

  2. xarray is (was?) very array / index oriented, and despite numerous conversations with the xarray team, we didn't see a way to apply the same API to the problems we needed to solve.

What problems are those? Fundamentally, being able to work natively in "world" coordinates -- not having to know or care about the underlying arrays or indexes, etc.

However, I gave a talk about gridded at an AMS conference a few years ago, and a common question was "can I do [this thing] like I do with xarray?" So folks do indeed want the xarray API.

So -- Xarray has come a LONG way in the last ten years, and it may be time to revisit the whole thing.

Also -- we've had trouble maintaining gridded to support features that are really useful, but not what we actively needed for our work -- the whole enterprise needs a broader community. And we've been thinking a major refactor is in order anyway.

All that being said, there's some useful code and ideas in gridded that may be helpful -- I hope you've at least looked at it, and let us know if you have any questions / ideas, etc, and/or are open to more collaboration.

@ChrisBarker-NOAA ChrisBarker-NOAA added the new feature New feature or request label Apr 20, 2022
@erogluorhan
Member

Hi @ChrisBarker-NOAA thanks very much for reaching out and providing a quite transparent overview of gridded!

Your reasons for going without xarray at the time make a lot of sense. Though, as you mentioned, Xarray, and indeed the Pangeo stack more broadly, has come a LONG way.

Xarray and Dask are thus considered the cornerstones of UXarray, for the good of the community. However, that shouldn't prevent us from collaborating, since there is much overlap between our targeted problems and proposed solutions.

We are already inspired by a number of community discussions and tools such as @Huite's Xarray Extension proposal: extension for UGRID and unstructured mesh utils, gridded, etc. That said, more collaboration is always welcome!

@ChrisBarker-NOAA
Author

Xarray and Dask are thus considered the cornerstones of UXarray, for the good of the community

Exactly -- where I'm not sure is whether a package like gridded (i.e. a standard API for working with data on arbitrary types of grids) can reasonably be built using the Xarray API directly -- or if it should be a wrapper around an xarray Dataset.

The challenge is that the xarray API is very much about the arrays themselves -- working with indexes, etc -- and when you get to unstructured grids, that whole way of thinking is not applicable.

For example, a gridded.Variable is NOT an array at all -- it has an array underneath that the user can access if they want, but the idea is that that's not the usual use case. Rather, it is an abstraction for a field, and you can get the value of that field at any lat-lon (or x,y) location, without knowing (or caring) about how the information is stored.
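To make the idea concrete, here is a minimal sketch of a field-style object in plain Python. It is not gridded's actual API: the class name `PointField`, the `at` method, and the `data` property are hypothetical names chosen for illustration, and the lookup is a simple nearest-node search using planar distance in degrees (a real implementation would interpolate within grid cells and handle spherical geometry).

```python
import math


class PointField:
    """Hypothetical field abstraction: values attached to scattered nodes."""

    def __init__(self, lons, lats, values):
        if not (len(lons) == len(lats) == len(values)):
            raise ValueError("lons, lats and values must have the same length")
        self._lons = list(lons)
        self._lats = list(lats)
        self._values = list(values)

    @property
    def data(self):
        """The underlying values, for users who do want the raw storage."""
        return self._values

    def at(self, lon, lat):
        """Return the value at the node nearest to (lon, lat).

        Assumption: planar distance in degrees is good enough for a sketch.
        """
        nearest = min(
            range(len(self._values)),
            key=lambda i: math.hypot(self._lons[i] - lon, self._lats[i] - lat),
        )
        return self._values[nearest]


# The caller asks a question in world coordinates and never touches an index.
temperature = PointField(
    lons=[-122.0, -121.5, -121.0],
    lats=[47.0, 47.5, 48.0],
    values=[10.2, 11.7, 9.4],
)
value = temperature.at(-121.4, 47.6)
```

The point of the design is that the storage layout (here three parallel lists) could change to a triangular mesh or a curvilinear grid without changing the `at(lon, lat)` call.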

But I'll poke a bit more into what you're doing with uxarray, and see where you are going with it.

-CHB

@anissa111 anissa111 added the community support Discussion label Jun 10, 2022
@erogluorhan erogluorhan moved this to 📝 To Do in UXarray Development Sep 9, 2022
@erogluorhan erogluorhan moved this from 📝 To Do to 🩺 Triage in UXarray Development Sep 9, 2022
@benbovy

benbovy commented Nov 24, 2022

Stumbled on this discussion (also some similar things discussed in NOAA-ORR-ERD/gridded#55).

where I'm not sure is whether a package like gridded (i.e. a standard API for working with data on arbitrary types of grids) can reasonably be built using the Xarray API directly -- or if it should be a wrapper around an xarray Dataset

My take on this:

(Disclaimer -- Having worked on Xarray indexes for a while, I certainly have a biased point of view on this! Also, a lot of progress has been made on the Xarray side since @ChrisBarker-NOAA's last comment in this discussion):

While the Xarray API is indeed very much about the arrays themselves, all the recent developments in Xarray turned Dataset / DataArray into very flexible and extensible containers.

With a combination of coordinates (data + metadata), custom Xarray indexes and accessors, it is possible to extend an xarray DataArray or Dataset with a lot of capabilities and APIs well beyond the array-centric Xarray API. You could store and implement almost anything in Xarray indexes and/or accessors, even things that are not strictly array based.

I agree that in theory Xarray Dataset and DataArray are probably not the right level of abstraction for representing grid fields and that a higher level of abstraction certainly makes more sense. That said, it is convenient to deal with objects that we are already familiar with. Building on the various extensibility mechanisms provided by Xarray may be a good practical choice. You still get a very array-based representation (repr) but you could have all the high-level "physical world" API at your fingertips, organized in a tidy way (I think).
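As a rough illustration of the accessor route, the sketch below registers a small accessor on `xarray.Dataset` using `xr.register_dataset_accessor`, which is part of xarray's public extension API. Everything else is an assumption made for the example: the accessor name `world`, the method `value_at`, the `node` dimension, and the `lon`/`lat` coordinate names. It uses a nearest-node lookup rather than a custom Xarray index, so it shows only the accessor half of the idea.

```python
import numpy as np
import xarray as xr


@xr.register_dataset_accessor("world")  # "world" is an illustrative name
class WorldAccessor:
    """Hypothetical accessor exposing a world-coordinate lookup."""

    def __init__(self, ds):
        self._ds = ds

    def value_at(self, name, lon, lat):
        """Return variable `name` at the node nearest to (lon, lat).

        Assumes 1-D "lon" and "lat" coordinates that share a dimension with
        the variable, and planar distance in degrees.
        """
        dlon = self._ds["lon"].values - lon
        dlat = self._ds["lat"].values - lat
        nearest = int(np.argmin(dlon**2 + dlat**2))
        return float(self._ds[name].values[nearest])


# A tiny unstructured dataset: three nodes with one variable.
ds = xr.Dataset(
    {"temperature": ("node", [10.2, 11.7, 9.4])},
    coords={
        "lon": ("node", [-122.0, -121.5, -121.0]),
        "lat": ("node", [47.0, 47.5, 48.0]),
    },
)
value = ds.world.value_at("temperature", -121.4, 47.6)
```

The Dataset keeps its usual array-based repr and methods, while the "physical world" calls live under the `ds.world` namespace.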

@ChrisBarker-NOAA since you mention "world coordinates", you might be interested in the discussion in sunpy/ndcube#222, where there's an example of an Xarray "WCSIndex" that may be relevant here too. Many other examples of indexes are gathered in pydata/xarray#7041 (still very much a work in progress).

@philipc2 philipc2 moved this from 📝 To Do to 📚 Backlog in UXarray Development Aug 22, 2023
@UXARRAY UXARRAY locked and limited conversation to collaborators Aug 29, 2023
@philipc2 philipc2 converted this issue into discussion #422 Aug 29, 2023
@github-project-automation github-project-automation bot moved this from 📚 Backlog to ✅ Done in UXarray Development Aug 29, 2023
