Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

De-duplicate inherited coordinates & dimensions #9510

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

shoyer
Copy link
Member

@shoyer shoyer commented Sep 17, 2024

Comment on lines +497 to +504
@contextlib.contextmanager
def _with_conflicting_coords_mode(self, mode: ConflictingCoordsMode):
"""Set handling of duplicated inherited coordinates on this object.

This private context manager is an indirect way to control the handling
of duplicated inherited coordinates, which is done inside _post_attach()
and hence is difficult to directly parameterize.
"""
Copy link
Contributor

@TomNicholas TomNicholas Sep 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we want to allow two modes for this? My understanding from the conversation in #9475 (comment) is that we would add a flag for use once during initial construction by open_datatree, but not a global flag or one that sets state of the tree object. Like I would have thought that this could just be a function/method parameter rather than a contextmanager or an attribute.

Copy link
Contributor

@TomNicholas TomNicholas Sep 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summarizing discussion from the meeting just now: this is an internal implementation detail, not something that is meant to be publicly accessible. The aim is still that all public DataTree objects are de-duplicated.

The issue is that the natural place to put de-duplication in is in ._post_attach, but ._post_attach doesn't accept any parameters (so we can't pass mode). So this global flag & associated context manager are just ways to pass state into the ._post_attach method.

@shoyer maybe the least-worst thing to do here is just to change the signature of ._post_attach to

def _post_attach(self: Tree, parent: Tree, name: str, mode: bool | None) -> None:

where TreeNode._post_attach doesn't use mode?


def _dedup_inherited_coordinates(self):
removed_something = False
for name in self._coord_variables.parents:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me a moment here to realise that .parents was ChainMap.parents, not DataTree.parents.

def _dedup_inherited_coordinates(self):
removed_something = False
for name in self._coord_variables.parents:
if name in self._node_coord_variables:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the condition for considering two coordinates to be duplicates is just that their names are the same? Isn't that going to cause problems when you do want to override a parent's coordinate with a new coordinate that has the same name but different values?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-DataTree Related to the implementation of a DataTree class
Projects
Status: In progress
Development

Successfully merging this pull request may close these issues.

2 participants