Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: set __module__ on IndexSlice #60269

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

shengjie2013
Copy link

@simonjayhawkins
Copy link
Member

@jorisvandenbossche this one is a bit different?

@simonjayhawkins simonjayhawkins added Output-Formatting __repr__ of pandas objects, to_string Sprints Sprint Pull Requests labels Nov 9, 2024
@simonjayhawkins simonjayhawkins added this to the 3.0 milestone Nov 9, 2024
Copy link
Member

@simonjayhawkins simonjayhawkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @shengjie2013 for the PR

@@ -98,6 +101,7 @@


# the public IndexSlicerMaker
@set_module("pandas")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure about this....

>>> pandas.IndexSlice
<pandas._IndexSlice object at 0x7ff1e9305bd0>
>>> pandas._IndexSlice
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'pandas' has no attribute '_IndexSlice'. Did you mean: 'IndexSlice'?
>>> ```

since the purpose of #55178 is that the repr should be the path end users access the objects. This is an odd case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The actual IndexSlice object is defined a bit lower as IndexSlice = _IndexSlice(), so we would need to update the __module__ of that object, I think

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See

IndexSlice = _IndexSlice()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although, for a class instance, that doesn't actually seem to work:

In [3]: pd.IndexSlice.__module__ = "pandas"

In [4]: pd.IndexSlice
Out[4]: <pandas.core.indexing._IndexSlice at 0x7fa7d01c3610>

In this case we might need to add a def __repr__ to the _IndexSlice class if we want to improve this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we would need to update the __module__ of that object, I think

so it looks like the way this would be done is instance.__class__.__module__ = 'new_module_name' which changes the module attribute for the class which is what has been done anyway.

the output for a class is of the form <class 'pandas.core.indexing._IndexSlice'>

whereas the output for a class instance is of the form <pandas.core.indexing._IndexSlice object at 0x7fda1576c280>

now changing the __module__ attribute on the class (as done already in this PR) gives <pandas._IndexSlice object at 0x7f6442deff70> which is not correct since `_IndexSlice is not a top level class.

I'm assuming that if we want to change this, we would want <pandas.IndexSlice object at 0x7f6442deff70>? This would be technically incorrect also?

In this case we might need to add a def repr to the _IndexSlice class if we want to improve this.

in another discussion it was mentioned that the repr is not being used in IPython?

It appears that changing the __name__ attribute on the _IndexSlice class (or alias) doesn't get the desired result either.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in another discussion it was mentioned that the repr is not being used in IPython?

I think that was about function objects, but here we have an instance of a class, and that should use its repr (that's how we define repr of other pandas objects)

But it seems you have found another solution anyway

@jorisvandenbossche jorisvandenbossche changed the title IndexSliceupdate ENH: set __module__ on IndexSlice Nov 11, 2024
@simonjayhawkins
Copy link
Member

i'll try something that does give the correct output and is technically correct but we'll see what else breaks!

>>> pd.IndexSlice
<pandas.IndexSlice object at 0x7f2528c70d60>

@simonjayhawkins
Copy link
Member

i'll try something that does give the correct output and is technically correct

in jupyter notebook the output is <pandas.IndexSlice at 0x7f2d681cd870>

but we'll see what else breaks!

seems fine so far except for the pre-commit error

@@ -145,9 +149,6 @@ def __getitem__(self, arg):
return arg


IndexSlice = _IndexSlice()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if we should keep a _IndexSlice = IndexSlice here for backwards compatibility (I know this is private, but you never know ..)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a quick github search I don't find much public use, but for example the stubs use the current import to get the class for type annotations

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no problem. can do that if you are happy with the rest of it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a quick github search I don't find much public use, but for example the stubs use the current import to get the class for type annotations

hmm. I'll look in more detail (probs tomorrow now) before making the changes since if the class is needed for type annotations then perhaps the class should be exposed in pandas.api.typing and the repr should point to that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Output-Formatting __repr__ of pandas objects, to_string Sprints Sprint Pull Requests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants