Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise on __eq__ binary operations between unsupported types. #11609

Draft
wants to merge 4 commits into
base: branch-24.06
Choose a base branch
from

Conversation

bdice
Copy link
Contributor

@bdice bdice commented Aug 26, 2022

Description

This PR fixes a bug where calling __eq__ / == binary operations in unsupported cases (Series of lists, Series of structs -- soon to be supported) called through to object.__eq__ instead of raising as expected, resulting in a cryptic error TypeError: object of type 'bool' has no len(). In this PR, we alter the BinaryOperand class to have a default __eq__ method (actually a descriptor) that is defined as a mixin Operation, so that classes inheriting from this mixin can override its behavior.

To-do:

  • Add tests that == raises when called with Series of lists and Series of structs (not yet supported in Python)
  • Fix behavior and add tests that cudf.Series(["a", "b"]) == cudf.Series([1, 2]) raises the same error

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@github-actions github-actions bot added the Python Affects Python cuDF API. label Aug 26, 2022
@bdice bdice added bug Something isn't working non-breaking Non-breaking change labels Aug 26, 2022
@bdice bdice self-assigned this Aug 26, 2022
Comment on lines 62 to 72
if op == "__eq__":
raise TypeError(
"'==' not supported between instances of "
f"'{type(self).__name__}' and '{type(other).__name__}'"
)
if op == "__ne__":
raise TypeError(
"'!=' not supported between instances of "
f"'{type(self).__name__}' and '{type(other).__name__}'"
)
raise NotImplementedError()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be cleaner to leave BinaryOperand._binaryop as-is, and just patch the __eq__ and __ne__ methods? Or do we need to change all three methods?

@codecov
Copy link

codecov bot commented Aug 26, 2022

Codecov Report

❗ No coverage uploaded for pull request base (branch-23.02@ccd72f2). Click here to learn what that means.
Patch has no changes to coverable lines.

❗ Current head 31412db differs from pull request most recent head ef0f676. Consider uploading reports for the commit ef0f676 to get more accurate results

Additional details and impacted files
@@               Coverage Diff               @@
##             branch-23.02   #11609   +/-   ##
===============================================
  Coverage                ?   88.21%           
===============================================
  Files                   ?      137           
  Lines                   ?    22592           
  Branches                ?        0           
===============================================
  Hits                    ?    19930           
  Misses                  ?     2662           
  Partials                ?        0           

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@github-actions
Copy link

This PR has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this PR if it is no longer required. Otherwise, please respond with a comment indicating any updates. This PR will be labeled inactive-90d if there is no activity in the next 60 days.

@shwina shwina changed the base branch from branch-22.10 to branch-23.02 November 22, 2022 18:03
@bdice bdice changed the base branch from branch-23.02 to branch-23.04 January 26, 2023 16:59
@vyasr
Copy link
Contributor

vyasr commented Jan 23, 2024

@bdice now that we have full nested type support is this PR still relevant?

@bdice
Copy link
Contributor Author

bdice commented May 2, 2024

This is still an issue -- the wrong error is raised when comparing nested types.

>>> cudf.Series([[1, 2, 3], [4, 5]]) == cudf.Series([{"a": 1}, {"a": 2}])
...
TypeError: object of type 'bool' has no len()

>>> pd.Series([[1, 2, 3], [4, 5]]) == pd.Series([{"a": 1}, {"a": 2}])
0    False
1    False
dtype: bool

This is also a problem for non-nested equality comparisons:

>>> cudf.Series(["a", "b"]) == cudf.Series([1, 2])
...
TypeError: object of type 'bool' has no len()

>>> pd.Series(["a", "b"]) == pd.Series([1, 2])
0    False
1    False
dtype: bool

@bdice bdice changed the base branch from branch-23.04 to branch-24.06 May 2, 2024 22:30
Copy link

copy-pr-bot bot commented May 2, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@bdice
Copy link
Contributor Author

bdice commented May 3, 2024

I'm going to try and refactor

void apply_sorting_struct_binary_op(mutable_column_view& out,
into an equality/comparator operator for cudf::binary_operation that works with lists and structs. That should help unblock the lists/structs cases in this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants