feat: Add cat.starts_with/cat.ends_with #20257

Open
wants to merge 8 commits into main

mcrumiller
Contributor

@mcrumiller mcrumiller commented Dec 11, 2024

Follow up to #20211.

Edit 2024-12-18: the expression implementation has been removed, along with the slow path. cat.starts_with and cat.ends_with now require str inputs and do not accept expressions.

@github-actions github-actions bot added the enhancement, python, and rust labels Dec 11, 2024

codecov bot commented Dec 11, 2024

Codecov Report

Attention: Patch coverage is 96.77419% with 2 lines in your changes missing coverage. Please review.

Project coverage is 79.10%. Comparing base (117a0ba) to head (c23630d).
Report is 42 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| crates/polars-plan/src/dsl/function_expr/cat.rs | 93.54% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #20257      +/-   ##
==========================================
- Coverage   79.65%   79.10%   -0.56%     
==========================================
  Files        1565     1572       +7     
  Lines      218281   220015    +1734     
  Branches     2475     2467       -8     
==========================================
+ Hits       173878   174046     +168     
- Misses      43836    45401    +1565     
- Partials      567      568       +1     


@mcrumiller mcrumiller marked this pull request as ready for review December 11, 2024 14:00
@ritchie46
Member

I am really not sure about the cast. Will have to think about this for a bit.

@mcrumiller
Contributor Author

mcrumiller commented Dec 16, 2024

> I am really not sure about the cast. Will have to think about this for a bit.

We could simply disallow nonscalar expression inputs. cat.starts_with is a distinct namespace from str.starts_with so I think it's reasonable to have different rules in place regarding the allowable inputs. It would definitely simplify things as well.

Allowing nonscalar expression inputs leans towards treating categoricals as a string optimization (and in this case it wouldn't even be an optimization), which I know is not their intended use. So I propose only allowing scalar inputs here, which would remove the slow path completely. What do you think?
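To illustrate why a scalar input makes a slow path unnecessary, here is a minimal pure-Python sketch of the idea. It is not the code in this PR: the function name `cat_starts_with` and the list-based `categories`/`codes` representation are assumptions made for illustration. With a single scalar prefix, the predicate only needs to run once per distinct category, and each row then looks up its answer through its integer code, so no cast to strings is needed.

```python
from typing import List, Optional


def cat_starts_with(
    categories: List[str],
    codes: List[Optional[int]],
    prefix: str,
) -> List[Optional[bool]]:
    """Hypothetical model of a scalar-prefix check on a categorical column.

    `categories` holds the distinct string values; `codes` holds one index
    into `categories` per row, with None standing in for a null row.
    """
    # Evaluate the predicate once per distinct category, not once per row.
    lookup = [category.startswith(prefix) for category in categories]
    # Broadcast the per-category answers to the rows via their codes.
    return [None if code is None else lookup[code] for code in codes]


categories = ["apple", "apricot", "banana"]
codes = [0, 2, None, 1, 0]
result = cat_starts_with(categories, codes, "ap")
print(result)
```

A nonscalar input would supply a different prefix per row, so the per-category lookup table above could not be built, which is presumably where the cast to strings and the slow path came from.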

@mcrumiller mcrumiller marked this pull request as draft December 18, 2024 22:44
@mcrumiller mcrumiller marked this pull request as ready for review December 19, 2024 14:20
@mcrumiller
Contributor Author

@Ritchie I've removed the slow path and switched to allowing only str input arguments (similar to, say, str.strptime). I think this makes a lot more sense for categoricals. Since we no longer accept expressions, there is no slow path left. This also means that None is no longer accepted either.
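The str-only rule described above can be sketched as a small argument check. This is an illustrative stand-in rather than the validation actually used in this PR; the helper name `require_str` and the error message wording are assumptions.

```python
def require_str(value: object, name: str) -> str:
    """Hypothetical check: accept only a plain str for `name`.

    Anything else, including None and expression-like objects, is rejected.
    """
    if not isinstance(value, str):
        raise TypeError(
            f"'{name}' must be a string; found {type(value).__name__!r}"
        )
    return value


accepted = require_str("ap", "prefix")

try:
    require_str(None, "prefix")
    none_rejected = False
except TypeError:
    none_rejected = True

print(accepted, none_rejected)
```

Rejecting non-string inputs up front means the function never has to decide between a fast and a slow evaluation strategy at run time.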
