Add polars selectors in filters example in the doc #20360

Open
ShootingStarD opened this issue Dec 19, 2024 · 2 comments
Labels
documentation Improvements or additions to documentation

Comments

@ShootingStarD

Description

The current selectors documentation has no example of how to use selectors inside a filter. At first I thought it was not possible, but there was a PR that made it possible.

Here is a working example:

    import polars as pl
    import polars.selectors as cs
    from polars.testing import assert_frame_equal

    df = pl.DataFrame({"a": [1, 2, 3]})

    # Keep only the rows where every selected integer column is <= 2.
    df = df.filter(pl.all_horizontal((cs.by_name("^.*$") & cs.integer()) <= 2))
    expected_df = pl.DataFrame({"a": [1, 2]})

    assert_frame_equal(df, expected_df)

I think it would be important to add at least one example to help readers understand how to use them together. I made several attempts before finding the solution.

Link

https://docs.pola.rs/api/python/stable/reference/selectors.html

@ShootingStarD ShootingStarD added the documentation Improvements or additions to documentation label Dec 19, 2024
@rodrigogiraoserrao
Collaborator

rodrigogiraoserrao commented Dec 19, 2024

Hey, thanks for opening this issue.

I need your help to better understand what you are saying.
Before finding the PR you linked, why did you think you couldn't use polars.selectors inside filter?
In what scenarios did you think you could use polars.selectors?
Your answers to these questions will help us improve the documentation.

Thanks for your time.

@ShootingStarD
Author

I didn't think it was possible because:

  • there was no example of pl.selectors being used with .filter on the doc page
  • I didn't find any examples on Stack Overflow
  • I tried something like (pl.selectors.starts_with("test_") > 3).any() and other combinations, without success

Now that I know how to do it, it does seem obvious that I had to use a horizontal function. However, it was not obvious to me at first, and it could be the same for other people.

To answer the second question: I thought I could use polars.selectors to select multiple columns in .select(), .with_columns(), or .group_by() and .agg(), but those are straightforward and do not need more examples.

We saw that selectors can be used with filters, but we have to be aware that this creates one boolean expression for each selected column, and we must specify how to combine them (using pl.any_horizontal() or pl.all_horizontal()). This is not specified on the page.

At some point I tried to use pl.when() with the selectors but did not succeed. Maybe I should try again with a horizontal function, but it would be nice to show an example of how to do it, or to explain that it cannot be done.

Thanks for your amazing work
