Add some sort of Strict option #20

Open
pdeffebach opened this issue Jun 8, 2020 · 1 comment

Comments

@pdeffebach

This issue is motivated by a recent PR to DataFrames here. We would like to add the following functionality:

julia> df = DataFrame(a = rand(2), b = rand(2));

julia> select(df, Not(:c))
2×2 DataFrame
│ Row │ a        │ b         │
│     │ Float64  │ Float64   │
├─────┼──────────┼───────────┤
│ 1   │ 0.916099 │ 0.0552436 │
│ 2   │ 0.998861 │ 0.310562  │

This currently throws an error because :c is not a column of df. It would be nice if it didn't, since you often want to drop columns just to "clean things up" without worrying about whether each column really exists.

Obviously, this would be inconsistent with other uses of InvertedIndices: indexing the columns of a DataFrame with Not would behave differently from indexing its rows.

One solution is to add an option to InvertedIndices that lets the user specify whether they care about excluding things that don't exist in the DataFrame. Perhaps a constructor like:

Not(:c, strict = false)

The option would then be stored in a field of the Not object, so that downstream code can specialize its behavior on it.
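To make the idea concrete, here is a minimal sketch of what carrying the flag in a field could look like. It uses only Base Julia and a stand-in type; `MaybeStrictNot` and `resolve` are hypothetical names for illustration, not part of InvertedIndices or DataFrames, and the real `Not` does not currently accept a `strict` keyword.

```julia
# Hypothetical stand-in for an inverted index that carries a strictness flag.
# This is NOT the InvertedIndices.jl type; every name here is illustrative.
struct MaybeStrictNot{T}
    skip::T
    strict::Bool
end

# Keyword constructor mirroring the proposed `Not(:c, strict = false)` call.
# Strict by default, so existing behavior would be unchanged.
MaybeStrictNot(skip; strict::Bool = true) = MaybeStrictNot(skip, strict)

# Resolve the inverted index against the column names that actually exist.
# A downstream package could branch on `idx.strict` in the same way.
function resolve(names::Vector{Symbol}, idx::MaybeStrictNot{Symbol})
    if idx.strict && !(idx.skip in names)
        throw(ArgumentError("column :$(idx.skip) not found"))
    end
    return filter(!=(idx.skip), names)
end

cols = [:a, :b]
resolve(cols, MaybeStrictNot(:b))                  # drops :b
resolve(cols, MaybeStrictNot(:c; strict = false))  # :c is absent, nothing dropped
```

With the default `strict = true`, `resolve(cols, MaybeStrictNot(:c))` throws an `ArgumentError`, which matches the current erroring behavior described above.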

Let me know what you think. It's certainly not the only path to getting the behavior we want, but it might be fruitful.

@ararslan
Member

ararslan commented Dec 3, 2024

I think this is more of a property of the downstream use of the inverted index than of the index itself. For example, how would you perform a non-strict select using a regular (i.e. not Not) index, like select(df, :c) but allowing :c to be ignored if it doesn't exist in df? Clearly that functionality can't be baked into the object used to index, since you can't make a non-strict Symbol, String, Int, etc. Baking it into Not but not having it similarly available to other index types feels a bit odd.

I agree with the desire for the functionality you describe, as I've run into that myself. However, given the above, I think the solution should be implemented in DataFrames as part of its API rather than in InvertedIndices as part of Not.
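As a rough sketch of that alternative, strictness can be a keyword of the selecting function rather than a property of the index, so it applies equally to plain and inverted selections. The example below uses a NamedTuple of vectors as a stand-in table so that it runs with Base Julia alone; `keep_columns` and `drop_columns` are hypothetical helpers, not existing DataFrames functions.

```julia
# Hypothetical helpers operating on a NamedTuple-of-vectors "table".
# Neither function exists in DataFrames.jl; they only illustrate putting the
# strictness option on the downstream API instead of on the index object.

# Throw if strict and any requested column is absent from the table.
function check_columns(table::NamedTuple, cols, strict::Bool)
    absent = [c for c in cols if !(c in keys(table))]
    if strict && !isempty(absent)
        throw(ArgumentError("columns not found: $(absent)"))
    end
    return nothing
end

# Plain selection: the non-strict analogue of `select(df, :c)`.
function keep_columns(table::NamedTuple, cols::Symbol...; strict::Bool = true)
    check_columns(table, cols, strict)
    keep = Tuple(c for c in cols if c in keys(table))
    return NamedTuple{keep}(table)
end

# Inverted selection: the non-strict analogue of `select(df, Not(:c))`.
function drop_columns(table::NamedTuple, cols::Symbol...; strict::Bool = true)
    check_columns(table, cols, strict)
    keep = Tuple(k for k in keys(table) if !(k in cols))
    return NamedTuple{keep}(table)
end

t = (a = [1, 2], b = [3, 4])
drop_columns(t, :c; strict = false)      # returns both columns unchanged
keep_columns(t, :a, :c; strict = false)  # returns only column a
```

Because the flag lives on the function, the same option covers Symbol, String, or Int selectors without needing a "non-strict" variant of each index type.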
