Don't specify AbstractArray for column types in DataFrame rcopy #528
This allows other packages to hook into the conversion system when loading R DataFrames.
It looks like this change breaks single-row DataFrame conversion. Is there another way to get around this, maybe a |
I'm looking. I can merge. Why is CI in such a poor state though?
This particular change breaks one of the CI examples, specifically: rcopy(R"data.frame(a=1,b=2)") which internally receives a Float64 for each column rather than a vector for each column. I'm not 100% sure how to solve this (though it should definitely be solved before merging). The issue is that since we aren't passing |
The fundamental problem is that R doesn't actually have native scalar values -- a "scalar" in R is just a vector of length 1.
I'm not familiar with sf, but I'm guessing it's from https://r-spatial.github.io/sf/ ? The correct fix wouldn't be to change the behavior of a general R dataframe, but rather to change the conversion behavior for the relevant type. Notably, R objects can have multiple types, so I suspect sf objects are also data.frame? Let's see how we can implement this with a simpler example.
We start by defining the conversion when given an R object and a target Julia type.
julia> using RCall
julia> struct Custom end
R> df <- data.frame(a=1)
julia> rcopy(R"df")
1×1 DataFrame
Row │ a
│ Float64
─────┼─────────
1 │ 1.0
julia> RCall.rcopy(::Type{Custom}, s::Ptr{Sxp}) = Custom()
julia> rcopy(Custom, R"df")
Custom()
This may already do what you need, but it requires a bit more manual control from the caller. We can also define a default copy type:
julia> RCall.rcopytype(::Type{RCall.RClass{:custom}}, s::Ptr{VecSxp}) = Custom
The way to give an object multiple types in R is simply to assign a vector to its class:
R> class(df) <- c("data.frame", "custom")
And now the default conversion just works:
julia> rcopy(R"df")
Custom()
We can also still force the conversion to DataFrame:
julia> rcopy(RCall.DataFrame, R"df")
1×1 DataFrame
Row │ a
│ Float64
─────┼─────────
1 │ 1.0
These conversion methods would probably be great as a package extension somewhere (I'm unsure whether here in RCall or in the relevant spatial package would be better).
Huh! OK, I didn't realize that other classes are able to override On a more general level, though, this does mean that anything within a |
You can always define a custom |
I see. I understand better now.
Anyways, I want to see CI passing here. If you need to make the change to CI to make it pass, do it. We can then consider how much breakage is involved and consider if we need to jump major versions.
Please no, don't change the tests. The tests here caught the introduction of arguably incorrect behavior. The correct behavior is using the explicit conversion target type and defining an appropriate method.
My use case here is to provide seamless interop between R and Julia for sf dataframes, so people can take advantage of all of R's tooling without having to re-implement it in Julia.
An example of the use is that by defining rcopytype and rcopy for sfc_MULTIPOLYGON, which is a simple features collection of multipolygons, I can get those geometries converted to the relevant GeoInterface.jl geometry representation for free.