as.data.frame(<RPolarsDataFrame>) seems slow #1079
The current implementation of the next branch is much slower.

# Construct an Arrow array from an R vector
long_vec_1 <- 1:10^6
bench::mark(
arrow = {
arrow::as_arrow_array(long_vec_1)
},
nanoarrow = {
nanoarrow::as_nanoarrow_array(long_vec_1)
},
polars = {
polars::as_polars_series(long_vec_1)
},
neopolars = {
neopolars::as_polars_series(long_vec_1)
},
check = FALSE,
min_iterations = 5
)
#> # A tibble: 4 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 arrow        2.62ms   2.92ms      328.   19.82MB     2.04
#> 2 nanoarrow  496.13µs 644.87µs     1252.  458.41KB     2.03
#> 3 polars       2.06ms   2.26ms      405.    6.33MB        0
#> 4 neopolars    84.6ms   90.1ms      10.9    1.59MB        0

# Export Arrow data as an R vector
arrow_array_1 <- arrow::as_arrow_array(long_vec_1)
nanoarrow_array_1 <- nanoarrow::as_nanoarrow_array(long_vec_1)
polars_series_1 <- polars::as_polars_series(long_vec_1)
neopolars_series_1 <- neopolars::as_polars_series(long_vec_1)
bench::mark(
arrow = {
as.vector(arrow_array_1)
},
nanoarrow = {
as.vector(nanoarrow_array_1)
},
polars = {
as.vector(polars_series_1)
},
neopolars = {
as.vector(neopolars_series_1)
},
check = TRUE,
min_iterations = 5
)
#> # A tibble: 4 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 arrow       13.94µs  15.84µs    46309.    4.59KB     4.63
#> 2 nanoarrow   559.9µs   1.85ms      513.    3.85MB     72.8
#> 3 polars       6.45ms   8.79ms      112.    5.93MB     9.13
#> 4 neopolars  148.82ms 164.65ms      6.02    5.24MB        0

Created on 2024-09-05 with reprex v2.1.1

This is strange because the construction process seems to be almost identical (the main branch branches with or without

r-polars/src/rust/src/conversion_r_to_s.rs Lines 138 to 150 in f55eade
r-polars/src/rust/src/series/construction.rs Lines 22 to 28 in 72897e5
Is the superior export speed of |
It seems to take 100 times longer than the conversion from arrow::Table. Could arrow be using ALTREP to defer the materialization until later?
Created on 2024-05-06 with reprex v2.1.0
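
For context on the ALTREP question, here is a minimal base-R sketch of what deferred materialization looks like. It does not use arrow or polars and says nothing about how either package is implemented. It only shows that a compact integer sequence such as 1:10^6 (the same input used in the benchmarks) is held in a compact ALTREP form until something forces a full copy. The helper is_compact() is hypothetical, and it relies on .Internal(inspect()), an undocumented debugging aid, so the exact output format is an assumption.

```r
# Hypothetical helper (not part of any package): reports whether R's internal
# inspector describes `x` as a compact (ALTREP) sequence.
# Assumption: .Internal(inspect()) prints the word "compact" for such vectors.
is_compact <- function(x) {
  out <- capture.output(.Internal(inspect(x)))
  any(grepl("compact", out, fixed = TRUE))
}

# Same input as the benchmarks: a compact integer sequence, no full buffer yet.
long_vec_1 <- 1:10^6
print(is_compact(long_vec_1))

# Arithmetic allocates an ordinary, fully materialized integer vector.
materialized <- long_vec_1 + 0L
print(is_compact(materialized))

# The two vectors hold the same values even though their storage differs.
print(identical(long_vec_1, materialized))
```

If an export function returns an ALTREP-backed vector, a benchmark that only calls as.vector() may measure little more than wrapping a pointer, so forcing materialization inside the timed expression would likely give a fairer comparison.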