Fix feather build failure
And add a bit more on other types of data
hadley committed Jul 11, 2016
1 parent 0bd0021 commit 436d578
Showing 2 changed files with 22 additions and 8 deletions.
1 change: 0 additions & 1 deletion DESCRIPTION
@@ -12,7 +12,6 @@ Imports:
broom,
dplyr,
DSR,
feather,
gapminder,
ggplot2,
hexbin,
29 changes: 22 additions & 7 deletions import.Rmd
@@ -567,30 +567,45 @@ This makes csvs a little unreliable for caching interim results - you need to re
1. The feather package implements a fast binary file format that can
be shared across programming languages:
```{r}
```{r, eval = FALSE}
library(feather)
write_feather(challenge, "challenge.feather")
read_feather("challenge.feather")
#> # A tibble: 2,000 x 2
#> x y
#> <dbl> <date>
#> 1 404 <NA>
#> 2 4172 <NA>
#> 3 3004 <NA>
#> 4 787 <NA>
#> 5 37 <NA>
#> 6 2332 <NA>
#> # ... with 1,994 more rows
```
feather tends to be faster than rds and is usable outside of R. `rds` supports list-columns (which you'll learn about in [[Many models]]), which feather does not yet.
```{r, include = FALSE}
file.remove("challenge-2.csv")
file.remove("challenge.rds")
file.remove("challenge.feather")
```

## Other types of data

We have worked on a number of packages to make importing data into R as easy as possible. These packages are certainly not perfect, but they are the best place to start because they behave as similar as possible to readr.
To get other types of data into R, we recommend starting with the packages listed below. They're certainly not perfect, but they are a good place to start as they are fully fledged members of the tidyverse.

Two packages helper
For rectangular data:

* haven reads files from other SPSS, Stata, and SAS files.
* haven reads SPSS, Stata, and SAS files.

* readxl reads excel files (both `.xls` and `.xlsx`).

There are two common forms of hierarchical data: XML and json. We recommend using xml2 and jsonlite respectively. These packages are performant, safe, and (relatively) easy to use. To work with these effectively in R, you'll need to x
* DBI, along with a database specific backend (e.g. RMySQL, RSQLite,
RPostgreSQL etc) allows you to run SQL queries against a database
and return a data frame.
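
As an illustration of the DBI bullet above (not part of this commit), here is a minimal sketch of running a SQL query and getting a data frame back. It assumes the RSQLite backend is installed and uses an in-memory database populated from the built-in `mtcars` dataset; the table name and the query are chosen only for the example.

```r
library(DBI)

# Assumption: the RSQLite backend is installed. ":memory:" creates a
# temporary in-memory database, so nothing is written to disk.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Copy a built-in data frame into a table so there is something to query.
dbWriteTable(con, "mtcars", mtcars)

# Run a SQL query; the result comes back as a data frame.
heavy <- dbGetQuery(con, "SELECT mpg, cyl, wt FROM mtcars WHERE wt > 5")

# Always close the connection when finished.
dbDisconnect(con)
```

Other backends (e.g. RMySQL, RPostgreSQL) should follow the same pattern, with only the `dbConnect()` arguments changing.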

For hierarchical data:

* jsonlite (by Jeroen Ooms) reads json

If your data lives in a database, you'll need to use the DBI package. DBI provides a common interface that works with many different types of database. R's support is particularly good for open source databases (e.g. RPostgres, RMySQL, RSQLite, MonetDBLite).
* xml2 reads XML.
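
As an illustration of the two hierarchical-data packages above (not part of this commit), the following sketch parses a small JSON string with jsonlite and a small XML string with xml2. The inline strings and variable names are made up for the example; real data would usually come from a file or URL.

```r
library(jsonlite)
library(xml2)

# Hypothetical inline JSON: an array of records with the same fields.
json <- '[{"name": "a", "value": 1}, {"name": "b", "value": 2}]'

# fromJSON() simplifies an array of records into a data frame by default.
df <- fromJSON(json)

# Hypothetical inline XML with two <item> elements.
doc <- read_xml("<items><item>a</item><item>b</item></items>")

# Select nodes with an XPath expression, then extract their text.
items <- xml_text(xml_find_all(doc, "//item"))
```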
