Tidy chapter updates

svigneau · Jul 25, 2016 · a2ff3ec
1 parent f1cc208
commit a2ff3ec

Showing 4 changed files with 382 additions and 250 deletions.
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -35,6 +35,7 @@ Imports:
   tidyr
 Remotes:
   hadley/modelr,
+  hadley/tidyr,
   hadley/readr,
   hadley/stringr,
   rstudio/bookdown,

diff --git a/tibble.Rmd b/tibble.Rmd
@@ -32,6 +32,17 @@ tibble(x = 1:5, y = 1, z = x ^ 2 + y)
 
 `tibble()` automatically recycles inputs of length 1, and you can refer to variables that you just created. Compared to `data.frame()`, `tibble()` does much less: it never changes the type of the inputs (e.g. it never converts strings to factors!), it never changes the names of variables, and it never creates row names.
 
+It's possible for a tibble to have column names that are not valid R variable names, known as __non-syntactic__ names. For example, they might not start with a letter, or they might contain unusual characters like a space. To refer to these variables, you need to surround them with backticks, `` ` ``:
+
+```{r}
+tb <- tibble(
+  `:)` = "smile", 
+  ` ` = "space",
+  `2000` = "number"
+)
+tb
+```
+
 Another way to create a tibble is with `frame_data()`, which is customised for data entry in R code. Column headings are defined by formulas (`~`), and entries are separated by commas:
 
 ```{r}
@@ -48,6 +59,21 @@ frame_data(
 
 1.  What does `enframe()` do? When might you use it?
 
+1.  Practice referring to non-syntactic names by:
+
+    1.  Plotting a scatterplot of `1` vs `2`.
+
+    1.  Creating a new column called `3` which is `2` divided by `1`.
+
+    1.  Renaming the columns to `one`, `two` and `three`. 
+
+    ```{r}
+    annoying <- tibble(
+      `1` = 1:10,
+      `2` = `1` * 2 + rnorm(length(`1`))
+    )
+    ```
+
 ## Tibbles vs. data frames
 
 There are two main differences in the usage of a data frame vs a tibble: printing, and subsetting.
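
Note (not part of the commit): the backtick quoting that the `tibble.Rmd` hunks above introduce can be sketched as follows. This is a minimal illustration that reuses the `tb` example from the diff; the variable names `smile` and `number` are hypothetical, and it assumes a version of the tibble package that keeps non-syntactic names unchanged.

```r
library(tibble)

# Same tibble as in the added chunk: three non-syntactic column names.
tb <- tibble(
  `:)` = "smile",
  ` ` = "space",
  `2000` = "number"
)

# With `$`, the column name must be wrapped in backticks.
smile <- tb$`:)`

# With `[[`, the name is passed as an ordinary string, so no backticks are needed.
number <- tb[["2000"]]

# The names are stored exactly as written, with no repair applied.
names(tb)
```

Backticks are only needed where R would otherwise parse the name as code; anywhere a name is supplied as a string, the plain string works.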
@@ -84,6 +110,12 @@ nycflights13::flights %>%
 
 You can see a complete list of options by looking at the package help: `package?tibble`.
 
+Remember, you can also get a nicer view of the data set using RStudio's built-in data viewer. This is often useful at the end of a long chain of manipulations.
+
+```{r, eval = FALSE}
+nycflights13::flights %>% View()
+```
+
 ### Subsetting
 
 Tibbles are stricter about subsetting. If you try to access a variable that does not exist, you'll get a warning. Unlike data frames, tibbles do not use partial matching on column names:
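
Note (not part of the commit): the partial-matching difference described in the context line above can be sketched as below. The column name `abc` and the objects `df` and `tb` are hypothetical, chosen only for illustration; the exact warning text depends on the installed tibble version.

```r
library(tibble)

# A data frame and a tibble holding the same single column.
df <- data.frame(abc = 1:3)
tb <- as_tibble(df)

# Data frames partially match `$` names, so `a` silently resolves to `abc`.
df$a

# Tibbles do not partially match: this returns NULL and raises a warning
# about an unknown column.
tb$a

# The full name works the same way on both.
tb$abc
```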