---
title: "Plotting lmer/glmer using the effects package"
date: "22 May 2019"
author: "Dan Villarreal"
output:
html_document:
number_sections: yes
toc: yes
toc_float: yes
df_print: paged
---
```{r}
knitr::opts_chunk$set(comment=NA)
```
# Introduction
The [`effects` package](https://cran.r-project.org/package=effects) includes tools for constructing and plotting fixed effects of a large variety of models, including mixed-effects linear regression models (generated by `lmer()` from the `lme4` package) and logistic regression models (`glmer()`). I find that visualizing effects is a *much* better and easier way to understand a model than squinting at a summary table, especially when interactions are involved.
In this short guide, we'll walk through how to use the `effects` package (including how to deal with some finicky features of `effects`):
1. How to run your model so `effects` won't complain
1. How to construct an effects dataframe
1. How to plot effects in `ggplot2`
To demonstrate these functions, we'll use some data from Lynn's and my research on [vowel priming in QuakeBox](https://www.academia.edu/36925969/), which asked whether there was evidence of priming in phonetic variation across the NZE short front vowel shift. The `PrimingVowels.Rds` file in the same folder as this RMarkdown file contains data that we can use to re-create a version of our final model. Let's read this data into our R environment:
```{r}
primingVowels <- readRDS("PrimingVowels.Rds")
```
Briefly, here's what's in this data:

* information on speakers (including social characteristics) and words
* two "Shift Index" columns that store the degree of shiftedness of the target and prime
* the target and prime vowel categories
* target and prime durations
* the time difference between prime and target
* whether or not prime and target are different instances of the same word
```{r}
summary(primingVowels)
```
Let's also load some packages: `tidyverse`, a suite of packages that makes R code faster, more readable, and safer; `lmerTest`, which imports `lme4` and adds *p*-values to `lmer()` summaries; and the `effects` package itself.
```{r}
pkgTest <- function(x) {
  if (!require(x, character.only=TRUE)) {
    install.packages(x)
    if (!require(x, character.only=TRUE)) stop(paste("Package", x, "not found"))
  }
}
pkgTest("tidyverse")
pkgTest("lmerTest")
pkgTest("effects")
```
# How to run your model
`effects` is a little finicky in its requirements for the formatting of the models it runs. In particular:
* `effects` only accepts numeric and factor predictors in fixed effects; if any fixed effects aren't numeric or factor, it throws an error and won't proceed. This would be a problem for our `PrimeTargetSameWord` predictor, which has the type "logical".
* For boring internal reasons, it fails when models are run using data that's piped into `(g)lmer()`, rather than spelled out in the `data` argument to `(g)lmer()`.
Fortunately, both of these requirements are easy to satisfy. For the first, we'll use the `mutate_if()` function from `dplyr` (part of `tidyverse`) to turn any character or logical columns into factors. Here's what that looks like:
```{r}
primingVowels %>%
mutate_if(is.character, factor) %>%
mutate_if(is.logical, factor)
```
To satisfy the second requirement, we specify the data in the `data` argument of `(g)lmer()`. So rather than using syntax like the following:
```
Some-data %>%
lmer(Some-formula, data=.)
```
We'll use syntax like the following:
```
lmer(Some-formula, data=Some-data)
```
(The first version might look ugly and unnecessary if our data are simple, but if we're doing a bunch of manipulations to our data in a long pipe chain, it's a lot clearer to have that pipe chain outside the `(g)lmer()` call rather than within it.)
Let's put this together and run our `lmer()` model. Our actual model was more complicated than this one, but we'll use a somewhat pared-down version to keep things simpler (for that reason, some of the plots will look different from what we've presented in the past). This model took about 90 seconds to run on my laptop:
```{r}
primingMod <-
  lmer(TargetShiftIndex ~ PrimeShiftIndex * Target * Prime *
         (log(PrimeTargetTimeDiff) + log(TargetVowelDur) + log(PrimeVowelDur) +
            AgeGroup + Gender) +
         PrimeShiftIndex * Target * PrimeTargetSameWord +
         (1 | Speaker) + (1 | Word),
       data=primingVowels %>%
         mutate_if(is.character, factor) %>%
         mutate_if(is.logical, factor),
       REML=FALSE)
```
Let's take a look at the model summary:
```{r}
primingMod %>% summary()
```
Wow, what a mess! We've got 150 different fixed-effects terms to try to interpret, including 4-way interactions, all while trying to hold in our head the fact that KIT is the reference level for `Target` and `Prime` and figuring out what that means. Clearly, glancing over this summary table on its own is no way to do analysis. Instead, let's build some effects plots.
# How to construct an effects dataframe
Now that we've got our model, we'll do two things to it: construct fitted model effects, and plot these effects. Both of these can be accomplished via functions in `effects`, but I prefer to take the fitted model effects constructed by `effects` and use the flexible and powerful plotting capabilities of `ggplot2` instead.
This is pretty straightforward: we pipe the model to the function `Effect()`, make a data frame out of that, and then we've got something that works nicely with `ggplot2`. (FYI, the `effects` package provides a few functions for constructing effects, differing only in their argument syntax; I find `Effect()`'s argument syntax easiest to deal with, but your mileage may vary.)
So if we want to plot a main effect of `PrimeShiftIndex`, we pipe our priming model to `Effect()`, then pipe that to `as.data.frame()`. The result is a data frame of fitted effects; at different values of `PrimeShiftIndex`, `fit` is the fitted mean of the dependent variable (here, `TargetShiftIndex`), `se` is the standard error of the fit, and `lower`/`upper` are the bounds of the 95% confidence interval (FYI, you can adjust the confidence level via the `se` argument to `Effect()`).
```{r}
primingMod %>%
Effect(c("PrimeShiftIndex"), .) %>%
as.data.frame()
```
If we want to plot an interaction effect, we simply pass a vector of multiple predictors to the first argument of `Effect()`, like so. The following dataframe has fitted effects at different values of `PrimeShiftIndex`, `Target`, and `PrimeTargetSameWord`:
```{r}
primingMod %>%
Effect(c("PrimeShiftIndex", "Target", "PrimeTargetSameWord"), .) %>%
as.data.frame()
```
One minor wrinkle is that the `effects` method for turning the output of `Effect()` into a data frame doesn't respect the ordering of factor levels, instead rearranging them alphabetically. (I find this a bit bizarre given that `effects` forces you to use factors rather than character predictors.) This is evident if you look at the summary of the dataframe we just created; note that the levels of `Target` have been rearranged to DRESS, then KIT, then TRAP:
```{r}
primingMod %>%
Effect(c("PrimeShiftIndex", "Target", "PrimeTargetSameWord"), .) %>%
as.data.frame() %>%
summary()
```
We probably wanted our original factor levels in the order they were in for a reason---in this case, the original factor level order reflected the historical order of changes in NZE short front shift---and we'll want our plots to reflect this order, as well. To fix this behavior, I've created a new function, `df.eff()`, that tweaks the code for `effects:::as.data.frame.eff()` to respect the original ordering of factor levels:
```{r}
df.eff <- function(x, row.names=NULL, optional=TRUE,
                   transform=x$transformation$inverse, ...) {
  if (!inherits(x, "eff")) stop("x must be of class eff")
  xx <- x$x
  for (var in names(xx)) {
    if (is.factor(xx[[var]])) {
      ## Re-level each factor in the order its levels first appear,
      ## rather than letting as.data.frame() re-sort them alphabetically
      xx[[var]] <- addNA(factor(xx[[var]], levels=unique(xx[[var]])))
    }
  }
  x$x <- xx
  result <- if (is.null(x$se)) {
    data.frame(x$x, fit=transform(x$fit))
  } else {
    data.frame(x$x, fit=transform(x$fit), se=x$se,
               lower=transform(x$lower), upper=transform(x$upper))
  }
  attr(result, "transformation") <- transform
  result
}
```
As you can see, `df.eff()` preserves the original order of `Target`:
```{r}
primingMod %>%
Effect(c("PrimeShiftIndex", "Target", "PrimeTargetSameWord"), .) %>%
df.eff() %>%
summary()
```
Now we've got something that can be passed to `ggplot2`.
# How to plot effects in `ggplot2`
If you know how to use `ggplot2`, it's straightforward to turn these dataframes into nice ggplots---and if you don't know how to use `ggplot2`, there are [heaps](https://r4ds.had.co.nz/data-visualisation.html) of [great](https://jofrhwld.github.io/avml2012/) [resources](http://joeystanley.com/blog/making-vowel-plots-in-r-part-1) out there.
Recall that `ggplot2` is all about mapping columns in your dataframe to *aesthetics* of your plot. So what should we map to what? First, `fit` is the fitted mean for that combination of factor levels and/or numeric values, so we'll map it to the `y` aesthetic (for `geom_line()`, `geom_point()`, etc.). Since `lower` and `upper` are 95% confidence interval bounds, we'll map these to the `ymin` and `ymax` aesthetics (for `geom_errorbar()`, `geom_ribbon()`, etc.). And depending on the predictors and whether or not you've got main effects or interaction effects, you might want to map some predictors to the `x` aesthetic, others to the `group` aesthetic, and others still as faceting variables.
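To make those mappings concrete, here's a minimal sketch with a toy dataframe (the values and the column `x` are made up for illustration; `fit`, `lower`, and `upper` mirror the columns an effects dataframe provides):

```{r}
library(ggplot2)

## Toy stand-in for an effects dataframe (made-up values);
## fit/lower/upper mirror the columns that Effect() produces
eff_df <- data.frame(
  x     = 1:5,
  fit   = c(0.10, 0.18, 0.25, 0.31, 0.42),
  lower = c(0.02, 0.10, 0.17, 0.22, 0.30),
  upper = c(0.18, 0.26, 0.33, 0.40, 0.54)
)

## fit -> y; lower/upper -> ymin/ymax
ggplot(eff_df, aes(x=x, y=fit, ymin=lower, ymax=upper)) +
  geom_ribbon(fill="grey85") +  # confidence band
  geom_line()                   # fitted means
```

The same skeleton underlies every plot below; only the geoms and the predictor-to-aesthetic assignments change.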
## Main effect, continuous predictor
Let's apply these to the simple case of the main effect of `PrimeShiftIndex`, a continuous predictor. Here's what that looks like if we just plot a line of the fitted means:
```{r}
primingMod %>%
Effect(c("PrimeShiftIndex"), .) %>%
df.eff() %>%
ggplot(aes(x=PrimeShiftIndex, y=fit)) +
geom_line()
```
Okay, that's a start, but we can do better. To name just two issues, "fit" isn't a very informative y-axis title, and the y-axis limits hug our line too tightly. Fortunately, since we're using `ggplot2`, we can use the same functions for this plot as we would for any other ggplot.
```{r}
ylabel <- "Target shift index (model prediction)"
ylim1 <- c(-1, 1)
primingMod %>%
Effect(c("PrimeShiftIndex"), .) %>%
df.eff() %>%
ggplot(aes(x=PrimeShiftIndex, y=fit)) +
geom_line() +
scale_y_continuous(ylabel, limits=ylim1)
```
Now, if we try to add confidence bands with `geom_ribbon()`, we find some issues with the default settings:
```{r}
primingMod %>%
Effect(c("PrimeShiftIndex"), .) %>%
df.eff() %>%
ggplot(aes(x=PrimeShiftIndex, y=fit, ymin=lower, ymax=upper)) +
geom_line() +
geom_ribbon() +
scale_y_continuous(ylabel, limits=ylim1)
```
The ribbon defaults to the same color and shade as the line, so we can't tell the line from the ribbon! To deal with this, let's pass some nice color and shading values to the line and ribbon:
```{r}
lineCol <- rgb(0, 128, 255, maxColorValue=255)
ribbonFill <- rgb(217, 236, 255, maxColorValue=255)
ribbonAlpha <- 0.5
primingMod %>%
Effect(c("PrimeShiftIndex"), .) %>%
df.eff() %>%
ggplot(aes(x=PrimeShiftIndex, y=fit, ymin=lower, ymax=upper)) +
geom_line(color=lineCol) +
geom_ribbon(fill=ribbonFill, alpha=ribbonAlpha) +
scale_y_continuous(ylabel, limits=ylim1)
```
That's much better! Note that `ggplot2` will draw geoms in the order that you call them; in the previous plot, the ribbon covers up the line a bit, but if we call the ribbon first and then the line, the line stands out better:
```{r}
primingMod %>%
Effect(c("PrimeShiftIndex"), .) %>%
df.eff() %>%
ggplot(aes(x=PrimeShiftIndex, y=fit, ymin=lower, ymax=upper)) +
geom_ribbon(fill=ribbonFill, alpha=ribbonAlpha) +
geom_line(color=lineCol) +
scale_y_continuous(ylabel, limits=ylim1)
```
## Main effect, factor predictor
Here's another simple example: the main effect of `Gender`. Here, our `x` aesthetic is a factor rather than continuous, so instead of using `geom_line()` and `geom_ribbon()`, we'll use `geom_point()` and `geom_errorbar()`. To maintain consistency, we'll use the same y-axis title and limits as in the previous plot.
```{r}
primingMod %>%
Effect(c("Gender"), .) %>%
df.eff() %>%
ggplot(aes(x=Gender, y=fit, ymin=lower, ymax=upper)) +
geom_point() +
geom_errorbar() +
scale_y_continuous(ylabel, limits=ylim1)
```
## Interactions
Let's make things a little more complicated by considering a three-way interaction between `PrimeShiftIndex`, `Target`, and `PrimeTargetSameWord`. The challenge here is how to most clearly convey the combined effects of one numeric predictor (`PrimeShiftIndex`) and two factors (`Target` and `PrimeTargetSameWord`) on our dependent variable. The truth is, there's not necessarily one right way to do it; what's best for you will depend on your research questions, your results, and the effects you want to highlight.
In this case, since we are most interested in the effect of `PrimeShiftIndex`, I decided to consistently map it to the `x` aesthetic in every plot where it was involved. Since our hypothesis was that `PrimeTargetSameWord` would mediate the effect of `PrimeShiftIndex`, it's clearest to assess that by overlaying lines that differ only by `PrimeTargetSameWord`---in other words, mapping `PrimeTargetSameWord` to the `color` aesthetic. In plots with overlaid lines, I excluded the confidence ribbons to avoid busying the plot. And `Target`, as a factor, is a good candidate for a faceting variable (i.e., one that defines sub-plots); in the call to `facet_grid()`, the formula `. ~ Target` means "don't create row facets; create column facets on the variable `Target`", and `labeller=label_both` means "create labels like 'Target: TRAP' rather than just 'TRAP'".
Here's how that looks. The main takeaway from this plot is "across Target vowel categories, the repetition effect was stronger when prime and target were the same word".
```{r}
primingMod %>%
Effect(c("PrimeShiftIndex", "Target", "PrimeTargetSameWord"), .) %>%
df.eff() %>%
ggplot(aes(x=PrimeShiftIndex, y=fit, color=PrimeTargetSameWord)) +
geom_line() +
facet_grid(. ~ Target, labeller=label_both) +
scale_y_continuous(ylabel, limits=ylim1) +
scale_color_discrete("Prime target\nsame word")
```
Here's another example with two factor predictors: `Gender` and `Target`. Since we've just used `Target` for faceting, we could do the same here and map `Gender` to `x`.
```{r}
primingMod %>%
Effect(c("Gender", "Target"), .) %>%
df.eff() %>%
ggplot(aes(x=Gender, y=fit, ymin=lower, ymax=upper)) +
geom_point() +
geom_errorbar() +
facet_grid(. ~ Target, labeller=label_both) +
scale_y_continuous(ylabel, limits=ylim1)
```
But this plot hides the key insight that men lag women to a consistent degree with respect to the shifted vowels. To make this trend jump off the page better, I've mapped `Target` to the `x` aesthetic and `Gender` to `group` and `color`, adding a `geom_line()` to make the comparison between adjacent vowels more evident. Mapping `Gender` to `group` and `color` also pulls the points and errorbars more tightly together by `Gender`, emphasizing that `Gender`, not `Target`, is the salient comparison here:
```{r}
primingMod %>%
Effect(c("Gender", "Target"), .) %>%
df.eff() %>%
ggplot(aes(x=Target, y=fit, ymin=lower, ymax=upper, group=Gender, color=Gender)) +
geom_point(position=position_dodge(width=0.6)) +
geom_line(position=position_dodge(width=0.6)) +
geom_errorbar(width=0.6, position="dodge") +
scale_y_continuous(ylabel, limits=ylim1)
```
## Adjusting `xlevels`
If you have continuous predictors modeled as nonlinear (e.g., via a log-transformation), they can show up as non-smooth line segments in the effects plot. This is because for each numeric predictor, by default `Effect()` samples "five values equally spaced over its range and then rounded to 'nice' numbers"; for non-linear predictors, sampling over five values makes for bumpy lines. This is evident in the plot of the interaction of `Target` and `log(TargetVowelDur)` below.
```{r}
primingMod %>%
Effect(c("Target", "TargetVowelDur"), .) %>%
df.eff() %>%
ggplot(aes(x=TargetVowelDur, y=fit, ymin=lower, ymax=upper)) +
geom_ribbon(fill=ribbonFill, alpha=ribbonAlpha) +
geom_line(color=lineCol) +
scale_y_continuous(ylabel, limits=c(-1, 2.5)) +
facet_grid(. ~ Target, labeller=label_both)
```
To smooth out these lines, we can ratchet up the resolution of the `log(TargetVowelDur)` sampling. We do this by specifying the `xlevels` argument of `Effect()` as a list with one element, named `TargetVowelDur`, whose value is the (arbitrarily large) number of sampling points. This results in much smoother lines:
```{r}
primingMod %>%
Effect(c("Target", "TargetVowelDur"), ., xlevels=list(TargetVowelDur=1000)) %>%
df.eff() %>%
ggplot(aes(x=TargetVowelDur, y=fit, ymin=lower, ymax=upper)) +
geom_ribbon(fill=ribbonFill, alpha=ribbonAlpha) +
geom_line(color=lineCol) +
scale_y_continuous(ylabel, limits=c(-1, 2.5)) +
facet_grid(. ~ Target, labeller=label_both)
```
Another spot where `xlevels` comes in handy is overriding the default 'nice' values that `Effect()` produces for continuous predictors, in particular when you want to treat these continuous predictors as quasi-factors (that is, showing how effects differ at different values of continuous predictors). As an example, let's explore the interaction of `PrimeShiftIndex` and `PrimeTargetTimeDiff`, since our hypothesis is that `PrimeTargetTimeDiff` mediates the effect of `PrimeShiftIndex`, with longer delays between prime and target leading to less potent repetition effects. A few plots ago, we explored a similar example of `PrimeTargetSameWord` mediating the effect of `PrimeShiftIndex` by comparing slopes at different levels of `PrimeTargetSameWord` (mapping `PrimeTargetSameWord` to `color`), so we'll do the same with `PrimeTargetTimeDiff`. Here's what it looks like when we use the defaults (I've zoomed in on the y-axis to make the differences between the next few plots clearer).
```{r}
primingMod %>%
Effect(c("PrimeShiftIndex", "PrimeTargetTimeDiff"),
.) %>%
df.eff() %>%
ggplot(aes(x=PrimeShiftIndex, y=fit, group=PrimeTargetTimeDiff, color=PrimeTargetTimeDiff)) +
geom_line() +
scale_y_continuous(ylabel, limits=c(-0.1, 0.1)) +
scale_color_continuous(guide="legend")
primingMod %>%
Effect(c("PrimeShiftIndex", "PrimeTargetTimeDiff"),
.) %>%
df.eff() %>%
ggplot(aes(x=PrimeShiftIndex, y=fit, group=PrimeTargetTimeDiff, color=PrimeTargetTimeDiff)) +
geom_line() +
scale_y_continuous(ylabel, limits=c(-0.1, 0.1)) +
scale_color_continuous(guide="colorbar")
```
This plot shows the effect of `PrimeShiftIndex` getting weaker (shallower slope) as `PrimeTargetTimeDiff` increases, with a shallower slope at 5.0 seconds of delay than 2.5 seconds of delay, and a shallower slope still at 7.5 seconds of delay. (Ignore for now the fact that there are 5 lines in the plot but only 4 in the legend.) The problem is that these values aren't representative of the `PrimeTargetTimeDiff` distribution in our data, which has a median of 0.8 and a Q3 of 1.7. The lines at 2.5, 5.0, and 7.5 seconds represent the 86th, 97th, and 99.6th percentiles of `PrimeTargetTimeDiff`.
```{r}
primingVowels$PrimeTargetTimeDiff %>% summary()
mean(primingVowels$PrimeTargetTimeDiff < 2.5)
mean(primingVowels$PrimeTargetTimeDiff < 5)
mean(primingVowels$PrimeTargetTimeDiff < 7.5)
```
In my opinion, the effects should sample `PrimeTargetTimeDiff` at values representative of its distribution, so the reader can get a better sense of the shape of the data. Here's how we do that. First, pass the five-number summary of `PrimeTargetTimeDiff` to the `xlevels` argument of `Effect()`. As you can see below, this produces an effects dataframe where `PrimeTargetTimeDiff` is sampled at more appropriate values:
```{r}
primingMod %>%
Effect(c("PrimeShiftIndex", "PrimeTargetTimeDiff"),
.,
xlevels=list(PrimeTargetTimeDiff = primingVowels$PrimeTargetTimeDiff %>% quantile(0:4/4))) %>%
df.eff()
primingMod %>%
Effect(c("PrimeShiftIndex", "PrimeTargetTimeDiff"),
.,
xlevels=list(PrimeTargetTimeDiff = primingVowels$PrimeTargetTimeDiff %>% quantile(0:4/4))) %>%
df.eff() %>%
ggplot(aes(x=PrimeShiftIndex, y=fit, group=PrimeTargetTimeDiff, color=PrimeTargetTimeDiff)) +
geom_line() +
scale_y_continuous(ylabel, limits=c(-0.1, 0.1)) +
scale_color_continuous(guide="legend")
```
To remedy that situation, here's what we'll do. First, get a vector of breaks for the legend by pulling the unique values out of the effects dataframe. (For some inscrutable reason, defining `brks` as `primingVowels$PrimeTargetTimeDiff %>% quantile(0:4/4) %>% unname()` leads to different results in the plot.)
```{r}
brks <-
primingMod %>%
Effect(c("PrimeShiftIndex", "PrimeTargetTimeDiff"),
.,
xlevels=list(PrimeTargetTimeDiff = primingVowels$PrimeTargetTimeDiff %>% quantile(0:4/4))) %>%
df.eff() %>%
pull(PrimeTargetTimeDiff) %>%
unique() %>%
sort()
brks
```
Then pass this vector of breaks to the `labels` and `breaks` arguments of `scale_color_continuous`.
```{r}
primingMod %>%
Effect(c("PrimeShiftIndex", "PrimeTargetTimeDiff"),
.,
xlevels=list(PrimeTargetTimeDiff = primingVowels$PrimeTargetTimeDiff %>% quantile(0:4/4))) %>%
df.eff() %>%
ggplot(aes(x=PrimeShiftIndex, y=fit, group=PrimeTargetTimeDiff, color=PrimeTargetTimeDiff)) +
geom_line() +
scale_y_continuous(ylabel, limits=c(-0.1, 0.1)) +
scale_color_continuous(guide="legend", labels=brks, breaks=brks)
```
The labels show up with stray floating-point decimals at the end, so explicitly round the `labels` argument to 3 decimal places:
```{r}
primingMod %>%
Effect(c("PrimeShiftIndex", "PrimeTargetTimeDiff"),
.,
xlevels=list(PrimeTargetTimeDiff = primingVowels$PrimeTargetTimeDiff %>% quantile(0:4/4))) %>%
df.eff() %>%
ggplot(aes(x=PrimeShiftIndex, y=fit, group=PrimeTargetTimeDiff, color=PrimeTargetTimeDiff)) +
geom_line() +
scale_y_continuous(ylabel, limits=c(-0.1, 0.1)) +
scale_color_continuous(guide="legend", labels=brks %>% round(3), breaks=brks)
```
Finally, the colors for the first 4 lines are indistinguishable, because the numbers themselves are bunched so closely together and `ggplot2` maps this numerical range to the color gradient linearly; that linear mapping is why the lines for 2.5, 5.0, 7.5, and 10 had nicely spaced-out hues in the earlier plot. To make these colors easier to tell apart, we can pass a `trans="log"` argument to `scale_color_continuous()`, which maps `PrimeTargetTimeDiff` to the color gradient via a log mapping rather than a linear mapping. We also pass `brks + .Machine$double.eps` to the `breaks` argument, because otherwise the line for 0.02 goes away.
```{r}
primingMod %>%
Effect(c("PrimeShiftIndex", "PrimeTargetTimeDiff"),
.,
xlevels=list(PrimeTargetTimeDiff = primingVowels$PrimeTargetTimeDiff %>% quantile(0:4/4))) %>%
df.eff() %>%
ggplot(aes(x=PrimeShiftIndex, y=fit, group=PrimeTargetTimeDiff, color=PrimeTargetTimeDiff)) +
geom_line() +
scale_y_continuous(ylabel, limits=c(-0.1, 0.1)) +
scale_color_continuous(guide="legend", labels=brks %>% round(3),
breaks=brks + .Machine$double.eps, trans="log")
```
There you have it. This plot effectively conveys the finding that a longer delay between prime and target lessens the potency of the repetition effect, while remaining faithful to both the visual language we've built up across plots (if `PrimeShiftIndex` is in a plot, it always gets mapped to the `x` aesthetic) and the shape of the data (by not distorting the distribution of `PrimeTargetTimeDiff`).
## A four-way interaction?!?!
Yes, you can even plot four-way interactions that make sense! In our model, a lot of the lower-order effects are mediated by the interaction of the Target and Prime vowel categories---and this includes the interaction of `PrimeShiftIndex` and `PrimeTargetTimeDiff` that we just plotted. You can turn the previous plot into a four-way interaction plot by minimally changing the code (all I've changed is adding `Prime` and `Target` to the `Effect()` call, adding a `facet_grid()` call, and changing the `limits` in `scale_y_continuous()`):
```{r}
primingMod %>%
Effect(c("PrimeShiftIndex", "Prime", "Target", "PrimeTargetTimeDiff"),
.,
xlevels=list(PrimeTargetTimeDiff = primingVowels$PrimeTargetTimeDiff %>% quantile(0:4/4))) %>%
df.eff() %>%
ggplot(aes(x=PrimeShiftIndex, y=fit, group=PrimeTargetTimeDiff, color=PrimeTargetTimeDiff)) +
geom_line() +
facet_grid(Prime ~ Target, labeller=label_both) +
scale_y_continuous(ylabel, limits=c(-2, 2)) +
scale_color_continuous(guide="legend", labels=brks %>% round(3),
breaks=brks + .Machine$double.eps, trans="log")
```
# What about glmer?
The `effects` package works for mixed-effects models of binary response data run with `glmer()` too! And actually, it works for lots of different models:
```{r}
methods(Effect)
```
With `glmer()`, the syntax is the same. Let's say we want to run a model predicting the likelihood of DRESS (as opposed to TRAP) given the vowel's duration:
```{r}
dummyGlmer <- glmer(Target ~ TargetVowelDur + (1|Speaker),
                    data=primingVowels %>%
                      filter(!is.na(AgeGroup)) %>%
                      filter(Target %in% c("DRESS", "TRAP")) %>%
                      droplevels(),
                    family="binomial")
dummyGlmer %>% summary()
```
We pipe this model to `Effect()`, then to `df.eff()`, then to `ggplot()` the same way we did with `lmer()` models.
```{r}
dummyGlmer %>%
Effect(c("TargetVowelDur"), ., xlevels=list(TargetVowelDur=1000)) %>%
df.eff() %>%
ggplot(aes(x=TargetVowelDur, y=fit, ymin=lower, ymax=upper)) +
geom_ribbon(fill=ribbonFill, alpha=ribbonAlpha) +
geom_line(color=lineCol) +
scale_y_continuous("Probability of DRESS (model prediction)", limits=c(0, 1))
```
Note that `effects` automatically back-transforms the model's effects (which are reported in the summary in log-odds) into [0, 1] probability space.
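As a quick standalone illustration of that back-transformation (not tied to the model above): the inverse-logit function `plogis()` is what maps log-odds onto the [0, 1] probability scale:

```{r}
## Inverse-logit: log-odds -> probability
plogis(0)    # log-odds of 0 is probability 0.5
plogis(2)    # positive log-odds map above 0.5 (about 0.88)
plogis(-2)   # negative log-odds map below 0.5 (about 0.12)

## qlogis() (the logit) undoes plogis()
stopifnot(all.equal(qlogis(plogis(1.5)), 1.5))
```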
# Conclusion
Putting together effects plots that are easy to understand can be time-consuming, but I think it's an indispensable part of analyzing your data and communicating your findings. So while perfecting these plots is an art, hopefully with this guide you'll at least avoid any snags in implementing your vision!