# (PART) Toolbox {-}

# Introduction {#toolbox .unnumbered}

The layered structure of ggplot2 encourages you to design and construct graphics in a structured manner. You've learned the basics in the previous chapter, and in this chapter you'll get a more comprehensive task-based introduction. The goal here is not to exhaustively explore every option of every geom, but instead to show the most important tools for a given task. For more information about individual geoms, along with many more examples illustrating their use, see the documentation.

It is useful to think about the purpose of each layer before it is added. In general, there are three purposes for a layer: \index{Layers!strategy}

* To display the __data__. We plot the raw data for many reasons, relying on
  our skills at pattern detection to spot gross structure, local structure, and
  outliers. This layer appears on virtually every graphic. In the earliest
  stages of data exploration, it is often the only layer.

* To display a statistical __summary__ of the data. As we develop and explore
  models of the data, it is useful to display model predictions in the context
  of the data. Showing the data helps us improve the model, and showing the
  model helps reveal subtleties of the data that we might otherwise miss.
  Summaries are usually drawn on top of the data.

* To add additional __metadata__: context, annotations, and references. A
  metadata layer displays background context, annotations that help to give
  meaning to the raw data, or fixed references that aid comparisons across
  panels. Metadata can be useful in the background and foreground.

    A map is often used as a background layer with spatial data. Background
    metadata should be rendered so that it doesn't interfere with your
    perception of the data, so is usually displayed underneath the data and
    formatted so that it is minimally perceptible. That is, if you concentrate
    on it, you can see it with ease, but it doesn't jump out at you when you
    are casually browsing the plot.

    Other metadata is used to highlight important features of the data. If you
    have added explanatory labels to a couple of inflection points or
    outliers, then you want to render them so that they pop out at the
    viewer. In that case, you want this to be the very last layer drawn.
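
As a rough illustration of these three purposes, the sketch below builds one plot with a layer of each kind. It is only one possible arrangement, not a recommended recipe: it assumes the `mpg` dataset that ships with ggplot2, and the choice of reference line (mean highway mileage) and of which point to label (the first car with the highest highway mileage) is arbitrary and made purely for the example.

```r
library(ggplot2)

# Assumption: mpg (bundled with ggplot2) stands in for your own data.
# Assumption: the single labelled point is picked arbitrarily for illustration.
best <- mpg[which.max(mpg$hwy), ]

p <- ggplot(mpg, aes(displ, hwy)) +
  # Background metadata: a faint fixed reference, drawn first so it sits
  # underneath everything else
  geom_hline(yintercept = mean(mpg$hwy), colour = "grey80") +
  # Data: the raw observations
  geom_point(colour = "grey30") +
  # Summary: a smooth drawn on top of the data
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE) +
  # Foreground metadata: a label drawn last so that it stands out
  geom_text(data = best, aes(label = model), vjust = -0.5)

p
```

Layers are drawn in the order they are added, so reordering the `geom_*()` calls changes which elements sit on top.
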

This chapter is broken up into the following sections, each of which deals with a particular graphical challenge. This is not an exhaustive or exclusive categorisation, and there are many other possible ways to break up graphics into different categories. Each geom can be used for many different purposes, especially if you are creative. However, this breakdown should cover many common tasks and help you learn about some of the possibilities.

* Basic plot types that produce common, 'named' graphics like scatterplots and
  line charts, Section \@ref(basics).

* Displaying text, Section \@ref(labelling).

* Adding arbitrary additional annotations, Chapter \@ref(annotations).

* Surface plots to display 3d surfaces in 2d, Section \@ref(surface).

* Drawing maps, Section \@ref(maps).

* Revealing uncertainty and error, with various 1d and 2d intervals,
  Section \@ref(uncertainty).

* Weighted data, Section \@ref(weighting).

* In Section \@ref(diamonds), you'll learn about the diamonds dataset.

The final three sections use this data to discuss techniques for visualising larger datasets:

* Displaying distributions, continuous and discrete, 1d and 2d, joint and
  conditional, Section \@ref(distributions).

* Dealing with overplotting in scatterplots, a challenge with large datasets,
  Section \@ref(overplotting).

* Displaying statistical summaries instead of the raw data,
  Section \@ref(summary).