This course provides a half-day introduction to visualisation in R and Python using the grammar of graphics approach. In the course, participants will learn:
- how the traditional approach to visualisation differs from the grammar of graphics approach
- what aesthetics and geoms are
- how the grammar of graphics allows for agile exploration of hierarchical datasets
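
To make the vocabulary above concrete, here is a minimal pure-Python sketch of the idea, not of any real plotting library. It treats a plot layer as data plus an aesthetic mapping plus a geom. The `build_layer` function, the `aes` dictionary and the rows are all hypothetical, made up for illustration; ggplot2 and its Python counterparts do the same job with far more machinery.

```python
from collections import defaultdict

# Made-up illustrative rows (NOT the course's dataset).
rows = [
    {"country": "A", "year": 2000, "rate": 10.0},
    {"country": "A", "year": 2001, "rate": 11.0},
    {"country": "B", "year": 2000, "rate": 5.0},
    {"country": "B", "year": 2001, "rate": 4.5},
]


def build_layer(data, aes, geom):
    """Hypothetical helper: resolve an aesthetic mapping into what a geom draws.

    `aes` maps aesthetic names ("x", "y", "colour") to column names.
    Rows that share a value in the "colour" column form one group.
    """
    groups = defaultdict(list)
    for row in data:
        key = row[aes["colour"]] if "colour" in aes else None
        groups[key].append((row[aes["x"]], row[aes["y"]]))
    return {"geom": geom, "groups": dict(groups)}


# Aesthetics: which column drives which visual property.
aes = {"x": "year", "y": "rate", "colour": "country"}

# Geoms: the same mapping can be drawn as lines, as points, or as both.
line_layer = build_layer(rows, aes, geom="line")
point_layer = build_layer(rows, aes, geom="point")

for name, points in line_layer["groups"].items():
    print(name, points)
```

Changing the `"colour"` entry to another column regroups the data without touching the rest of the specification. This is the kind of quick re-slicing that makes the approach convenient for hierarchical data.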
The course consists of a series of short lectures and a problem set. The lectures use country-level data on reported suicide rates to illustrate how the grammar of graphics approach works.
The problem set invites participants to visualise gapminder data. Answers to this problem set (written in Python) are available here.
Note that this course is not meant to be a comprehensive introduction to ggplot2 (see references below).
The course doesn't really have any prerequisites! I give two really simple "base R" plots in the lecture, but these types of plots are very similar in other languages, like Python or Matlab. All that you need is an interest in plotting data.
This course is not meant to be a comprehensive introduction to the grammar of graphics or to ggplot2. Rather, it is meant to illustrate the benefits of this approach. As such, I recommend the following references.
The best reference for this course is the ggplot2 book by Hadley Wickham, Danielle Navarro, and Thomas Lin Pedersen, which is freely available online. The whole book is great (and not that long), but I'd especially recommend the following chapters for someone new to ggplot2:
- introduction
- first steps
- individual and collective geoms
- statistical uncertainties
- scales
- faceting
- themes
Another book I'd recommend, more generally, is The Visual Display of Quantitative Information by Edward Tufte. It's a beautifully crafted book with lots of excellent visualisations. My one issue with it is that, in my view, it occasionally advocates for style over substance, resulting in visualisations that look great but perhaps don't provide as much insight as they could.