---
layout: lesson
root: .
permalink: index.html
---

So you have a new data set. Before you dive into running models and tests, you need to inspect your data. John Tukey, a prominent statistician, coined the term "exploratory data analysis". Data exploration can inform a number of decisions:

  • what methods are appropriate to use on your data
  • whether the data satisfy certain modeling assumptions
  • whether the data need to be cleaned, reshaped, reduced, etc.

In this lesson, we begin with a messy version of the Gapminder data and explore it together. We will find some issues with the data and teach you how to correct them. After making the data tidy, you will be able to plot the variables in different ways and see patterns.
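As a rough illustration of what "finding issues" can look like, here is a minimal first-pass inspection using only the Python standard library. The table below is a toy stand-in with invented values, not the actual Gapminder file, and the column names (`country`, `year`, `pop`) and the helper `inspect` are assumptions made for this sketch; the lesson notebook may use different tools.

```python
import csv
import io
from collections import Counter

# Toy stand-in for a messy table. These rows are invented for illustration
# and are NOT taken from the Gapminder data.
RAW = """country,year,pop
Canada,2002,31.9
canada ,2007,
Mexico,2002,102.5
Mexico,2007,108.7
"""


def inspect(text):
    """Return (row count, missing values per column, raw spellings of 'country')."""
    rows = list(csv.DictReader(io.StringIO(text)))

    # Count empty cells per column: a quick signal that cleaning is needed.
    missing = Counter()
    for row in rows:
        for column, value in row.items():
            if value is None or value.strip() == "":
                missing[column] += 1

    # Count each raw spelling of the category column, so that variants such as
    # different capitalization or trailing spaces become visible.
    spellings = Counter(row["country"] for row in rows)
    return len(rows), dict(missing), dict(spellings)


n_rows, missing, spellings = inspect(RAW)
print(n_rows)
print(missing)
print(spellings)
```

Even on four rows, the counts surface two typical problems: an empty cell in `pop`, and the same country recorded under two spellings. Both need fixing before any summary or plot can be trusted.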

Prerequisites

Some experience with Python is helpful, but not strictly needed.
{: .prereq}

Syllabus

Data Exploration | Tidying, summarizing, and plotting data | Lesson narrative | Student notebook