Analysis of ride-sharing data using Matplotlib and Pandas

mileslucey/pyber

Pyber

Summary

  • A homework assignment for UC Berkeley's Data Analytics Bootcamp
  • The analysis examines data from a hypothetical ride-sharing application (Pyber) using Pandas and Matplotlib
  • The following conclusions can be reached from the analysis:
    • About two-thirds of all rides and of total fare value come from urban riders
    • Rural rides tend to have a high average fare, but rural riders make up only a small portion of all riders
    • Rural and suburban fares together account for almost 40% of total fares, yet rural and suburban drivers make up less than 20% of all drivers
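
The kind of calculation behind these conclusions can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the DataFrame, its column names ("type", "fare"), and all values are made-up assumptions chosen only to show how shares and averages by city type might be computed with Pandas.

```python
import pandas as pd

# Hypothetical per-ride records. The column names and values are
# illustrative assumptions, not the contents of the real data files.
rides = pd.DataFrame({
    "type": ["Urban", "Urban", "Urban", "Suburban", "Suburban", "Rural"],
    "fare": [20.0, 25.0, 15.0, 30.0, 30.0, 40.0],
})


def percent_share(series: pd.Series) -> pd.Series:
    """Express each value as a percentage of the series total."""
    return 100 * series / series.sum()


# Share of total fare value contributed by each city type
fare_share = percent_share(rides.groupby("type")["fare"].sum())

# Share of the total ride count contributed by each city type
ride_share = percent_share(rides.groupby("type")["fare"].count())

# Average fare per ride within each city type
avg_fare = rides.groupby("type")["fare"].mean()

print(fare_share.round(1))
print(ride_share.round(1))
print(avg_fare)
```

Comparing `fare_share` with `ride_share` and `avg_fare` is one way to see a pattern like the one described above, where a city type with few rides can still have a high average fare.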

Files

  • The "pyber.ipynb" Jupyter Notebook uses Matplotlib to conduct a graphical analysis of the ride-sharing application data
  • The "city_data.csv" and "ride_data.csv" files in the "data" folder are the data files used in the Jupyter Notebook analysis
  • All the PNG files in the "output" folder are the graphs created in the Jupyter Notebook
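
As a rough sketch of how the two data files might be combined before plotting, the example below reads two small inline stand-ins for "city_data.csv" and "ride_data.csv" and joins them on a shared city column. The column names (`city`, `driver_count`, `type`, `fare`, `ride_id`), the sample rows, and the helper `load_pyber` are assumptions made for illustration; the real files and notebook may differ.

```python
import io

import pandas as pd

# Tiny inline stand-ins for the two CSV files. Column names and rows are
# assumptions for illustration only.
CITY_CSV = """city,driver_count,type
Alphaville,10,Urban
Betatown,4,Suburban
"""

RIDE_CSV = """city,fare,ride_id
Alphaville,12.5,1
Alphaville,17.5,2
Betatown,30.0,3
"""


def load_pyber(city_source, ride_source) -> pd.DataFrame:
    """Read both tables and attach each ride's city attributes (hypothetical helper)."""
    cities = pd.read_csv(city_source)
    rides = pd.read_csv(ride_source)
    # A left join keeps every ride; validate raises if a city appears
    # more than once in the city table.
    return rides.merge(cities, on="city", how="left", validate="many_to_one")


merged = load_pyber(io.StringIO(CITY_CSV), io.StringIO(RIDE_CSV))
print(merged)
```

With files on disk, the same function could be called with paths such as `"data/city_data.csv"` and `"data/ride_data.csv"` in place of the `StringIO` objects, assuming the column used for the join is present in both files.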

Specific Graphs Created

Matplotlib is used to create the following graphs:

  • One bubble plot showing the relationship between the following variables:
    • Average Fare ($) Per City
    • Total Number of Rides Per City
    • Total Number of Drivers Per City
    • City Type (Urban, Suburban, Rural)
  • Three pie charts showing the following:
    • % of Total Fares by City Type
    • % of Total Rides by City Type
    • % of Total Drivers by City Type
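
The two chart types listed above can be sketched with Matplotlib roughly as follows. This is a minimal example under stated assumptions, not the notebook's code: the per-city summary table, its column names, the color choices, and the bubble scaling factor are all hypothetical.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical per-city summary; names and values are illustrative only.
summary = pd.DataFrame({
    "city": ["Alphaville", "Betatown", "Gammaburg"],
    "type": ["Urban", "Suburban", "Rural"],
    "rides": [30, 15, 5],
    "avg_fare": [22.0, 31.0, 38.0],
    "drivers": [40, 12, 3],
})
COLORS = {"Urban": "lightcoral", "Suburban": "lightskyblue", "Rural": "gold"}

fig, (bubble_ax, pie_ax) = plt.subplots(1, 2, figsize=(11, 4.5))

# Bubble plot: x = rides per city, y = average fare per city,
# marker area = driver count (scaled by 10), color = city type.
for city_type, group in summary.groupby("type"):
    bubble_ax.scatter(
        group["rides"],
        group["avg_fare"],
        s=group["drivers"] * 10,
        color=COLORS[city_type],
        edgecolors="black",
        alpha=0.7,
        label=city_type,
    )
bubble_ax.set_xlabel("Total Number of Rides (Per City)")
bubble_ax.set_ylabel("Average Fare ($)")
bubble_ax.set_title("Pyber Ride Sharing Data (sketch)")
bubble_ax.legend(title="City Type")

# Pie chart: share of total drivers by city type.
drivers_by_type = summary.groupby("type")["drivers"].sum()
pie_ax.pie(
    drivers_by_type.to_numpy(),
    labels=list(drivers_by_type.index),
    colors=[COLORS[t] for t in drivers_by_type.index],
    autopct="%1.1f%%",
    startangle=90,
)
pie_ax.set_title("% of Total Drivers by City Type")

fig.tight_layout()

# Render to an in-memory PNG; a file path could be passed to savefig instead.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

Plotting one `scatter` call per city type is a simple way to get a legend entry for each type; the fares and rides pie charts would follow the same pattern with a different aggregated column.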