The dataset used for the investigation contains information about bike trips, such as trip duration, user type, age, gender, and start and end stations. The dataset shows bike usage patterns and user characteristics, which will be studied in order to gain insights into user behavior and factors impacting trip duration.
This project uses univariate, bivariate, and multivariate exploration to identify patterns, correlations, and trends in the dataset. The findings will be used to create explanatory visualizations and to provide insights into how bike-sharing systems can be improved.
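As a rough illustration of the three levels of exploration, the sketch below computes one summary of each kind with pandas. The sample rows are made up for the example, and the column names (`duration_min`, `user_type`, `weekday`) are assumptions rather than the dataset's actual schema.

```python
import pandas as pd

# Made-up sample rows; the real dataset and its column names may differ.
trips = pd.DataFrame({
    "duration_min": [5.0, 8.0, 12.0, 7.0, 30.0, 45.0],
    "user_type": ["Subscriber", "Subscriber", "Subscriber",
                  "Subscriber", "Customer", "Customer"],
    "weekday": ["Tue", "Thu", "Tue", "Sat", "Sat", "Thu"],
})

# Univariate: look at one variable at a time.
duration_summary = trips["duration_min"].describe()
user_counts = trips["user_type"].value_counts()

# Bivariate: relate two variables (trip duration by user type).
median_by_user = trips.groupby("user_type")["duration_min"].median()

# Multivariate: relate three variables (duration by user type and weekday).
median_by_user_day = trips.pivot_table(
    index="user_type",
    columns="weekday",
    values="duration_min",
    aggfunc="median",
)

print(duration_summary)
print(user_counts)
print(median_by_user)
print(median_by_user_day)
```

Each of these tables maps naturally onto a plot: a histogram or bar chart for the univariate summaries, a box plot for the bivariate one, and a heat map or faceted plot for the multivariate one.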
-
In this step, the dataset will be examined for data quality and tidiness issues. This includes identifying missing values, duplicates, and inaccuracies in the data.
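A minimal sketch of such an assessment with pandas is shown below. The sample rows are invented for illustration, and the column names (`duration_sec`, `user_type`, `member_birth_year`) and the helper `assess` are assumptions, not part of the original analysis.

```python
import numpy as np
import pandas as pd

# Made-up sample rows standing in for the real trip data.
trips = pd.DataFrame({
    "duration_sec": [420, 980, 420, 15000, 610],
    "user_type": ["Subscriber", "Customer", "Subscriber", "Customer", None],
    "member_birth_year": [1990.0, np.nan, 1990.0, 1985.0, 1878.0],
})


def assess(df):
    """Return simple data-quality counts for a DataFrame (hypothetical helper)."""
    return {
        # Number of missing values in each column.
        "missing_per_column": df.isna().sum().to_dict(),
        # Number of rows that exactly repeat an earlier row.
        "duplicate_rows": int(df.duplicated().sum()),
        # Stored type of each column, to spot columns that need conversion.
        "dtypes": df.dtypes.astype(str).to_dict(),
    }


report = assess(trips)
print(report)
```

Inaccuracies such as an implausible birth year (1878 in the sample) do not show up in these counts; they are usually found by inspecting value ranges, for example with `df.describe()`.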
-
Missing values will be handled by either dropping them or filling them based on logical assumptions, and incorrect data types will be corrected.
-
Filtering or modification techniques will be used to resolve inconsistencies and outliers in the data. This improves the overall quality of the dataset.
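The cleaning steps described above could look roughly like the following sketch. The sample rows are invented, the column names are assumptions, and the thresholds (`max_duration_sec`, `min_birth_year`) are illustrative choices, not values taken from the original analysis.

```python
import numpy as np
import pandas as pd

# Made-up sample rows standing in for the real trip data.
trips = pd.DataFrame({
    "duration_sec": [420, 980, 420, 15000, 610, 300],
    "user_type": ["Subscriber", "Customer", "Subscriber",
                  "Customer", None, "Subscriber"],
    "member_birth_year": [1990.0, np.nan, 1990.0, 1985.0, 1878.0, 1995.0],
})


def clean(df, max_duration_sec=7200, min_birth_year=1920):
    """Hypothetical cleaning helper; both thresholds are illustrative assumptions."""
    # Remove rows that exactly repeat an earlier row.
    out = df.drop_duplicates()
    # Drop rows with missing user type or birth year.
    out = out.dropna(subset=["user_type", "member_birth_year"])
    # Filter out extreme trip durations and implausible birth years.
    keep = (out["duration_sec"] <= max_duration_sec) & (
        out["member_birth_year"] >= min_birth_year
    )
    out = out[keep]
    # Correct the data types now that no missing values remain.
    out = out.astype({"member_birth_year": "int64", "user_type": "category"})
    return out.reset_index(drop=True)


cleaned = clean(trips)
print(cleaned)
```

Dropping is only one option. Where a sensible default exists, `fillna` could be used instead of `dropna`, as long as the assumption behind the fill value is stated.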
- The majority of users in the dataset are subscribers.
- Trip duration is usually short, with most trips lasting less than 20 minutes.
- There is a noticeable difference in trip duration between subscribers and customers, with customers taking longer trips.
- The distribution of trip duration is right-skewed, meaning that only a few trips have long durations.
- Bike usage is high on weekdays, particularly on Thursdays and Tuesdays, and relatively low on weekends.
- There is a correlation between age and trip duration, with younger users tending to take longer trips.
- Gender does not appear to have a strong correlation with trip duration.
- The age distribution of users shows a peak around the late 20s and 30s, with a decline in usage among older age groups.
- Certain start and end stations are more popular than others, indicating the presence of transportation hubs.
- Age has a correlation with trip duration, with younger users tending to take longer trips.
- Gender does not seem to have a significant impact on trip duration.
- The findings are presented with visualizations in order to convey them effectively.
- Udacity GPT, used to correct a few errors.
- https://seaborn.pydata.org/generated/seaborn.pairplot.html (pair plot).
- Stack Overflow, used to correct a few errors.