
Honors Project for the IBM Exploratory Data Analysis Course via Coursera. This project performs exploratory data analysis of factors associated with alcoholism in high school students, identifies the most probable causes, and tests various hypotheses.


qasimza/highschool-alchoholism


Overview

This is an optional honors project for the IBM Exploratory Data Analysis for Machine Learning course on Coursera. The aim is to demonstrate the skills and knowledge gained from this course, such as data cleaning, feature engineering, exploratory data visualization, and hypothesis testing.

Project Deliverables

  • Select a dataset that you are curious about.
  • Provide a brief description of the data set and a summary of its attributes.
  • Provide an initial plan for data exploration.
  • Describe actions taken for data cleaning and feature engineering.
  • Provide key findings and insights that synthesize the results of the exploratory data analysis in an insightful and actionable manner.
  • Formulate at least 3 hypotheses about this data.
  • Conduct a formal significance test for one of the hypotheses and discuss the results.
  • Provide suggestions for next steps in analyzing this data.
  • Include a paragraph that summarizes the quality of this data set and a request for additional data if needed.
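
As an illustration of the "formal significance test" deliverable above, here is a minimal sketch of a two-sided permutation test for a difference in mean grades between two groups of students. This is not necessarily the test used in this project; the function name `permutation_test`, the group names, and the grade values are all hypothetical and made up for the example, not taken from the Kaggle dataset.

```python
import random


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the observed difference (mean of a minus mean of b) and an
    estimated p-value under the null hypothesis that group labels are
    exchangeable.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b

    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up final grades for illustration only (NOT from the dataset).
low_alcohol_grades = [14, 12, 15, 11, 13, 16, 12, 14]
high_alcohol_grades = [10, 9, 12, 8, 11, 10, 13, 9]

observed, p_value = permutation_test(low_alcohol_grades, high_alcohol_grades)
print(f"observed difference in means: {observed:.3f}, p-value: {p_value:.4f}")
```

A small p-value would suggest that the difference in mean grades is unlikely under random label assignment; it would not by itself show that alcohol consumption causes lower grades.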

Dataset

This project uses the Kaggle dataset High School Alcoholism and Academic Performance.

Motivation

To explore what causes teenage alcoholism and its impact on academic performance, as well as factors that could reduce it.

Installation and Setup

  1. Download the Kaggle dataset and extract its contents into ./data.
  2. Create and activate a virtual environment following this tutorial: https://docs.python.org/3/tutorial/venv.html
  3. Install requirements

    On Windows

    pip install -r .\requirements.txt
    

    On Linux

    pip install -r ./requirements.txt
    
  4. Run File

    On Windows

    python .\src\exploratory_data_analysis.py
    

    On Linux

    python src/exploratory_data_analysis.py
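
Before running the script, it may help to confirm that step 1 placed the dataset where the steps above expect it. The following is a small sketch, assuming the extracted dataset consists of one or more CSV files directly inside ./data; the helper name `find_csv_files` is hypothetical and is not part of this repository.

```python
from pathlib import Path


def find_csv_files(data_dir="data"):
    """Return the sorted names of CSV files directly inside data_dir.

    Returns an empty list if the directory does not exist.
    """
    folder = Path(data_dir)
    if not folder.is_dir():
        return []
    return sorted(path.name for path in folder.glob("*.csv"))


csv_files = find_csv_files("data")
if csv_files:
    print("Found dataset files:", ", ".join(csv_files))
else:
    print("No CSV files found in ./data; check step 1.")
```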
    

Code Structure

Results and Evaluation

https://medium.datadriveninvestor.com/how-to-write-a-good-readme-for-your-data-science-project-on-github-ebb023d4a50e
