
TrilokiDA/EDA_on_Habermans_Dataset


Exploratory Data Analysis (EDA) on Haberman's Survival Dataset

Description:

  • The dataset contains cases from a study that was conducted between 1958 and 1970 at the University of Chicago's Billings Hospital on the survival of patients who had undergone surgery for breast cancer.

Dataset:

Attribute Information:

  • Age of patient at time of operation (numerical)
  • Patient's year of operation (year - 1900, numerical)
  • Number of positive axillary nodes detected (numerical)
  • Survival status (class attribute): 1 = the patient survived 5 years or longer; 2 = the patient died within 5 years

Number of data points:

  • 306

Number of Attributes:

  • 4 (including the class attribute)

Columns:

  • 'age', 'year_of_operation', 'positive_axillary_nodes', 'survival_status'
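
As a minimal sketch of how these column names might be attached when loading the data with pandas: the snippet below assumes the raw file is comma-separated with no header row, which this README does not state. The helper name `load_haberman` and the three sample rows are hypothetical and only illustrate the layout; they are not taken from the repository.

```python
import io

import pandas as pd

# Column names as listed in this README.
COLUMNS = ["age", "year_of_operation", "positive_axillary_nodes", "survival_status"]


def load_haberman(source):
    """Read a headerless, comma-separated copy of the dataset.

    `source` may be a file path or a file-like object.
    Assumption: the raw file has no header row.
    """
    return pd.read_csv(source, header=None, names=COLUMNS)


# Illustrative rows only, standing in for the real file.
sample = io.StringIO("30,64,1,1\n34,59,0,2\n52,69,3,2\n")
df = load_haberman(sample)

print(df.shape)
# Class balance: 1 = survived 5 years or longer, 2 = died within 5 years.
print(df["survival_status"].value_counts())
```

With the real file, passing its path to `load_haberman` should give a frame with 306 rows and 4 columns, matching the counts listed above.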

Packages

  • numpy
  • pandas
  • matplotlib
  • seaborn
  • colorama

Analyses performed

  • Histogram
  • Probability Density Function (PDF)
  • Cumulative Distribution Function (CDF)
  • Mean
  • Box-and-Whisker Plot
  • Violin plots
  • 2-D Scatter Plot & Pair Plot
  • Contour Graph
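
The PDF and CDF items in the list above can be sketched with numpy alone. This is one common histogram-based approach and not necessarily the one used in this repository. The helper name `empirical_pdf_cdf` and the sample node counts are hypothetical. Note that the "PDF" here is the fraction of observations per bin, not a true density.

```python
import numpy as np


def empirical_pdf_cdf(values, bins=10):
    """Return per-bin probabilities, their running sum, and the bin edges."""
    counts, bin_edges = np.histogram(values, bins=bins)
    pdf = counts / counts.sum()  # fraction of observations in each bin
    cdf = np.cumsum(pdf)         # cumulative fraction up to each bin's right edge
    return pdf, cdf, bin_edges


# Illustrative values standing in for the positive_axillary_nodes column.
nodes = np.array([0, 0, 1, 2, 0, 4, 10, 0, 1, 23])
pdf, cdf, edges = empirical_pdf_cdf(nodes, bins=5)

print(pdf)
print(cdf)
```

Plotting `pdf` and `cdf` against `edges[1:]` with matplotlib, once per survival class, gives a way to compare the two groups on a single feature.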