Breast Cancer Survival Rate

The data consists of patients' clinical features. This dataset of breast cancer patients was obtained from the SEER Program of the NCI, which provides information on population-based cancer statistics. The dataset covers female patients with infiltrating duct and lobular carcinoma breast cancer. Patients with unknown tumour size, unknown examined regional LNs, or unknown positive regional LNs, and patients whose survival was less than 1 month, were excluded; thus, 4024 patients were ultimately included. This dataset was uploaded to U-BRITE for the AI against CANCER DATA SCIENCE HACKATHON.

Goal

The goal was to create a classifier that takes a patient's data and outputs both a prediction of whether the patient will survive and how confident the model is in that prediction.
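As a rough illustration of this goal, the sketch below trains an SVM that returns a predicted class together with a probability. It is not the notebook's code: it assumes scikit-learn is available, uses synthetic data from `make_classification` in place of the SEER features, and the pipeline, split, and hyperparameters are assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the clinical features (assumption: NOT the SEER data).
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# probability=True makes SVC fit an extra Platt-scaling step so that
# predict_proba becomes available.
model = make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))
model.fit(X_train, y_train)

proba = model.predict_proba(X_test)             # shape (n_samples, 2)
# Derive the label from the probabilities so label and confidence agree.
labels = model.classes_[proba.argmax(axis=1)]   # predicted class per sample
confidence = proba.max(axis=1)                  # probability of that class

for label, conf in list(zip(labels, confidence))[:3]:
    print(f"predicted class {label} with confidence {conf:.2f}")
```

The label is taken from the probability vector rather than from `model.predict`, because for `SVC` the two can occasionally disagree near the decision boundary.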

Metrics

The SVM model was evaluated on accuracy and reached 99.5% on the test set. Below are the model's confusion matrix and classification report:
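For readers unfamiliar with these outputs, the following minimal sketch shows how the entries of a binary confusion matrix and the usual report metrics are derived from true and predicted labels. The labels are made-up toy values, not results from this project, and the helper names `confusion_counts` and `report` are hypothetical.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    counts = Counter()
    for t, p in zip(y_true, y_pred):
        if t == positive:
            counts["tp" if p == positive else "fn"] += 1
        else:
            counts["fp" if p == positive else "tn"] += 1
    return counts


def report(y_true, y_pred, positive=1):
    """Derive accuracy, precision, recall and F1 from the counts."""
    c = confusion_counts(y_true, y_pred, positive)
    tp, fp, fn, tn = c["tp"], c["fp"], c["fn"], c["tn"]
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

counts = confusion_counts(y_true, y_pred)
metrics = report(y_true, y_pred)
print(dict(counts))
print(metrics)
```

In practice, scikit-learn's `confusion_matrix` and `classification_report` from `sklearn.metrics` produce these quantities directly.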

Challenges

The main challenge was making sure the model is calibrated, meaning that its output probabilities match the outcome frequencies observed in practice. To check this, a calibration curve was used:

As we can see, the SVM model closely follows the ideal calibration line, so the model is well calibrated.
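The idea behind a calibration curve can be sketched as follows: predictions are grouped into probability bins, and for each bin the mean predicted probability is compared with the observed fraction of positives. A perfectly calibrated model yields points on the diagonal. The function name `calibration_points`, the bin count, and the toy data are assumptions for illustration, not the notebook's code.

```python
def calibration_points(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, observed positive fraction)
    for every non-empty, equal-width probability bin."""
    # Per bin: [sum of predicted probabilities, number of positives, count]
    bins = [[0.0, 0, 0] for _ in range(n_bins)]
    for t, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx][0] += p
        bins[idx][1] += t
        bins[idx][2] += 1
    return [(prob_sum / n, positives / n)
            for prob_sum, positives, n in bins if n]


# Toy data for illustration only (1 = positive class).
y_prob = [0.1, 0.1, 0.3, 0.3, 0.7, 0.7, 0.9, 0.9]
y_true = [0, 0, 0, 1, 1, 0, 1, 1]

points = calibration_points(y_true, y_prob)
for mean_pred, frac_pos in points:
    gap = abs(mean_pred - frac_pos)
    print(f"predicted {mean_pred:.2f}  observed {frac_pos:.2f}  gap {gap:.2f}")
```

The smaller the gap in each bin, the closer the curve lies to the ideal diagonal. scikit-learn offers similar binning through `sklearn.calibration.calibration_curve`.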

mkldhz/Breast-Cancer-Survival-Rate
