
The aim of the project is to predict whether a breast cancer patient will survive the disease, together with the probability attached to that prediction.


mkldhz/Breast-Cancer-Survival-Rate


Breast Cancer Survival Rate

The data is a set of patients' clinical features. This dataset of breast cancer patients was obtained from the SEER Program of the NCI, which provides information on population-based cancer statistics. It covers female patients with infiltrating duct and lobular carcinoma breast cancer. Patients with unknown tumour size, unknown examined regional lymph nodes (LNs), or unknown positive regional LNs, as well as patients whose survival time was less than 1 month, were excluded; thus, 4024 patients were ultimately included. The dataset was uploaded to U-BRITE for the AI against CANCER DATA SCIENCE HACKATHON.

Goal

The goal was to create a classifier that takes a patient's data and outputs both a prediction of whether the patient will survive and how confident the model is in that prediction.
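As an illustration of this goal, here is a minimal sketch of a classifier that returns both a label and a probability. It assumes scikit-learn and uses synthetic data from `make_classification` as a stand-in for the clinical features; the feature count, kernel, and split ratio are illustrative choices, not the settings used in this repository.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the clinical features (NOT the SEER data).
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# probability=True makes the SVM fit an internal probability model
# so that predict_proba is available after training.
model = make_pipeline(
    StandardScaler(),
    SVC(kernel="rbf", probability=True, random_state=0),
)
model.fit(X_train, y_train)

labels = model.predict(X_test)               # predicted class per patient
probabilities = model.predict_proba(X_test)  # one column per class
confidence = probabilities.max(axis=1)       # probability of the likelier class
```

With `probability=True`, scikit-learn documents that `predict` and the most probable class from `predict_proba` can occasionally disagree, so it may be worth deriving both outputs from the same call when consistency matters.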

Metrics

The model (an SVM) was evaluated on accuracy, reaching 99.5% on the test set. Below are the model's confusion matrix and classification report:
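To make the quantities in a confusion matrix and classification report concrete, the following sketch computes them by hand in plain Python. The labels are made up for illustration and are not results from this project; the helper names `confusion_counts` and `summarize` are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def summarize(y_true, y_pred, positive=1):
    """Accuracy, precision and recall derived from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels: 3 positives, 7 negatives, and a model that finds only one positive.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
report = summarize(y_true, y_pred)
```

In this toy case accuracy is 0.8 while recall is only 1/3, which is why a confusion matrix and per-class report are usually read alongside the headline accuracy.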


Challenges

The main challenge was making sure the model is calibrated, so that its predicted probabilities match the outcome frequencies actually observed. A calibration curve was used to check this:

As the curve shows, the SVM model closely follows the ideal calibration line, so the model is well calibrated.
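A calibration curve of this kind can be sketched in plain Python: predictions are grouped into equal-width probability bins, and the mean predicted probability in each bin is compared with the observed fraction of positives. The function below is an illustrative assumption about how such a curve is computed, not the code used in this repository, and the sample data is invented.

```python
def calibration_curve(y_true, y_prob, n_bins=5):
    """Return (mean_predicted, fraction_positive) for each non-empty bin."""
    prob_sums = [0.0] * n_bins
    positive_counts = [0] * n_bins
    counts = [0] * n_bins
    for truth, prob in zip(y_true, y_prob):
        # Equal-width bins over [0, 1]; a probability of exactly 1.0 goes in the last bin.
        index = min(int(prob * n_bins), n_bins - 1)
        prob_sums[index] += prob
        positive_counts[index] += truth
        counts[index] += 1
    mean_predicted, fraction_positive = [], []
    for i in range(n_bins):
        if counts[i]:
            mean_predicted.append(prob_sums[i] / counts[i])
            fraction_positive.append(positive_counts[i] / counts[i])
    return mean_predicted, fraction_positive


# Invented example: 10 cases predicted at 0.1 (1 positive) and 10 at 0.9 (9 positives).
y_prob = [0.1] * 10 + [0.9] * 10
y_true = [1] + [0] * 9 + [1] * 9 + [0]
mean_predicted, fraction_positive = calibration_curve(y_true, y_prob)

# Largest gap between predicted and observed frequency across bins.
max_gap = max(abs(m - f) for m, f in zip(mean_predicted, fraction_positive))
```

For a well-calibrated model the two lists are close to each other, so the points lie near the diagonal and `max_gap` is small, as in this constructed example.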
