Data Mining Project

Breast Cancer Study - A Machine Learning Case Study using Logistic Regression

Analyze data to determine factors that increase survival of breast cancer patients.

Narrative

With data science so prevalent during the ongoing pandemic, I wanted to dive into something in the health field. Unfortunately, most of the health data I found was more heavily aggregated than I wanted. I realize much of this is done to protect individuals, but it’s difficult to perform an analysis without knowing all the factors that went into the outcome. I was looking for data on individuals of varying demographics. I found data from a study on breast cancer patients, originally from the Netherlands Cancer Institute (NKI), which had since been cleaned and prepared by Devi Ramanan. This dataset contains treatment details for individual patients, along with a death indicator. Using this data, I’d like to determine whether some treatments were more effective than others. I’d also like to consider the impact of tumor size and type.
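
To give a rough idea of the kind of analysis described above, here is a minimal sketch of logistic regression against a death indicator. Everything in it is an assumption for illustration: the rows are made up, the column names (`chemo`, `hormonal`, `died`) are hypothetical and not taken from the NKI dataset, and the plain gradient-descent fit is a stand-in, not necessarily what the notebook uses.

```python
import math

# Synthetic, made-up rows: (chemo, hormonal, died). NOT the NKI data;
# the column names are hypothetical and chosen only for illustration.
ROWS = [
    (1, 0, 0), (1, 1, 0), (1, 0, 0), (1, 1, 0), (1, 0, 1),
    (0, 1, 0), (0, 1, 1), (0, 0, 1), (0, 0, 1), (0, 0, 0),
]


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, lr=0.5, epochs=2000):
    """Fit weights and a bias by batch gradient descent on the log-loss."""
    n_features = len(rows[0]) - 1
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for *x, y in rows:
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log-loss w.r.t. the linear score
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


weights, bias = fit_logistic(ROWS)

# exp(weight) is the odds ratio for a binary feature: a value below 1
# means the treatment is associated with lower odds of death in this toy data.
odds_ratios = [math.exp(w) for w in weights]
print(dict(zip(["chemo", "hormonal"], odds_ratios)))
```

Reading the coefficients as odds ratios is what makes logistic regression a natural fit for a question like "were some treatments more effective than others": each ratio summarizes one factor while holding the others fixed. On real data, tumor size and type would enter as additional columns.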

Data Source

NKI Breast Cancer Data. The data source can be found here.

Click here to read on for Methods and Results
Click here to access the Jupyter Notebook
