Data Mining Project
Breast Cancer Study - A Machine Learning Case Study using Logistic Regression
Analyze data to determine factors that increase survival of breast cancer patients.
Narrative
With data science so prevalent during the ongoing pandemic, I wanted to dive into something in the health field. Unfortunately, most of the health data I found was more heavily aggregated than I wanted. I realize much of this is done to protect individuals, but it’s difficult to perform an analysis without knowing all the factors that went into an outcome. I was looking for data on individuals of varying demographics. I found data from a study on breast cancer patients, originally from the Netherlands Cancer Institute (NKI), which had since been cleaned and prepared by Devi Ramanan. This dataset contains treatment details for individual patients, along with a death indicator. Using this data, I’d like to determine whether some treatments were more effective than others. I’d also like to consider their impact on tumors of various sizes and types.
Data Source
NKI Breast Cancer Data
Data source can be found here
Click here to read on for Methods and Results
Click here to access the Jupyter Notebook