Regression Capstone
Vet Hospital Wait Times
A machine learning case study for supervised regression.
Uses supervised learning to determine clusters of colleges. Given a specific school of interest, recommends others in the same cluster.
Abstract
This project was requested by a veterinary hospital to address wait times. They are unable to estimate how long a patient is expected to wait due to urgent cases that come in and require special attention. Currently, staff just provides a wild guess on how long the wait will be. The intent of this project is to demonstrate a use case in supervised regression. In it, you will see the problem defined and business objective set. I will guide you through my research and methodologies, resulting in a usable wait time estimator.
Introduction
Veterinarian clinics struggle with wait times. These clinics are unique since they provide specialized services, as well as emergency care for pets. Those with appointments often take a backseat to urgent cases. There are also unknown factors with appointment length and the staff required for each case since many are unique. Agrawal (2017) compares the operation of a hospital to that of an airport and poses several ways that operations can be streamlined. The key is to collect data at every point, so you can take advantage of machine learning.
Business Problem
The demand for emergency treatment of pets has created a backlog in veterinarian hospitals. Although facilities are staffed 24 hours 7 days a week, they are unable to keep up the pace. Customers have to wait with their pets an uncomfortable amount of time and get impatient without knowing how long they can expect to wait.
Business Objective 1: Let clients know anticipated wait time.
Business Objective 2: Understand factors that contribute to longer wait
time so they can be mitigated.
Data Source
DoveLewis Veterinary Emergency & Specialty Hospital; Portland, OR
Modeling Objective
I originally approached this problem as a multi-class classification problem with wait-time buckets. However, after I changed the approach to calculate throughput with the data provided, this turned into a regression problem using the throughput of patients as the target variable. I created a new derived column to store the patient change per interval to represent the throughput of the treatment area.
Modeling Objective: Perform supervised learning to predict patient throughput.
Click here to read on for Methods and Results
Click here if you’d rather see a presentation
Click here to access Source Code