Clustering Capstone
College Recommendation Engine
A machine learning case study for unsupervised clustering.
Uses unsupervised learning to determine clusters of colleges. Given a specific school of interest, recommends others in the same cluster.
Abstract
This project is intended for prospective and current college students and their parents. As the parent of a prospective student, I found college data scattered across disparate systems, often with inconsistent results. There are so many resources that the search becomes overwhelming.
This project demonstrates a use case for unsupervised learning. It defines the problem, sets a business objective, and walks through my research and methodology, ending with a usable recommendation engine.
Introduction
One of the greatest challenges faced by high school students and their parents is selecting a college. This is perhaps the first major decision in a young person’s life, and it can also be one of the most costly. As with any big decision, it is important to do the research and not rush to a conclusion.
Business Problem
Searching for a college is too time-consuming, and the search often considers only a limited set of features. A college visit or interview is the only good way to find a college that “fits,” but it is impossible to visit every campus.
Business Objective: Make it easier to find a college that “fits.”
Data Source
College Scorecard data from the U.S. Department of Education
Click here to download data
Modeling Objective
The question that remains is which colleges we should look at. Which ones would be the best “fit”?
Modeling Objective: Perform unsupervised learning to form school clusters.
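To make the idea concrete, here is a minimal sketch of clustering schools and recommending others from the same cluster. It rests on several assumptions that do not come from this project: k-means as the clustering method, three illustrative features, a handful of made-up rows (not real Scorecard values), and hypothetical helper names (`standardize`, `kmeans`, `recommend`). It uses only the Python standard library so that it runs anywhere; a real pipeline would more likely rely on a library such as scikit-learn.

```python
from math import sqrt

# Made-up rows for illustration only; these are NOT real Scorecard values.
# Assumed columns: admission rate, annual cost (thousands USD), enrollment (thousands).
SCHOOLS = {
    "Alpha College": (0.10, 60.0, 6.0),
    "Beta University": (0.12, 58.0, 7.0),
    "Gamma State": (0.70, 20.0, 30.0),
    "Delta State": (0.75, 22.0, 28.0),
    "Epsilon Tech": (0.40, 40.0, 15.0),
    "Zeta Institute": (0.42, 38.0, 14.0),
}


def standardize(rows):
    """Scale each column to zero mean and unit variance so no feature dominates."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0  # guard constant columns
        for c, m in zip(cols, means)
    ]
    return [tuple((v - m) / s for v, m, s in zip(r, means, stds)) for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means with a deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def recommend(school, schools=SCHOOLS, k=3):
    """Return the other schools that share a cluster with `school`."""
    names = list(schools)
    labels = kmeans(standardize([schools[n] for n in names]), k)
    target = labels[names.index(school)]
    return [n for n, lab in zip(names, labels) if lab == target and n != school]


if __name__ == "__main__":
    print(recommend("Gamma State"))
```

With the toy rows above, `recommend("Gamma State")` returns `["Delta State"]`. The number of clusters `k` is chosen by hand here; on real data it would need to be tuned.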
Click here to read on for Methods and Results
Click here if you’d rather see a presentation
Click here to access Source Code