Clustering Capstone


College Recommendation Engine

A machine learning case study for unsupervised clustering.

Uses unsupervised learning to determine clusters of colleges. Given a specific school of interest, recommends others in the same cluster.

Abstract

This project is aimed at current and future college students and their parents. As the parent of a prospective student, I found a lot of data scattered across disparate systems, often with inconsistent results. There are so many resources that the search becomes overwhelming.

The intent of this project is to demonstrate a use case for unsupervised learning. It defines the problem, sets a business objective, and walks through my research and methodologies, resulting in a usable recommendation engine.

Introduction

One of the greatest challenges faced by high school students and their parents is selecting a college. This is perhaps the first major decision in a young person’s life. It can also be the most costly. Like any big decision, it calls for careful research rather than a hasty resolution.

Business Problem

Searching for a college is too time-consuming, and the search often considers only a limited set of features. A college visit or interview is the only good way to find a college that “fits,” but it is impossible to visit every college campus.
Business Objective: Make it easier to find a college that “fits.”

Data Source

Scorecard data from the US Department of Education
Click here to download data

Modeling Objective

The ultimate question that remains is: which colleges should we look at? Which colleges would be the best “fit”?
Modeling Objective: Perform unsupervised learning to form school clusters.
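
To make the idea concrete, here is a minimal sketch of cluster-based recommendation. It is not the project's actual code: the choice of k-means, the school names, the two features, and the helper names (`standardize`, `kmeans`, `recommend`) are all assumptions made for illustration. The sketch groups schools by standardized features and, given a school of interest, returns the others that land in the same cluster.

```python
import math
import random

# Hypothetical toy data, NOT real Scorecard values:
# name -> (admission rate, average annual cost in $1000s)
SCHOOLS = {
    "Alpha College": (0.10, 60.0),
    "Beta University": (0.12, 58.0),
    "Gamma Institute": (0.15, 62.0),
    "Delta State": (0.70, 20.0),
    "Epsilon State": (0.75, 22.0),
    "Zeta Community": (0.80, 18.0),
}


def standardize(rows):
    """Scale each feature to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a feature is constant, to avoid dividing by zero.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [tuple((x - m) / s for x, m, s in zip(row, means, stds)) for row in rows]


def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means (Lloyd's algorithm); returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def recommend(school, schools=SCHOOLS, k=2):
    """Return the other schools that share a cluster with `school`."""
    names = list(schools)
    labels = kmeans(standardize([schools[n] for n in names]), k)
    target = labels[names.index(school)]
    return sorted(n for n, label in zip(names, labels) if label == target and n != school)


print(recommend("Alpha College"))  # ['Beta University', 'Gamma Institute']
```

Standardizing first keeps a large-valued feature such as cost from dominating the distance calculation. With real data, the number of clusters and the feature set would need to be chosen and validated rather than fixed as they are here.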

Click here to read on for Methods and Results
Click here if you’d rather see a presentation
Click here to access Source Code
