Classification Capstone

1 minute read

Zero Trust - Malicious Traffic Detector

A machine learning case study for supervised classification.

This project uses supervised learning to classify network traffic, reviewing Wireshark logs to determine whether traffic is coming from the DarkNet.

Abstract

As the amount of data we share increases, so does the importance of the cybersecurity that protects it. The need for accessibility has also grown during the pandemic: users are connecting from multiple locations on multiple devices. We must provide this flexibility while still securing our networks. To enable more access, we must continually review and validate all access points. Machine learning is critical to achieving this goal.

Introduction

Zero Trust is a cybersecurity framework that embraces the mantra “never trust, always verify.” It spans several areas. Identity and Asset Management manages and secures human and non-human identities. Threat Protection provides automated security, stopping attacks before they begin. Information Protection safeguards sensitive data, and Cloud Security protects assets stored in the cloud. For this project, I will focus on Threat Protection, where incoming and outgoing traffic is reviewed to identify suspicious network activity.

Business Problem

Increased system accessibility invites more cybersecurity threats. Allowing access from outside the network makes it harder to decide which traffic should be allowed and which should be blocked. Personnel need to work from remote locations and remain productive, whether they connect from company devices over a Virtual Private Network (VPN) or from personal devices. This has become the new normal during the pandemic, but we must continue to adhere to strict protocols and keep our systems secure.
Business Objective: Identify network traffic that should be blocked, regardless of device or physical location.

Data Source

The data comes from the DarkNet 2020 dataset (CIC-Darknet2020), published by the Canadian Institute for Cybersecurity at the University of New Brunswick.
Click here to download data

Modeling Objective

So, how do we know whether network traffic is Tor traffic, that is, traffic routed through the Tor anonymity network commonly used to reach the DarkNet?
Modeling Objective: Perform supervised learning to predict whether network traffic is Tor.
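
To make the modeling objective concrete, here is a minimal sketch of a binary Tor / non-Tor classifier. It is not the model used in this project. The flows are synthetic, the two features (`duration` and `pkt_size`) and their distributions are hypothetical, and a hand-written logistic regression stands in for whatever library model one would normally use, purely to keep the example dependency-free.

```python
import math
import random


def make_synthetic_flows(n, seed=0):
    """Generate toy (features, label) pairs. Synthetic data, NOT the real dataset."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        is_tor = rng.random() < 0.5
        # Hypothetical, already-scaled features with arbitrary class means.
        duration = rng.gauss(2.0 if is_tor else 1.0, 0.5)
        pkt_size = rng.gauss(0.5 if is_tor else 1.5, 0.5)
        rows.append(([duration, pkt_size], 1 if is_tor else 0))
    return rows


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(rows, lr=0.1, epochs=200):
    """Fit weights and bias with plain stochastic gradient descent on log loss."""
    weights = [0.0] * len(rows[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in rows:
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of log loss with respect to the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (Tor) when the predicted probability is at least 0.5, else 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


def accuracy(weights, bias, rows):
    correct = sum(predict(weights, bias, x) == y for x, y in rows)
    return correct / len(rows)


# Hold out the last 100 flows so the score reflects unseen traffic.
flows = make_synthetic_flows(400, seed=42)
train, test = flows[:300], flows[300:]
weights, bias = train_logistic_regression(train)
test_accuracy = accuracy(weights, bias, test)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

The labeled flows play the role of the training data, and the held-out split estimates how well the classifier would flag Tor traffic it has not seen. On real traffic, accuracy alone can mislead when Tor flows are rare, so precision and recall on the Tor class are usually worth checking as well.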

Click here to read on for Methods and Results
Click here if you’d rather see a presentation
Click here to access Source Code
