k-nearest neighbors (KNN)

$k$-nearest neighbors is a relatively simple non-parametric supervised learning algorithm that can be used for both regression and classification. To make a prediction for a new observation, it finds the $k$ closest observations in the training set, then takes a majority vote of their labels (classification) or the average of their values (regression).
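The idea above can be sketched in a few lines of plain Python. This is a minimal illustration, not the notebook code from the containers below; the function name, toy points, and labels are made up for the example, and it uses Euclidean distance with majority voting for classification.

```python
from collections import Counter
from math import dist

def knn_predict(X_train, y_train, x_new, k=3):
    # Sort training observations by Euclidean distance to the new point
    neighbors = sorted(zip(X_train, y_train), key=lambda p: dist(p[0], x_new))
    # Majority vote among the labels of the k nearest neighbors
    labels = [label for _, label in neighbors[:k]]
    return Counter(labels).most_common(1)[0][0]

# Toy data: two small clusters labeled "a" and "b" (invented for illustration)
X = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
y = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(X, y, (0.5, 0.5), k=3))  # → a
print(knn_predict(X, y, (5.5, 5.5), k=3))  # → b
```

For regression, the only change would be returning the mean of the neighbors' numeric values instead of a vote. In practice you would typically use a library implementation such as scikit-learn's `KNeighborsClassifier`, which also handles efficient neighbor search.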

A Brief History

$k$-nearest neighbors was first presented in the early 1950s.

Code Examples

All of the code examples are written in Python, unless otherwise noted.

Containers

These code examples are Jupyter notebooks running in a container that comes with all the data, libraries, and code you'll need to run them. Click here to learn why you should be using containers, along with how to do so.
Quickstart: Download Docker, then run the commands below in a terminal.
# pull container; only needs to be run once
docker pull ghcr.io/thedatamine/starter-guides:k-nearest-neighbors

# run container
docker run -p 8888:8888 -it ghcr.io/thedatamine/starter-guides:k-nearest-neighbors

Need help implementing any of this code? Feel free to reach out to datamine-help@purdue.edu and we can help!