Nearest Neighbors

class sealion.nearest_neighbors.KNearestNeighbors(k=5, regression=False)

Arguably the easiest machine learning algorithm to build and understand. For each data point you want to predict on, it simply looks at the k closest points in the training data you provide, and if the majority of those k neighbors are class X, it predicts class X. You choose k, and it should be odd (otherwise, what happens if there's a tie?). Regression is also supported: the prediction is just the average of the k neighbors' values. If you are going to use regression, make sure to set regression=True; otherwise the evaluate() method will assume classification (what KNNs are typically used for) and you will get a very low score.
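
To make that rule concrete, here is a minimal from-scratch sketch in plain NumPy (this is not SeaLion's actual implementation; the function name and structure are just illustrative): find the k nearest training points by Euclidean distance, then take a majority vote for classification or the mean for regression.

    import numpy as np
    from collections import Counter

    def knn_predict_one(x_train, y_train, point, k=5, regression=False):
        # Euclidean distance from the query point to every training point
        distances = np.linalg.norm(np.asarray(x_train) - np.asarray(point), axis=1)
        # indices of the k closest training points
        nearest = np.argsort(distances)[:k]
        neighbor_values = np.asarray(y_train)[nearest]
        if regression:
            return neighbor_values.mean()           # regression: average the neighbors
        counts = Counter(neighbor_values.tolist())  # classification: majority vote
        return counts.most_common(1)[0][0]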

A great introduction to ML - maybe consider using this and then seeing if you can beat it with your own version written from scratch. If you can, please share it on GitHub!

Other than that, enjoy!

__init__(k=5, regression=False)
Parameters
  • k – number of nearest points used to make each prediction

  • regression – whether you are doing regression or classification (if classification, leave it as False)
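
A minimal construction sketch (the import path is taken from the class signature above; an odd k is used as suggested):

    from sealion.nearest_neighbors import KNearestNeighbors

    knn_clf = KNearestNeighbors(k=5)                   # classification (default)
    knn_reg = KNearestNeighbors(k=5, regression=True)  # regression: averages the k neighbors' values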

evaluate(x_test, y_test)
Parameters
  • x_test – testing data (2D)

  • y_test – testing labels (1D)

Returns

accuracy score (r^2 score if regression = True)
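
A hedged usage sketch of evaluate() on a tiny made-up classification set (the data values are assumptions for illustration, and the import path follows the class signature above):

    import numpy as np
    from sealion.nearest_neighbors import KNearestNeighbors

    x_train = np.array([[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]])  # 2D training data
    y_train = np.array([0, 0, 0, 1, 1, 1])                                # 1D training labels
    x_test = np.array([[1, 1], [9, 9]])                                   # 2D testing data
    y_test = np.array([0, 1])                                             # 1D testing labels

    knn = KNearestNeighbors(k=3)
    knn.fit(x_train, y_train)
    print(knn.evaluate(x_test, y_test))  # accuracy score; r^2 instead if regression=True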

fit(x_train, y_train)
Parameters
  • x_train – 2D training data

  • y_train – 1D training labels

Returns

None
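
A minimal sketch of fit() with 2D training data and 1D labels (the toy arrays are assumptions for illustration):

    import numpy as np
    from sealion.nearest_neighbors import KNearestNeighbors

    x_train = np.array([[0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]])  # 2D training data
    y_train = np.array([0, 0, 1, 1])                                      # 1D training labels

    knn = KNearestNeighbors(k=3)
    knn.fit(x_train, y_train)  # prepares the model; predictions come from predict()
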
predict(x_test)
Parameters
  • x_test – 2D prediction data

Returns

predictions in 1D vector/list
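
A small sketch of predict() after fitting (toy data assumed; as described above, the predictions come back as a 1D vector/list):

    import numpy as np
    from sealion.nearest_neighbors import KNearestNeighbors

    x_train = np.array([[1.0, 1.0], [1.5, 1.0], [8.0, 8.0], [8.5, 8.0]])
    y_train = np.array([0, 0, 1, 1])

    knn = KNearestNeighbors(k=3)
    knn.fit(x_train, y_train)
    preds = knn.predict(np.array([[1.2, 1.1], [8.2, 8.1]]))  # 1D predictions
    print(preds)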

visualize_evaluation(y_pred, y_test)
Parameters
  • y_pred – predictions given by model, 1D vector/list

  • y_test – actual labels, 1D vector/list

Visualize the predictions and labels to see where the model is doing well and struggling.
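
A quick sketch of feeding predict() output into visualize_evaluation() (toy data assumed; the plot itself comes from SeaLion):

    import numpy as np
    from sealion.nearest_neighbors import KNearestNeighbors

    x_train = np.array([[1.0, 1.0], [1.5, 1.2], [8.0, 8.0], [8.5, 8.2]])
    y_train = np.array([0, 0, 1, 1])
    x_test = np.array([[1.1, 1.0], [8.1, 8.0]])
    y_test = np.array([0, 1])

    knn = KNearestNeighbors(k=3)
    knn.fit(x_train, y_train)
    y_pred = knn.predict(x_test)
    knn.visualize_evaluation(y_pred, y_test)  # plot of predictions vs. actual labels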