
TheCthulhuKid


Movie Recommendations with k-Nearest Neighbors and Cosine Similarity


Introduction

The k-nearest neighbors (k-NN) algorithm is among the simplest algorithms in data mining. The researcher chooses a distance or similarity metric ^[1]^ (there are many), and the distance or similarity between every pair of elements in the data set is calculated from the two elements' attributes. A data element's k nearest neighbors are the k data elements closest to it according to that metric.


1. A distance metric measures distance; the higher the distance, the farther apart the neighbors. A similarity metric measures similarity; the higher the similarity, the closer the neighbors.
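
The procedure described above can be sketched in a few lines of Python. This is only a minimal illustration, not the gist's own implementation: the `ratings` table, the user names, and the helper names `cosine_similarity` and `k_nearest_neighbors` are all assumptions made for the example, and treating an unrated movie as a rating of 0 is a simplification. Cosine similarity is used as the metric, so a higher score means a closer neighbor.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def k_nearest_neighbors(ratings, target, k):
    """Return the k users most similar to `target`, most similar first."""
    scores = [
        (other, cosine_similarity(ratings[target], vector))
        for other, vector in ratings.items()
        if other != target
    ]
    # Similarity metric: sort in descending order so the closest come first.
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


# Hypothetical data: one row per user, one column per movie (0 = not rated).
ratings = {
    "alice": [5, 4, 0, 1],
    "bob": [4, 5, 0, 2],
    "carol": [1, 0, 5, 4],
    "dave": [0, 2, 4, 5],
}

neighbors = k_nearest_neighbors(ratings, "alice", k=2)
for name, score in neighbors:
    print(f"{name}: {score:.3f}")
```

With a distance metric instead of a similarity metric, the only change would be to sort in ascending order, since a smaller distance means a closer neighbor.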

A small social networking website

This database is a small example of a networking site where users can watch movies, subscribe to TV shows, and comment on and rate any of these media. Users may follow or block other users, as on most networking websites today.
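
As a rough illustration of how such a site maps onto a graph, the sketch below stores each relationship as a `(source, relationship, target)` triple. Every user name, title, and relationship label here (`FOLLOWS`, `BLOCKS`, `WATCHED`, and so on) is hypothetical and chosen only for the example; the actual model in the gist may differ.

```python
# Each edge is (source node, relationship type, target node).
# All names and relationship labels are hypothetical.
edges = [
    ("ana", "FOLLOWS", "ben"),
    ("ben", "FOLLOWS", "ana"),
    ("ben", "BLOCKS", "cy"),
    ("ana", "WATCHED", "Some Movie"),
    ("ana", "RATED", "Some Movie"),
    ("ben", "SUBSCRIBED_TO", "Some Show"),
    ("ben", "COMMENTED_ON", "Some Movie"),
]


def targets(edges, node, relationship):
    """Return every node that `node` points to via `relationship`."""
    return [t for s, r, t in edges if s == node and r == relationship]


print(targets(edges, "ana", "FOLLOWS"))
print(targets(edges, "ben", "BLOCKS"))
```

A graph database stores these typed relationships directly, which is what makes queries such as "who does this user follow?" natural to express.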

  • Purpose:

The theme was chosen because of the worldwide success of this type of website, and because their structure can be represented easily and naturally as a graph, with many different types of relationships and highly connected data. So, in a nutshell,

= Working examples for the 'Graph Databases' book
image::http://assets.neo4j.org/img/books/graphdatabases_thumb.gif["frontpage thumbnail",align="left"]
The examples in the 'Graph Databases' book don't work out of the box. I've modified them, so that they do work (for chapter 3, that is).
This is a graphgist version of my https://baach.de/Members/jhb/working-examples-for-the-graph-databases-book/[blog post].
If you click one of the green play buttons in the examples below, they will show in this console. Usually the code formatting is messed up, so it might be a bit ugly.