This GitHub Gist is intended to accompany the blog posts below:
Part 1: https://medium.com/shoprunner/fetching-better-beer-recommendations-with-collie-part-1-18c73ab30fbd
Part 2: https://medium.com/shoprunner/fetching-better-beer-recommendations-with-collie-part-2-27930a421459
Part 3: https://medium.com/shoprunner/fetching-better-beer-recommendations-with-collie-part-3-6aaae9bad169
Data for these blog posts can be found here, specifically in the Beeradvocate.txt.gz and Ratebeer.txt.gz files. In total, these files should be 3.29 GB.
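As a quick sanity check after downloading, a minimal sketch like the one below sums the on-disk size of the two files. The `data/` directory is an assumption (point it at wherever you saved the files), and the sketch reports decimal gigabytes (10^9 bytes), which may differ slightly from how the 3.29 GB figure above was measured.

```python
from pathlib import Path

# Hypothetical download location; change this to wherever the files were saved.
DATA_DIR = Path("data")
FILENAMES = ["Beeradvocate.txt.gz", "Ratebeer.txt.gz"]


def total_size_gb(paths):
    """Return the combined size of the files that exist, in decimal gigabytes."""
    existing = [Path(p) for p in paths if Path(p).exists()]
    return sum(p.stat().st_size for p in existing) / 1e9


if __name__ == "__main__":
    paths = [DATA_DIR / name for name in FILENAMES]
    missing = [str(p) for p in paths if not p.exists()]
    if missing:
        print("Missing files:", ", ".join(missing))
    print(f"Total size on disk: {total_size_gb(paths):.2f} GB")
```

If the total is far below 3.29 GB, one of the downloads was probably interrupted.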
I highly recommend running this code on a GPU for the fastest execution time. If you have extra time and patience, all of the code below will still work on a CPU.
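To see which of the two cases applies to your machine before starting, a small check along these lines may help. It only uses `torch.cuda.is_available()`; the `pick_device` helper is a name made up for this sketch, and how the result is passed on to the training code is not shown here.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment, so report CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running on: {device}")
```

If this prints `cpu` on a machine that has a GPU, the usual suspects are a CPU-only PyTorch build or a container started without GPU access.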
I ran the code below using collie_recs==0.1.3 in the pytorch/pytorch:1.8.1-cuda11.1-cudnn8-devel base Docker image on a p3.2xlarge EC2 instance that is equipped with a single Tesla V100 16 GB GPU.
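If you want to confirm that your environment matches the pinned version above, a sketch like the following compares the installed package against it. It relies on the standard library's `importlib.metadata` (Python 3.8+); `installed_version` is a helper written for this example, and it assumes the package metadata is registered under the name `collie_recs`.

```python
from importlib.metadata import PackageNotFoundError, version

# Version stated above; other versions may work but were not used here.
EXPECTED = {"collie_recs": "0.1.3"}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


for package, expected in EXPECTED.items():
    found = installed_version(package)
    status = "matches" if found == expected else "differs"
    print(f"{package}: expected {expected}, found {found} ({status})")
```

A mismatch is not necessarily a problem, but it is the first thing to check if the code below behaves differently from the blog posts.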