Taylor G Smith tgsmith61591

tgsmith61591 / download.py
Last active May 9, 2022 20:53
Download a file from the web using requests and a pretty progress bar
# -*- coding: utf-8 -*-
#
# Download a file from the web using requests with a nice progress bar.
from __future__ import print_function
from tqdm import tqdm
import requests
import warnings
tgsmith61591 / collab_split.py
Last active April 27, 2023 02:46
Train/test split for collaborative filtering methods.
# -*- coding: utf-8 -*-
#
# Author: Taylor G Smith
#
# More scratch code in my collection of random recommender
# system utilities. Someday I'll get around to building
# an actual repository... in the meantime, here are some
# train/test split utilities for collaborative filtering
# with sparse matrices.
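
The preview above stops before any code, so the following is only a minimal sketch of the general idea, not the gist's implementation. A collaborative filtering split holds out individual user/item interactions rather than whole rows, so users still appear in the training data. The gist works on sparse matrices; as a simplifying assumption this sketch uses plain `(user, item, rating)` triples instead, and the function name `train_test_split_triples` and its parameters are hypothetical.

```python
import random


def train_test_split_triples(ratings, test_size=0.2, seed=None):
    """Hold out a random fraction of (user, item, rating) triples.

    Hypothetical helper for illustration only: it splits individual
    interactions, not whole users, and uses plain triples as a
    stand-in for the nonzero entries of a sparse ratings matrix.
    """
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be in (0, 1)")

    rng = random.Random(seed)  # local RNG so the split is reproducible
    indices = list(range(len(ratings)))
    rng.shuffle(indices)

    # The first n_test shuffled positions become the held-out set.
    n_test = int(round(test_size * len(ratings)))
    test_idx = set(indices[:n_test])

    train = [r for i, r in enumerate(ratings) if i not in test_idx]
    test = [r for i, r in enumerate(ratings) if i in test_idx]
    return train, test
```

A purely random holdout like this can leave a user or item with no training interactions at all; whether that matters depends on the model being evaluated.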
tgsmith61591 / ranking.py
Last active March 21, 2024 06:36
Ranking metrics for recommender systems
# -*- coding: utf-8 -*-
#
# Author: Taylor G Smith
#
# Recommender system ranking metrics for use with Python-based recommender
# libraries (e.g., implicit, http://github.com/benfred/implicit/). These
# metrics are derived from the original Spark Scala source code for
# recommender metrics:
# https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/evaluation/RankingMetrics.scala
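
The preview ends at the header comment, so the gist's own functions are not shown here. As a rough illustration of what such ranking metrics compute, the sketch below implements precision at k and mean average precision in plain Python. It loosely follows the definitions in Spark's `RankingMetrics` as I understand them; the function names and signatures are assumptions for this example and are not taken from the gist.

```python
def precision_at(predictions, labels, k):
    """Mean precision@k over users.

    predictions: one ranked list of item ids per user.
    labels: one collection of relevant item ids per user.
    Hits in the top k are always divided by k, even when a
    ranked list holds fewer than k items.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if not predictions:
        raise ValueError("predictions must not be empty")

    total = 0.0
    for pred, lab in zip(predictions, labels):
        relevant = set(lab)
        hits = sum(1 for item in pred[:k] if item in relevant)
        total += hits / k
    return total / len(predictions)


def mean_average_precision(predictions, labels):
    """Mean of per-user average precision over the full ranked lists."""
    if not predictions:
        raise ValueError("predictions must not be empty")

    total = 0.0
    for pred, lab in zip(predictions, labels):
        relevant = set(lab)
        if not relevant:
            continue  # a user with no ground truth contributes 0
        hits = 0
        prec_sum = 0.0
        for rank, item in enumerate(pred, start=1):
            if item in relevant:
                hits += 1
                prec_sum += hits / rank  # precision at this hit's rank
        total += prec_sum / len(relevant)
    return total / len(predictions)
```

For example, with `predictions = [[1, 2, 3, 4], [5, 6, 7, 8]]` and `labels = [[1, 3], [9]]`, the first user has one hit in the top two and the second has none, so `precision_at(predictions, labels, 2)` is 0.25.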
tgsmith61591 / rdd_train_test_split.py
Last active March 28, 2020 05:15
This script defines a function for creating a train/test split in a sparse ratings RDD for use with PySpark collaborative filtering methods.
# -*- coding: utf-8 -*-
#
# Author: Taylor Smith
#
# This function provides an interface for splitting a sparse ratings
# matrix RDD into a train and test set for use in collaborative
# filtering in PySpark applications.
#
# Dependencies:
# * scikit-learn >= 0.18