Skip to content

Instantly share code, notes, and snippets.

@azymnis
azymnis / ItemSimilarity.scala
Created December 13, 2013 05:17
Approximate item similarity using LSH in Scalding.
import com.twitter.scalding._
import com.twitter.algebird.{ MinHasher, MinHasher32, MinHashSignature }
/**
* Computes similar items (with a string itemId), based on approximate
* Jaccard similarity, using LSH.
*
* Assumes an input data TSV file of the following format:
*
* itemId userId
@joemiller
joemiller / raid_ephemeral.sh
Last active October 23, 2023 21:53
detect all ephemeral disks on EC2 then stripe together in a raid-0 vol mounted at /mnt
#!/bin/bash
#
# this script will attempt to detect any ephemeral drives on an EC2 node and create a RAID-0 stripe
# mounted at /mnt. It should be run early on the first boot of the system.
#
# Beware, This script is NOT fully idempotent.
#
METADATA_URL_BASE="http://169.254.169.254/2012-01-12"