Andre Garrigo andre3k1

import java.util.Arrays;
import java.util.Date;
import java.util.List;
import java.util.UUID;
import com.amazonaws.AmazonServiceException;
import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.PropertiesCredentials;
import com.amazonaws.services.ec2.model.InstanceType;
import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduce;
andre3k1 / Levenshtein.scala
Created April 18, 2013 22:38
Calculate the Levenshtein distance between two strings
package com.senzari
object Levenshtein {
def distance(start: String, target: String): Int = {
val slen = start.length()
val tlen = target.length()
def inner(start: String, target: String): Array[Array[Int]] = {
export HADOOP_DIRECTORY=???
export HDFS_DIRECTORY=???
mkdir -p $HDFS_DIRECTORY/name $HDFS_DIRECTORY/data $HDFS_DIRECTORY/backup
echo "<?xml version=\"1.0\"?>
<?xml-stylesheet type=\"text/xsl\" href=\"configuration.xsl\"?>
<!-- Put site-specific property overrides in this file. -->
export HADOOP_MASTER_IP=???
export HADOOP_DIRECTORY=???
export HDFS_DIRECTORY=???
mkdir -p $HDFS_DIRECTORY/tmp $HDFS_DIRECTORY/system
echo "<?xml version=\"1.0\"?>
<?xml-stylesheet type=\"text/xsl\" href=\"configuration.xsl\"?>
<!-- Put site-specific property overrides in this file. -->
andre3k1 / md5ToLong.scala
Created July 30, 2013 17:42
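Reduce a 128-bit md5 hash to a 64-bit long while keeping collisions unlikely.
def md5ToLong(md5: String): Long = {
// Polynomial rolling hash (base 31) over the characters of the md5 string.
// Long arithmetic wraps on overflow, so the result always fits in 64 bits.
var hash = 1125899906842597L
for (n <- 0 until md5.length) hash = 31 * hash + md5.charAt(n)
hash
}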
andre3k1 / 7digital.schema
Created July 31, 2013 17:55
Schema for 7digital music catalog
ARTIST CSV SCHEMA
-----------------
artistId,name,popularity,tags,image,url
RELEASE CSV SCHEMA
------------------
releaseId,title,version,artistId,artistAppearsAs,barcode,type,year,explicitContent,trackCount,duration,tags,licensorId,image,dateAdded,releaseDate,labelId,labelName,formats,price,rrp,url
TRACK CSV SCHEMA
----------------
andre3k1 / 7digital.schema
Last active December 20, 2015 11:38
Schema for 7digital music catalog
ARTIST CSV SCHEMA
-----------------
artistId,name,popularity,tags,image,url
name
RELEASE CSV SCHEMA
------------------
releaseId,title,version,artistId,artistAppearsAs,barcode,type,year,explicitContent,trackCount,duration,tags,licensorId,image,dateAdded,releaseDate,labelId,labelName,formats,price,rrp,url
andre3k1 / scala_installer.sh
Created August 1, 2013 01:24
Install Scala and link its binaries into /usr/bin so they are available on your machine's PATH. This script should work on most Unix-like machines that have curl, tar, and sudo.
#!/bin/sh
export SCALA_VERSION=2.9.2 # This is the only configuration setting
sudo curl -O http://www.scala-lang.org/files/archive/scala-$SCALA_VERSION.tgz
sudo tar -zxf scala-$SCALA_VERSION.tgz
sudo rm -rf scala-$SCALA_VERSION.tgz
sudo mv scala-$SCALA_VERSION /usr/local
sudo ln -s /usr/local/scala-$SCALA_VERSION/bin/scala /usr/bin/scala
sudo ln -s /usr/local/scala-$SCALA_VERSION/bin/scalac /usr/bin/scalac
sudo ln -s /usr/local/scala-$SCALA_VERSION/bin/fsc /usr/bin/fsc
sudo ln -s /usr/local/scala-$SCALA_VERSION/bin/sbaz /usr/bin/sbaz
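The `ln -s` lines above repeat one pattern, and plain `ln -s` fails when the link already exists, for example on a second run. A minimal sketch of a loop-based variant, assuming a helper named `link_tools` (hypothetical, not part of the original script) that takes a source directory, a destination directory, and a list of tool names:

```shell
#!/bin/sh
# Hypothetical helper (not in the original script): link each named tool
# from src_dir into dest_dir. The -f flag replaces an existing link, so the
# function can be run more than once without failing.
link_tools() {
    src_dir="$1"
    dest_dir="$2"
    shift 2
    for tool in "$@"; do
        ln -sf "$src_dir/$tool" "$dest_dir/$tool"
    done
}

# Usage mirroring the original paths (writing to /usr/bin would need root):
# link_tools "/usr/local/scala-$SCALA_VERSION/bin" /usr/bin scala scalac fsc sbaz
```

The usage line is left commented out because it depends on `SCALA_VERSION` being set and on write access to `/usr/bin`.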
#!/bin/bash -ev
# Download Scala
wget http://www.scala-lang.org/files/archive/scala-2.9.3.tgz
tar -zxf scala-2.9.3.tgz
rm -rf scala-2.9.3.tgz
export SCALA_HOME=/home/hadoop/scala-2.9.3
# Download Spark
wget http://spark-project.org/files/spark-0.7.2-sources.tgz
andre3k1 / post-receive
Last active August 29, 2015 14:08
Push GitHub changes to Apache Web Server
#!/bin/bash
#
# This "post-receive" script reads one line per updated ref from stdin, each in the form:
# <oldrev> <newrev> <refname>
#
while read oldrev newrev ref
do
branch=$(echo "$ref" | cut -d/ -f3)
if [ "master" == "$branch" ]; then
git --work-tree=/var/www/default/public_html/ checkout -f "$branch"
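The `cut -d/ -f3` step in the hook keeps only the third path component of the ref, so a branch such as `feature/login` would be reduced to `feature`. A minimal sketch of an alternative, assuming refs arrive in the usual `refs/heads/<branch>` form; the function name `branch_from_ref` and the sample input lines are hypothetical and not part of the original hook:

```shell
#!/bin/bash
# Hypothetical helper (not in the original hook): print the branch name for a
# ref under refs/heads/, and return non-zero for tags or any other ref.
branch_from_ref() {
    local ref="$1"
    case "$ref" in
        refs/heads/*) printf '%s\n' "${ref#refs/heads/}" ;;
        *)            return 1 ;;
    esac
}

# Example: feed the loop made-up lines shaped like the hook's stdin.
printf '%s\n' \
    "0000 1111 refs/heads/master" \
    "1111 2222 refs/heads/feature/login" \
    "2222 3333 refs/tags/v1.0" |
while read -r oldrev newrev ref
do
    if branch=$(branch_from_ref "$ref"); then
        echo "branch: $branch"
    else
        echo "skipped: $ref"
    fi
done
```

Stripping the `refs/heads/` prefix with parameter expansion keeps slashes in branch names intact and lets the hook ignore tag pushes explicitly.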