Skip to content

Instantly share code, notes, and snippets.

@oeegee
oeegee / DotProduct.scala
Created January 31, 2018 15:49 — forked from samklr/DotProduct.scala
DotProduct matrix in scala and on spark
def dotProduct(vector: Array[Int], matrix: Array[Array[Int]]): Array[Int] = {
// ignore dimensionality checks for simplicity of example
(0 to (matrix(0).size - 1)).toArray.map( colIdx => {
val colVec: Array[Int] = matrix.map( rowVec => rowVec(colIdx) )
val elemWiseProd: Array[Int] = (vector zip colVec).map( entryTuple => entryTuple._1 * entryTuple._2 )
elemWiseProd.sum
} )
}
@oeegee
oeegee / MovieSimilarities.scala
Created January 31, 2018 05:01 — forked from MLnick/MovieSimilarities.scala
Movie Similarities with Spark
import spark.SparkContext
import SparkContext._
/**
* A port of [[http://blog.echen.me/2012/02/09/movie-recommendations-and-more-via-mapreduce-and-scalding/]]
* to Spark.
* Uses movie ratings data from MovieLens 100k dataset found at [[http://www.grouplens.org/node/73]]
*/
object MovieSimilarities {
#!/bin/bash
# Minimum TODOs on a per job basis:
# 1. define name, application jar path, main class, queue and log4j-yarn.properties path
# 2. remove properties not applicable to your Spark version (Spark 1.x vs. Spark 2.x)
# 3. tweak num_executors, executor_memory (+ overhead), and backpressure settings
# the two most important settings:
num_executors=6
executor_memory=3g
#!/bin/sh
BOOT2DOCKER_CERTS_DIR=/var/lib/boot2docker/certs
CERTS_DIR=/etc/ssl/certs
CAFILE=${CERTS_DIR}/ca-certificates.crt
for cert in $(/bin/ls -1 ${BOOT2DOCKER_CERTS_DIR}); do
SRC_CERT_FILE=${BOOT2DOCKER_CERTS_DIR}/${cert}
CERT_FILE=${CERTS_DIR}/${cert}
HASH_FILE=${CERTS_DIR}/$(/usr/local/bin/openssl x509 -noout -hash -in ${SRC_CERT_FILE} 2>/dev/null)