@sujee
sujee / spark-one-hot.py
Last active February 1, 2021 19:16
Spark one hot encoding sample
## Step 3 : encode the indexes into a vector
from pyspark.ml.feature import OneHotEncoder
encoder = OneHotEncoder(inputCols=["statusIndex"], outputCols=["statusVector"], dropLast=False)
encoded = encoder.fit(indexed).transform(indexed)
encoded.show()
# View dense vectors in pandas
encoded_pd = encoded.toPandas()
print(encoded_pd)
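For readers without a Spark session at hand, the effect of `dropLast` can be sketched in plain Python. This is only an illustration, not Spark's implementation, and the helper name `one_hot` is hypothetical. With `dropLast=False`, as in the snippet above, every category index gets its own slot; with Spark's default `dropLast=True`, the vector is one element shorter and the last category is encoded as all zeros.

```python
def one_hot(index, num_categories, drop_last=False):
    """Return a dense one-hot list for a category index (hypothetical helper)."""
    if not 0 <= index < num_categories:
        raise ValueError(f"index {index} out of range for {num_categories} categories")
    # With drop_last, the vector has one slot fewer and the last
    # category is represented by a vector of all zeros.
    size = num_categories - 1 if drop_last else num_categories
    vec = [0.0] * size
    if index < size:
        vec[index] = 1.0
    return vec


# Compare both settings for three categories (indexes 0, 1, 2).
for idx in range(3):
    print(idx, one_hot(idx, 3), one_hot(idx, 3, drop_last=True))
```

Keeping all slots (`dropLast=False`) makes the output easier to read when inspecting it, at the cost of one redundant column.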
@sujee
sujee / output
Created December 25, 2020 23:42
qa-allen-nlp.py
Loading models...
Loaded model 'transformer-qa' in 31,693.4 milli seconds
Loaded model 'bidaf-model' in 1,633.8 milli seconds
/Users/sujee/opt/anaconda3/envs/teachbot-nlp/lib/python3.7/site-packages/torch/nn/modules/container.py:435: UserWarning: Setting attributes on ParameterList is not supported.
warnings.warn("Setting attributes on ParameterList is not supported.")
Loaded model 'bidaf-elmo-model' in 13,811.0 milli seconds
quesion: Who stars in The Matrix?
model transformer-qa predicted in 794.4 milli seconds
answer: Keanu Reeves, Laurence Fishburne, Carrie-Anne Moss, Hugo Weaving, and Joe Pantoliano
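Timing lines like the ones in this output can be produced with a small context manager. This is a sketch under assumptions: the gist's actual timing code is not shown here, and `timed` and `format_ms` are hypothetical names; the `time.sleep` call merely stands in for real model loading.

```python
import time
from contextlib import contextmanager


def format_ms(seconds):
    """Format a duration given in seconds as milliseconds with thousands separators."""
    return f"{seconds * 1000:,.1f}"


@contextmanager
def timed(label):
    """Print how long the wrapped block took, in the style of the log above."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        print(f"{label} in {format_ms(elapsed)} milli seconds")


# Stand-in workload; a real script would load a model here.
with timed("Loaded model 'example'"):
    time.sleep(0.01)
```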
@sujee
sujee / cnn-mnist-1-train.ipynb
Created May 21, 2020 07:13
CNN for MNIST, tweaked for TensorFlow v2 GPU
@sujee
sujee / cnn-mnist-1-train-gpu-minimal.py
Last active April 2, 2023 13:20
CNN model for MNIST prediction -- tweaked for TensorFlow 2 GPU edition
#!/usr/bin/env python
## CNN to classify digits in the MNIST dataset
import time
import random
import numpy as np
from pprint import pprint
import os, sys
@sujee
sujee / run-reveal-in-docker.sh
Last active December 3, 2019 21:04
run-reveal-in-docker.sh
#!/bin/bash
## invoke with '-d' for dev mode
## this will mount utils directory from host for live debugging
port=2000
while getopts 'dp:' OPTION; do
case "$OPTION" in
d)
#!/bin/bash
if [ -z "$1" ] ; then
echo "Usage: $0 <image name> [optional args for docker image]"
echo "Missing Docker image id. exiting"
exit 1
fi
#image_id="$1"
#shift
#!/bin/bash
## usage:
## cluster-cmd.sh <-h hosts_file> [-u user] cmd
## cluster-cmd.sh -h hosts ls -la
function usage()
{
echo "usage : $0 -h <hosts file> [-u user] [-i ssh_key_file] cmd"
exit 1
#!/usr/bin/env bash
## XXX
export JAVA_HOME=/opt/java
if [[ `uname -a` == Darwin* ]]; then
# Assuming Mac OS X
export JAVA_HOME=${JAVA_HOME:-$(/usr/libexec/java_home)}
export TACHYON_RAM_FOLDER=/Volumes/ramdisk
export TACHYON_JAVA_OPTS="-Djava.security.krb5.realm= -Djava.security.krb5.kdc="
@sujee
sujee / named.scala
Created September 2, 2014 20:42
named RDD
override def runJob(sc: SparkContext, config: Config): Any = {
val fileName = config.getString("input.file")
logger.info("### fileName : " + fileName)
var rdd = this.namedRdds.get[String](fileName)
logger.info("### rdd load 1 : " + rdd)
if (rdd.isDefined) {
logger.info("### rdd %s isDefined".format(fileName))
}
else {
logger.info("### rdd %s doesn't exist... loading".format(fileName))
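The lookup-then-load pattern this job follows (reuse a dataset if it is already registered under a name, otherwise load it) can be sketched in plain Python. This is not the API used in the Scala snippet: `NamedCache`, `get_or_load`, and `load_lines` are hypothetical names, and registering the loaded value under the same name is an assumption, since the preview is cut off before that step.

```python
class NamedCache:
    """Hypothetical in-memory registry that maps names to loaded datasets."""

    def __init__(self):
        self._items = {}

    def get(self, name):
        """Return the item registered under name, or None if absent."""
        return self._items.get(name)

    def get_or_load(self, name, loader):
        """Reuse the registered item, or call loader(name) and register the result."""
        item = self._items.get(name)
        if item is None:
            item = loader(name)
            self._items[name] = item
        return item


cache = NamedCache()
calls = []  # records how often the loader actually runs


def load_lines(name):
    """Stand-in loader; a real job would read the file into a dataset."""
    calls.append(name)
    return [f"line from {name}"]


first = cache.get_or_load("input.txt", load_lines)
second = cache.get_or_load("input.txt", load_lines)
print(first is second, len(calls))
```

The second call returns the same object without invoking the loader again, which is the point of keeping datasets under a name between job runs.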
@sujee
sujee / gist:ff14fd602b76314e693d
Created August 17, 2014 05:20
play + spark error message
[info] application - ### Application has started
[info] play - Application started (Dev)
spark.app.name=Spark And Play
spark.home=/Users/sujee/hadoop/spark-1.0.2-hadoop2
spark.jars=/Users/sujee/dev/play-spark/target/scala-2.10/play-spark_2.10-1.0.jar
spark.master=spark://localhost:7077
[error] a.a.OneForOneStrategy - exception during creation
akka.actor.ActorInitializationException: exception during creation
at akka.actor.ActorInitializationException$.apply(Actor.scala:164) ~[akka-actor_2.10-2.3.4.jar:na]
at akka.actor.ActorCell.create(ActorCell.scala:596) ~[akka-actor_2.10-2.3.4.jar:na]