Brad Karels (bradkarels) — public gists
# Spark local environment variables
export SPARK_HOME=/home/bkarels/spark/current
export SPARK_MASTER_IP=127.0.0.1
export SPARK_MASTER_PORT=7077
export SPARK_MASTER_WEBUI_PORT=8080
#export SPARK_MASTER_OPTS=
export SPARK_LOCAL_DIRS=$SPARK_HOME/work
export SPARK_WORKER_CORES=2
export SPARK_WORKER_MEMORY=4G
#export SPARK_WORKER_WEBUI_PORT=8081
bradkarels / embeddedH2.groovy
Created December 17, 2014 19:48
Simple example: Embed H2 database within a groovy script (file based persistence)
@GrabConfig(systemClassLoader=true)
@Grab(group='com.h2database', module='h2', version='1.3.176')
import java.sql.*
import groovy.sql.Sql
import org.h2.jdbcx.JdbcConnectionPool
println("More groovy...")
def sql = Sql.newInstance("jdbc:h2:things", "sa", "sa", "org.h2.Driver") // DB files for 'things' in current directory (./things.h2.db)
bradkarels / hdfsCmds
Created January 21, 2015 18:41
A brief shell history of setting up Hadoop (HDFS) locally (pseudo-distributed)
890 cd ~/Downloads/
891 wget http://mirror.cc.columbia.edu/pub/software/apache/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
915 cd ~
917 tar xzf Downloads/hadoop-2.6.0.tar.gz
920 mv hadoop-2.6.0/ hadoop/
923 cd hadoop/
924 vim ~/.bashrc
Set HADOOP_HOME and add HADOOP_HOME/bin to PATH
925 . ~/.bashrc
926 ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
bradkarels / gist:b874a0159b5aafa37528
Created March 23, 2015 14:28
YARN capacity scheduler - working example: 1
yarn.scheduler.capacity.maximum-am-resource-percent=0.2
yarn.scheduler.capacity.maximum-applications=10000
yarn.scheduler.capacity.node-locality-delay=40
yarn.scheduler.capacity.root.acl_administer_queue=*
yarn.scheduler.capacity.root.capacity=100
yarn.scheduler.capacity.root.ds.acl_administer_jobs=*
yarn.scheduler.capacity.root.ds.acl_submit_applications=*
yarn.scheduler.capacity.root.ds.capacity=40
yarn.scheduler.capacity.root.ds.maximum-capacity=50
yarn.scheduler.capacity.root.eng.acl_administer_jobs=*
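As a quick sanity check on what these percentages mean: `root.ds.capacity=40` guarantees the `ds` queue 40% of its parent's resources, and `maximum-capacity=50` caps its elastic growth at 50%. A sketch of the arithmetic (the 1000 GB cluster figure is an assumption for illustration, not from the gist):

```python
# Illustrative only: the cluster size is a made-up figure, not part of the gist.
cluster_memory_gb = 1000

ds_capacity = 40        # yarn.scheduler.capacity.root.ds.capacity
ds_max_capacity = 50    # yarn.scheduler.capacity.root.ds.maximum-capacity

guaranteed = cluster_memory_gb * ds_capacity / 100   # share the ds queue is promised
ceiling = cluster_memory_gb * ds_max_capacity / 100  # elastic upper bound

print(f"ds queue: {guaranteed:.0f} GB guaranteed, up to {ceiling:.0f} GB elastic")
# → ds queue: 400 GB guaranteed, up to 500 GB elastic
```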
bradkarels / gist:18038cbbab539b426b50
Created March 23, 2015 17:32
YARN Capacity Scheduler with ACLs added - example - Experimental
yarn.scheduler.capacity.maximum-am-resource-percent=0.2
yarn.scheduler.capacity.maximum-applications=10000
yarn.scheduler.capacity.node-locality-delay=40
yarn.scheduler.capacity.root.acl_administer_queue=bkarels hdpAdmins
yarn.scheduler.capacity.root.capacity=100
yarn.scheduler.capacity.root.ds.acl_administer_jobs=dsAdmin,bkarels,nadelman dsAdmins
yarn.scheduler.capacity.root.ds.acl_submit_applications=dsAdmin,dsUser0,dsUser1 dsAdmins,mlGroup,analyticsGroup
yarn.scheduler.capacity.root.ds.capacity=40
yarn.scheduler.capacity.root.ds.maximum-capacity=50
yarn.scheduler.capacity.root.eng.acl_administer_jobs=bkarels hdpAdmins
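The ACL values above use YARN's "users then groups" convention: a comma-separated user list, a single space, then a comma-separated group list, with `*` meaning everyone. A small Python helper (the function name is ours, not a Hadoop API) sketches that parsing:

```python
def parse_yarn_acl(acl):
    """Split a YARN ACL string of the form 'user1,user2 group1,group2'.

    '*' grants access to everyone; a single space separates the user list
    from the group list. (Hypothetical helper, not part of Hadoop.)
    """
    if acl.strip() == "*":
        return ["*"], ["*"]
    users_part, _, groups_part = acl.partition(" ")
    users = [u for u in users_part.split(",") if u]
    groups = [g for g in groups_part.split(",") if g]
    return users, groups

print(parse_yarn_acl("dsAdmin,bkarels,nadelman dsAdmins"))
# → (['dsAdmin', 'bkarels', 'nadelman'], ['dsAdmins'])
```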
bradkarels / chill.scala
Created April 30, 2015 19:34
Spark Chill Example
import java.io.ByteArrayOutputStream
import java.io.ObjectOutputStream
import java.io.Serializable
import com.twitter.chill.{Input, Output, ScalaKryoInstantiator}
class Person extends Serializable {
  var name: String = ""
  def this(name:String) {
    this()
    this.name = name // assumed completion: the gist preview cuts off mid-constructor
  }
}
bradkarels / kryoChill.scala
Created April 30, 2015 19:59
Fuddling with Kryo/chill on repl with Spark 1.2.1
// scala> :cp lib/chill_2.10-0.5.2.jar
//bkarels@ahimsa:~/spark/current$ ./bin/spark-shell --master local[*] --jars lib/mongo-java-driver-3.0.0.jar,lib/mongo-hadoop-core-1.3.2.jar,lib/chill_2.10-0.5.2.jar
import com.esotericsoftware.kryo.io.{Input, Output}
import com.twitter.chill.ScalaKryoInstantiator
import java.io.ByteArrayOutputStream
class Person(val name:String) extends Serializable
val p0:Person = new Person("p0")
val p1:Person = new Person("p1")
bradkarels / tcpdump example
Created May 14, 2015 18:10
Using tcpdump on local Ubuntu to watch traffic to a remote Kafka broker on OpenStack - just saving off the command.
sudo tcpdump -nn -i eth0 port 6667
bradkarels / writeFile.scala
Created June 2, 2015 13:01
Write list of Strings to text file on filesystem using java.nio in Scala (2.10.5) REPL
import java.nio.file.Files
import java.nio.charset.Charset
import java.nio.charset.StandardCharsets
import java.nio.file.Paths
import java.nio.file.StandardOpenOption
import collection.JavaConverters._
val utf8:Charset = StandardCharsets.UTF_8
Files.write(Paths.get("foo.txt"), "foo".getBytes(utf8))
// Writing an actual list of Strings (per the title), using the JavaConverters import:
Files.write(Paths.get("foo.txt"), List("line one", "line two").asJava, utf8, StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING)
bradkarels / mapUnion.scala
Created July 14, 2015 14:41
Creating a simple union of two Maps, where values in 'core' are overwritten by 'overlay', and keys in 'overlay' that do not exist in 'core' are added to the resulting Map
val key0 = ("0","0")
val key1 = ("1","0")
val key2 = ("2","0")
val key3 = ("3","0")
val core:Map[(String,String),Option[String]] = Map(key0 -> Some("a"), key1 -> Some("b"), key2 -> Some("c"))
val overlay:Map[(String,String),Option[String]] = Map(key2 -> Some("y"), key3 -> Some("z"))
//val expected = Map(key0 -> Some("a"), key1 -> Some("b"), key2 -> Some("y"), key3 -> Some("z"))
val union = core ++ overlay // right-hand Map wins on duplicate keys, producing 'expected'
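The same overlay-wins merge can be sketched in Python for comparison (dict unpacking's right-hand bias mirrors Scala's `++` on Maps):

```python
core = {("0", "0"): "a", ("1", "0"): "b", ("2", "0"): "c"}
overlay = {("2", "0"): "y", ("3", "0"): "z"}

# Right-hand entries win on key collisions, matching Scala's `core ++ overlay`.
union = {**core, **overlay}
print(union[("2", "0")])  # → y
```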