Skip to content

Instantly share code, notes, and snippets.

Avatar

Keegan Witt keeganwitt

View GitHub Profile
@tzachz
tzachz / dependency-report.gradle
Created Nov 17, 2016
Gradle: multi-project dependency graph (supports depth > 2)
View dependency-report.gradle
// Inspired by https://gist.github.com/abesto/cdcdd38263eacf1cbb51
// Task creates a .dot file with all inter-module dependencies
// Supports any depth of nested modules
task moduleDependencyReport {
doLast {
def file = new File("project-dependencies.dot")
file.delete()
file << "digraph {\n"
file << "splines=ortho\n"
@silasdavis
silasdavis / MultipleOutputs.scala
Last active Jan 18, 2022
Wrapping OutputFormat to produce multiple outputs with hadoop MultipleOutputs
View MultipleOutputs.scala
/**
* This file contains the core idea of wrapping an underlying OutputFormat with an OutputFormat
* with an augmented key that writes to partitions using MultipleOutputs (or something similar)
*/
package model.hadoop
import model.hadoop.HadoopIO.MultipleOutputer
import model.hadoop.HadoopIO.MultipleOutputer._
import org.apache.hadoop.io.{DataInputBuffer, NullWritable}
@mlehman
mlehman / MultipleOutputsExample.scala
Last active Apr 11, 2022
Hadoop MultipleOutputs on Spark Example
View MultipleOutputsExample.scala
/* Example using MultipleOutputs to write a Spark RDD to multiples files.
Based on saveAsNewAPIHadoopFile implemented in org.apache.spark.rdd.PairRDDFunctions, org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil.
val values = sc.parallelize(List(
("fruit/items", "apple"),
("vegetable/items", "broccoli"),
("fruit/items", "pear"),
("fruit/items", "peach"),
("vegetable/items", "celery"),
("vegetable/items", "spinach")
@kellyrob99
kellyrob99 / gradle-dependency-analyze.gradle
Created Dec 19, 2012 — forked from anonymous/gradle-dependency-analyze.gradle
A simple Gradle wrapper for the maven dependency:analyze goal
View gradle-dependency-analyze.gradle
import org.apache.maven.shared.dependency.analyzer.ClassAnalyzer
import org.apache.maven.shared.dependency.analyzer.DefaultClassAnalyzer
import org.apache.maven.shared.dependency.analyzer.DependencyAnalyzer
import org.apache.maven.shared.dependency.analyzer.ProjectDependencyAnalysis
import org.apache.maven.shared.dependency.analyzer.asm.ASMDependencyAnalyzer
import org.gradle.api.Project
import org.gradle.api.artifacts.ConfigurationContainer
import org.gradle.api.artifacts.ResolvedArtifact
import org.gradle.api.artifacts.ResolvedDependency
@hellerbarde
hellerbarde / latency.markdown
Created May 31, 2012 — forked from jboner/latency.txt
Latency numbers every programmer should know
View latency.markdown

Latency numbers every programmer should know

L1 cache reference ......................... 0.5 ns
Branch mispredict ............................ 5 ns
L2 cache reference ........................... 7 ns
Mutex lock/unlock ........................... 25 ns
Main memory reference ...................... 100 ns             
Compress 1K bytes with Zippy ............. 3,000 ns  =   3 µs
Send 2K bytes over 1 Gbps network ....... 20,000 ns  =  20 µs
SSD random read ........................ 150,000 ns  = 150 µs

Read 1 MB sequentially from memory ..... 250,000 ns = 250 µs

@jboner
jboner / latency.txt
Last active May 16, 2022
Latency Numbers Every Programmer Should Know
View latency.txt
Latency Comparison Numbers (~2012)
----------------------------------
L1 cache reference 0.5 ns
Branch mispredict 5 ns
L2 cache reference 7 ns 14x L1 cache
Mutex lock/unlock 25 ns
Main memory reference 100 ns 20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy 3,000 ns 3 us
Send 1K bytes over 1 Gbps network 10,000 ns 10 us
Read 4K randomly from SSD* 150,000 ns 150 us ~1GB/sec SSD