
import com.twitter.conversions.time._
import com.twitter.finagle.client._
import com.twitter.finagle.dispatch.SerialClientDispatcher
import com.twitter.finagle.util.DefaultTimer
import com.twitter.util.{Await, Future}
trait ExampleClient {
def client: Service[String, String]
def sa = new java.net.InetSocketAddress("localhost", 8080)
@mkolod
mkolod / OptimizedSparkInnerJoin.scala
Created July 6, 2015 18:11
Optimized Inner Join in Spark
/** Hive/Pig/Cascading/Scalding-style inner join which will perform a map-side/replicated/broadcast
* join if the "small" relation has fewer than maxNumRows, and a reduce-side join otherwise.
* @param big the large relation
* @param small the small relation
* @param maxNumRows the maximum number of rows that the small relation can have to be a
* candidate for a map-side/replicated/broadcast join
* @return a joined RDD with a common key and a tuple of values from the two
* relations (the big relation value first, followed by the small one)
*/
private def optimizedInnerJoin[A : ClassTag, B : ClassTag, C : ClassTag]
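The strategy described in the doc comment above can be sketched without Spark. The following is a minimal illustration in plain Python, not the gist's Scala implementation: relations are assumed to be lists of (key, value) pairs, and the function name `optimized_inner_join` and the parameter name `max_num_rows` are hypothetical stand-ins for the names in the truncated signature.

```python
from collections import defaultdict


def optimized_inner_join(big, small, max_num_rows):
    """Inner-join two lists of (key, value) pairs.

    Returns (key, (big_value, small_value)) tuples, big value first.
    Illustrative only: in-memory lists stand in for distributed relations.
    """
    small = list(small)

    if len(small) < max_num_rows:
        # Map-side (broadcast) style: build one lookup table from the small
        # relation, then stream the big relation through it. The big
        # relation is never grouped or shuffled.
        lookup = defaultdict(list)
        for key, value in small:
            lookup[key].append(value)
        return [
            (key, (big_value, small_value))
            for key, big_value in big
            for small_value in lookup.get(key, [])
        ]

    # Reduce-side style: group both relations by key, then pair up the
    # values for every key that appears on both sides.
    grouped_big = defaultdict(list)
    grouped_small = defaultdict(list)
    for key, value in big:
        grouped_big[key].append(value)
    for key, value in small:
        grouped_small[key].append(value)
    return [
        (key, (big_value, small_value))
        for key in grouped_big
        if key in grouped_small
        for big_value in grouped_big[key]
        for small_value in grouped_small[key]
    ]
```

Both branches produce the same set of joined rows; only the row order and the amount of data that would need to be moved in a distributed setting differ.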
#!/usr/bin/env ruby
require 'sequel'
require 'fileutils'
require 'uri'
require 'pp'
def home
ENV['HOME']
end
@kensipe
kensipe / gist:17eef1778de973c4003f
Last active September 8, 2015 22:03
Mesosphere Cluster with mesos-dns, hdfs and spark on GCE

Intro

This screencast demonstrates how to set up a Mesosphere cluster for analytics. We will show how to provision Mesosphere on Google Compute Engine (GCE) and install Mesos-DNS, HDFS, and Spark.

We will start by setting up Mesosphere on GCE, pointing our browser at google.mesosphere.com.

GCE setup

  1. Set up the cluster through the wizard
  2. Download and run OpenVPN
  3. View the Mesos UI
@philipithomas
philipithomas / docker_knapsack.jl
Last active August 29, 2015 14:20
Docker Container Scheduling as a Knapsack Problem in Julia/JuMP
using JuMP
using Cbc
#=
We pass in the variable "pools" in this format that goes through a separate
pre-processing script that pipes JSON to a Julia JSON loader.
{
"awesome-pool-prod": {

Internet Scale Services Checklist

A checklist for designing and developing internet-scale services, inspired by James Hamilton's 2007 paper "On Designing and Deploying Internet-Scale Services."

Basic tenets

  • Does the design expect failures to happen regularly and handle them gracefully?
  • Have we kept things as simple as possible?
@miguno
miguno / kafka-move-leadership.sh
Last active July 6, 2023 19:53
A simple Ops helper script for Apache Kafka to generate a partition reassignment JSON snippet for moving partition leadership away from a given Kafka broker. Use cases include 1) safely restarting a broker while minimizing risk of data loss, 2) replacing a broker, 3) preparing a broker for maintenance.
#!/usr/bin/env bash
#
# File: kafka-move-leadership.sh
#
# Description
# ===========
#
# Generates a Kafka partition reassignment JSON snippet to STDOUT to move the leadership
# of any replicas away from the provided "source" broker to different, randomly selected
# "target" brokers. Run this script with `-h` to show detailed usage instructions.
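For orientation, the kind of JSON snippet described above can be sketched in a few lines of Python. This is not the script's actual logic, which is not shown here: it assumes the standard partition reassignment format (`version` plus a `partitions` list of `topic`, `partition`, `replicas`), treats the first replica in each list as the preferred leader, and uses a hypothetical function `move_leadership` that swaps the source broker for a randomly chosen broker not already in the replica list.

```python
import json
import random


def move_leadership(assignment, source_broker, all_brokers, rng=None):
    """Build a reassignment plan that moves preferred leadership off a broker.

    assignment: list of {"topic", "partition", "replicas"} dicts describing
    the current state. Hypothetical helper, for illustration only.
    """
    rng = rng or random.Random()
    moved = []
    for entry in assignment:
        replicas = entry["replicas"]
        # Only partitions whose first (preferred-leader) replica is the
        # source broker need to change.
        if not replicas or replicas[0] != source_broker:
            continue
        # Candidate targets are brokers that do not already hold a replica.
        candidates = [b for b in all_brokers if b not in replicas]
        if not candidates:
            continue  # no eligible target broker; leave the partition alone
        target = rng.choice(candidates)
        moved.append({
            "topic": entry["topic"],
            "partition": entry["partition"],
            "replicas": [target] + replicas[1:],
        })
    return {"version": 1, "partitions": moved}


if __name__ == "__main__":
    # Made-up cluster state: topic "events", brokers 1 to 4.
    current = [
        {"topic": "events", "partition": 0, "replicas": [1, 2]},
        {"topic": "events", "partition": 1, "replicas": [2, 1]},
    ]
    print(json.dumps(move_leadership(current, 1, [1, 2, 3, 4]), indent=2))
```

Passing a seeded `random.Random` as `rng` makes the target selection repeatable, which is convenient when reviewing a plan before applying it.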
@alexvictoor
alexvictoor / pb-avro-test_pom.xml
Last active February 11, 2021 09:20
Demo of Protobuff integration within Avro
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.avro.is.great</groupId>
<artifactId>protobuff-avro-demo</artifactId>
<packaging>jar</packaging>
<version>1.0-SNAPSHOT</version>
<name>Demo of protobuff integration with Avro</name>
<build>
<plugins>
@viktorklang
viktorklang / Gistard.scala
Last active June 9, 2017 07:27
Gistard — an sbt autoplugin for depending on Gists — such as Gistard itself
/*
Copyright 2015 Viktor Klang
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
@nightscape
nightscape / Plot graph with D3.snb
Last active December 4, 2015 12:47
Graphical exploration of Bayesian Networks in Spark Notebook
{
"metadata" : {
"name" : "Plot graph with D3",
"user_save_timestamp" : "2014-12-15T00:55:09.510Z",
"auto_save_timestamp" : "2014-12-15T00:50:41.883Z",
"language_info" : {
"name" : "scala",
"file_extension" : "scala",
"codemirror_mode" : "text/x-scala"
},