#!/usr/bin/env python
import argparse
import json
import random
import sys
def get_topic(args, data):
    topic = data['partitions'][0]['topic']
    for partition in data['partitions']:
conf = pyspark.SparkConf()
conf.set("spark.sql.tungsten.enabled", "false")
sc = getOrCreateSparkContext(conf)
/usr/lib/jvm/java-8-oracle/jre/lib/jce.jar
/usr/lib/jvm/java-8-oracle/jre/lib/rt.jar
/usr/lib/jvm/java-8-oracle/jre/lib/plugin.jar
/usr/lib/jvm/java-8-oracle/jre/lib/jsse.jar
/usr/lib/jvm/java-8-oracle/jre/lib/jfr.jar
/usr/lib/jvm/java-8-oracle/jre/lib/management-agent.jar
/usr/lib/jvm/java-8-oracle/jre/lib/jfxswt.jar
/usr/lib/jvm/java-8-oracle/jre/lib/resources.jar
/usr/lib/jvm/java-8-oracle/jre/lib/javaws.jar
/usr/lib/jvm/java-8-oracle/jre/lib/deploy.jar

Internet Scale Services Checklist

A checklist for designing and developing internet-scale services, inspired by James Hamilton's 2007 paper "On Designing and Deploying Internet-Scale Services."

Basic tenets

  • Does the design expect failures to happen regularly and handle them gracefully?
  • Have we kept things as simple as possible?
import errno
import shutil

def copyanything(src, dst):
    try:
        shutil.copytree(src, dst)
    except OSError as exc:  # python >2.5
        if exc.errno == errno.ENOTDIR:
            # src is a regular file, not a directory: copy it directly
            shutil.copy(src, dst)
        else:
            raise
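A minimal usage sketch for the snippet above, assuming a POSIX-style filesystem where `shutil.copytree` raises `ENOTDIR` when given a regular file. The `demo` helper and every path in it are hypothetical and exist only inside a throwaway temporary directory.

```python
import errno
import os
import shutil
import tempfile


def copyanything(src, dst):
    # Same logic as the snippet above: try a directory copy first.
    try:
        shutil.copytree(src, dst)
    except OSError as exc:
        # copytree reports ENOTDIR when src is a regular file.
        if exc.errno == errno.ENOTDIR:
            shutil.copy(src, dst)
        else:
            raise


def demo():
    # Hypothetical layout: one directory containing one nested file.
    root = tempfile.mkdtemp()
    try:
        src_dir = os.path.join(root, "src_dir")
        os.makedirs(os.path.join(src_dir, "nested"))
        src_file = os.path.join(src_dir, "nested", "a.txt")
        with open(src_file, "w") as fh:
            fh.write("hello")

        # Directory source: handled by copytree.
        copyanything(src_dir, os.path.join(root, "dir_copy"))
        # File source: falls back to shutil.copy.
        copyanything(src_file, os.path.join(root, "file_copy.txt"))

        return (
            os.path.isfile(os.path.join(root, "dir_copy", "nested", "a.txt")),
            os.path.isfile(os.path.join(root, "file_copy.txt")),
        )
    finally:
        shutil.rmtree(root)
```

Note that `copytree` still fails if the destination directory already exists, so the fallback only covers the "source is a file" case.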
@anandnalya
anandnalya / css-js-stopwords
Created July 29, 2013 07:49
List of CSS and JavaScript identifiers/properties that can be used as stopwords while indexing
-moz-binding
-moz-border-bottom-colors
-moz-border-left-colors
-moz-border-radius
-moz-border-radius-bottomleft
-moz-border-radius-bottomright
-moz-border-radius-topleft
-moz-border-radius-topright
-moz-border-right-colors
-moz-border-top-colors
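One way such a list might be applied at indexing time is sketched below. The tokenizer regex, the `index_terms` function, and the three-entry `STOPWORDS` set are illustrative assumptions, not part of the gist; a real index would load the full list.

```python
import re

# A small hand-picked subset of the identifiers listed above.
STOPWORDS = {
    "-moz-binding",
    "-moz-border-radius",
    "-moz-border-radius-topleft",
}

# Hypothetical tokenizer: keeps a leading hyphen so vendor-prefixed
# names such as "-moz-binding" survive as single tokens.
TOKEN_RE = re.compile(r"-?[A-Za-z][A-Za-z0-9-]*")


def index_terms(text, stopwords=STOPWORDS):
    """Return lowercased tokens from text, dropping anything in stopwords."""
    tokens = (m.group(0).lower() for m in TOKEN_RE.finditer(text))
    return [t for t in tokens if t not in stopwords]
```

Using a set keeps each membership check constant time on average, which matters when the stopword list grows to thousands of entries.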
{
  "filtered" : {
    "query" : {
      "range" : {
        "c100" : {
          "from" : "0",
          "to" : "5293983999",
          "include_lower" : true,
          "include_upper" : true
        }
[root@ct-0088 ~]# ./bin/elasticsearch
[root@ct-0088 ~]# ./logs/elasticsearchStaging_4hr.log
[2013-07-12 14:07:56,405][INFO ][node ] [CT-0088] {0.90.0.RC2}[3049]: starting ...
[2013-07-12 14:07:56,567][INFO ][transport ] [CT-0088] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.50.171:9300]}
[2013-07-12 14:07:59,598][INFO ][cluster.service ] [CT-0088] new_master [CT-0088][Zukt8ivLRd6LIzeZoriR8A][inet[/192.168.50.171:9300]]{max_local_storage_nodes=1}, reason: zen-disco-join (elected_as_master)
[2013-07-12 14:07:59,620][INFO ][discovery ] [CT-0088] elasticsearchStaging_4hr/Zukt8ivLRd6LIzeZoriR8A
[2013-07-12 14:07:59,659][INFO ][http ] [CT-0088] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.50.171:9200]}
[2013-07-12 14:07:59,660][INFO ][node ] [CT-0088] {0.90.0.RC2}[3049]: started
[2013-07-12 14:08:01,055][INFO ][gateway.local.state.meta ] [CT-0088] [mlivemas
@anandnalya
anandnalya / gist:3089221
Created July 11, 2012 09:16
Overriding one properties file with another, optional properties file in Spring. Each property is looked up first in /path/local/config.properties and, if it is not found there, in classpath:config.properties
<bean id="propertyConfigurer" class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
    <property name="locations">
        <list>
            <value>classpath:config.properties</value>
            <value>file:/path/local/config.properties</value>
        </list>
    </property>
    <property name="ignoreResourceNotFound" value="true" />
</bean>
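The override behaviour comes from two things: locations are merged in order so that later files win on overlapping keys, and `ignoreResourceNotFound` lets a missing file be skipped. The Python sketch below only illustrates that merge logic. It is not Spring code; `parse_properties`, `load_layered`, `demo`, and the sample keys are hypothetical, and the parser handles only plain `key=value` lines rather than the full Java .properties format.

```python
import os
import tempfile


def parse_properties(text):
    """Parse simple key=value lines (a simplification of .properties)."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines without a separator.
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def load_layered(paths, ignore_missing=True):
    """Merge property files in order; later files win on overlapping keys."""
    merged = {}
    for path in paths:
        if not os.path.exists(path):
            if ignore_missing:
                continue  # mirrors ignoreResourceNotFound=true
            raise FileNotFoundError(path)
        with open(path) as fh:
            merged.update(parse_properties(fh.read()))
    return merged


def demo():
    # Hypothetical files standing in for the classpath and local locations.
    with tempfile.TemporaryDirectory() as root:
        base = os.path.join(root, "config.properties")
        local = os.path.join(root, "local.properties")
        with open(base, "w") as fh:
            fh.write("db.host=prod-db\ndb.port=5432\n")
        without_local = load_layered([base, local])
        with open(local, "w") as fh:
            fh.write("db.host=localhost\n")
        with_local = load_layered([base, local])
    return without_local, with_local
```

In the sketch, the optional file is listed last for the same reason as in the bean definition: only the keys it defines are replaced, and everything else falls through to the base file.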
x <- read.table("synthetic_control.data")
cat( "read", length(x[,1]), "records.\n")
# load clustering library
library(flexclust)
# get number of clusters from user
n <- as.integer( readline("Enter number of clusters: "))
# run kmeans clustering on the dataset