@yuanzhaoYZ
yuanzhaoYZ / Burrow config
Last active July 30, 2016 20:40
Working conf for Burrow (Kafka consumer lag monitoring)
[zookeeper]
hostname=slave1.example.com
hostname=slave2.example.com
hostname=slave3.example.com
port=2181
timeout=6
lock-path=/burrow/notifier
[kafka "XX-prod"]
broker=slave1.example.com
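The `[zookeeper]` section above repeats the `hostname` key once per node, which Python's `configparser` rejects as a duplicate option in its default strict mode. A minimal sketch of reading such a file and building a ZooKeeper connection string follows. It is not part of Burrow: `parse_multi_ini` and `zk_connect_string` are hypothetical helper names, and the sketch assumes every hostname shares the single `port` value.

```python
from collections import defaultdict

# Copy of the [zookeeper] section shown above.
SAMPLE = """\
[zookeeper]
hostname=slave1.example.com
hostname=slave2.example.com
hostname=slave3.example.com
port=2181
timeout=6
lock-path=/burrow/notifier
"""


def parse_multi_ini(text):
    """Parse ini-like text, keeping every value of a repeated key in a list."""
    sections = defaultdict(lambda: defaultdict(list))
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blanks and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]  # e.g. zookeeper, or kafka "XX-prod"
            continue
        if current is None or "=" not in line:
            continue  # ignore lines outside a section or without key=value
        key, value = line.split("=", 1)
        sections[current][key.strip()].append(value.strip())
    return sections


def zk_connect_string(sections):
    """Join each hostname with the shared port (assumption: one port for all)."""
    zk = sections["zookeeper"]
    port = zk["port"][-1]
    return ",".join("{}:{}".format(host, port) for host in zk["hostname"])


conn = zk_connect_string(parse_multi_ini(SAMPLE))
print(conn)
```

The resulting string is in the `host:port,host:port` form that most ZooKeeper clients accept.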
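import org.elasticsearch.spark._   // adds esRDD to SparkContext
import org.apache.spark.sql._      // SQLContext, SaveMode
//val sqlContext = new SQLContext(sc)
// Connection settings for the elasticsearch-hadoop connector
val options = Map("pushdown" -> "true", "es.nodes" -> "host_ip_here", "es.port" -> "9200",
  "es.nodes.wan.only" -> "true")
// Read the index as a DataFrame and dump it as JSON, overwriting any existing output
sqlContext.read.format("es").options(options).load("index_name").write.mode(SaveMode.Overwrite).json("path_to_output")
// Alternatively, read the same index as an RDD
sc.esRDD("index_name", options)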
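For PySpark users, the same export can be sketched roughly as below. This is an assumption-laden translation, not the author's code: `es_read_options` and `dump_index_to_json` are hypothetical names, `spark` stands for an existing `SparkSession` or `SQLContext`, and the elasticsearch-hadoop jar is assumed to be on the Spark classpath.

```python
def es_read_options(host, port=9200):
    """Build the connector options used in the Scala snippet above."""
    return {
        "pushdown": "true",
        "es.nodes": host,
        "es.port": str(port),
        "es.nodes.wan.only": "true",
    }


def dump_index_to_json(spark, index, out_path, host):
    """Read an Elasticsearch index as a DataFrame and write it out as JSON.

    `spark` is assumed to be a SparkSession or SQLContext created elsewhere.
    """
    df = (
        spark.read.format("org.elasticsearch.spark.sql")
        .options(**es_read_options(host))
        .load(index)
    )
    df.write.mode("overwrite").json(out_path)


opts = es_read_options("host_ip_here")
print(opts)
```

Inside a `pyspark` shell, a call such as `dump_index_to_json(spark, "index_name", "path_to_output", "host_ip_here")` would mirror the Scala one-liner.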
yuanzhaoYZ / wordnet_ES_install.sh
Created October 7, 2016 03:39
install wordnet synonym on Elastic
sudo su
mkdir -p /etc/elasticsearch/analysis
cd /etc/elasticsearch/analysis
wget http://wordnetcode.princeton.edu/3.0/WNprolog-3.0.tar.gz
tar xvzf WNprolog-3.0.tar.gz
mv prolog/wn_s.pl .
rm -rf prolog
rm -f WNprolog-3.0.tar.gz
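To see what the extracted `wn_s.pl` file contains, here is a minimal Python sketch that groups words by synset. It relies on the WordNet prolog layout `s(synset_id,w_num,'word',ss_type,sense_number,tag_count).`, with embedded apostrophes doubled. The sample lines are made up for illustration, and `group_synonyms` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# One fact per line: s(synset_id,w_num,'word',ss_type,sense_number,tag_count).
S_LINE = re.compile(r"^s\((\d+),(\d+),'(.*)',([a-z]),(\d+),(\d+)\)\.$")

# Illustrative lines only, not copied from the real file.
SAMPLE_LINES = [
    "s(100000001,1,'car',n,1,0).",
    "s(100000001,2,'automobile',n,1,0).",
    "s(100000002,1,'o''clock',r,1,0).",
]


def group_synonyms(lines):
    """Map each synset id to the list of words that share it."""
    groups = defaultdict(list)
    for line in lines:
        match = S_LINE.match(line.strip())
        if match is None:
            continue  # skip anything that is not an s(...) fact
        synset_id = match.group(1)
        word = match.group(3).replace("''", "'")  # undo prolog quote doubling
        groups[synset_id].append(word)
    return dict(groups)


groups = group_synonyms(SAMPLE_LINES)
print(groups)
```

To inspect the installed file, pass an open file handle instead of the sample list, for example `group_synonyms(open("/etc/elasticsearch/analysis/wn_s.pl"))`.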
yuanzhaoYZ / ES_MissingFilterWithNestedObjects
Created October 25, 2016 16:25 — forked from Erni/ES_MissingFilterWithNestedObjects
Elasticsearch missing filter with nested objects
yuanzhaoYZ / jupyter_notebook+spark 2.1(Ubuntu).md
Last active March 30, 2017 17:46 — forked from tommycarpi/ipython_notebook+spark.md
Link Apache Spark with IPython Notebook

How to link Apache Spark 2.1.0 with IPython notebook (Ubuntu)

Tested with

Python 2.7, Ubuntu 16.04 LTS, Apache Spark 2.1.0 & Hadoop 2.7

Download Apache Spark & Build it

Download Apache Spark and build it or download the pre-built version.

How to link Apache Spark 2.1.0 with IPython notebook (Mac OS X)

Tested with

Python 2.7, OS X 10.11.3 El Capitan, Apache Spark 2.1.0 & Hadoop 2.7

Download Apache Spark & Build it

Download Apache Spark and build it or download the pre-built version.

hadoop distcp -Dmapreduce.map.memory.mb=4096 -Dfs.s3a.awsAccessKeyId=XXX -Dfs.s3a.awsSecretAccessKey=XXXX -m 250 hdfs:///data/* s3a://api-v3-data-sources/output/
yuanzhaoYZ / OSRM_NA_12.04 -> 15.04.md
Last active April 16, 2017 16:17
OSRM with North America map installation and setup on Ubuntu 12.04 ~ 15.04

Install

sudo su
apt-get update -y
apt-get install -y software-properties-common python-software-properties || true
add-apt-repository -y ppa:ubuntu-toolchain-r/test
apt-get update -y
apt-get install -y zlib1g-dev curl libstdc++-5-dev make binutils libc-dev libgcc-5-dev git
cd /opt
mkdir /opt/osrm

anaconda2

  1. Download and install Anaconda from https://www.continuum.io/downloads, then restart Terminal. If you would rather not install the full Anaconda distribution, check out this post.
wget https://repo.continuum.io/archive/Anaconda2-4.3.1-MacOSX-x86_64.sh
bash Anaconda2-4.3.1-MacOSX-x86_64.sh
  2. In terminal, type
/Users/zeta/anaconda/bin/pip install matlab_kernel
pip install -t dependencies -r requirements.txt
cd dependencies
zip -r ../dependencies.zip .
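The `cd dependencies` / `zip -r ../dependencies.zip .` pair matters because it stores paths relative to the `dependencies` directory, so packages sit at the archive root. A minimal Python sketch of the same idea follows. `zip_dir_contents` is a hypothetical helper, the demo builds a throwaway directory rather than a real `pip install -t` output, and unlike `zip -r` it stores file entries only, not directory entries.

```python
import os
import tempfile
import zipfile


def zip_dir_contents(src_dir, zip_path):
    """Zip every file under src_dir, using paths relative to src_dir."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                # arcname relative to src_dir keeps packages at the archive root
                zf.write(full, os.path.relpath(full, src_dir))


# Demo with a stand-in for the pip-populated directory.
workdir = tempfile.mkdtemp()
deps = os.path.join(workdir, "dependencies")
os.makedirs(os.path.join(deps, "pkg"))
with open(os.path.join(deps, "pkg", "__init__.py"), "w") as handle:
    handle.write("")

archive = os.path.join(workdir, "dependencies.zip")
zip_dir_contents(deps, archive)

with zipfile.ZipFile(archive) as zf:
    names = zf.namelist()
print(names)
```

Writing the archive outside `src_dir`, as the shell version does with `../dependencies.zip`, avoids zipping the archive into itself.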