Pierre-Marie Lévêque pmleveque
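# Use the current directory as the workspace root.
DSFLOW_WORKSPACE=$(pwd)
# Publish 8888 (Jupyter's default port) and 4040 (the Spark UI's default port),
# and mount the workspace folders; paths are quoted so directories with spaces work.
docker run -it \
  -p 8888:8888 \
  -p 4040:4040 \
  -v "$DSFLOW_WORKSPACE/datastore/:/home/jovyan/datastore" \
  -v "$DSFLOW_WORKSPACE/notebooks/:/home/jovyan/notebooks" \
  -v "$DSFLOW_WORKSPACE/notebook-templates/:/home/jovyan/notebook-templates" \
  -v "$DSFLOW_WORKSPACE/transformations/:/home/jovyan/transformations" \
  pmdsflow/cli:natgas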
SHOW TABLES;
SELECT category,
       count(*) AS user_count
FROM users
GROUP BY category;
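
The aggregate query above can be tried without a Spark session. The following is a minimal sketch that runs it against an in-memory SQLite database instead of the original engine; the `users` columns and the three sample rows are assumptions made up for illustration, and an `ORDER BY` is added only to make the result order deterministic.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema and rows; the source does not show the real table.
conn.execute("CREATE TABLE users (name TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("ann", "free"), ("bob", "paid"), ("cy", "free")],
)

# Same shape as the query above: one row per category with its user count.
rows = conn.execute(
    """
    SELECT category,
           count(*) AS user_count
    FROM users
    GROUP BY category
    ORDER BY category
    """
).fetchall()

print(rows)
```

With the sample rows above, each category appears once with the number of users that belong to it.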
CREATE TEMPORARY VIEW crunchbase_latest
AS
SELECT * FROM crunchbase

Apache Spark installation + IPython notebook integration guide for Mac OS X

Tested with Apache Spark 1.3.1, Python 2.7.9 and Java 1.8.0_45

Install Java Development Kit

Download and install it from oracle.com

#!/usr/bin/env python
"""Simple HTTP Server With Upload.
This module builds on BaseHTTPServer by implementing the standard GET
and HEAD requests in a fairly straightforward manner.
"""
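
The docstring above describes a handler that implements GET and HEAD on top of the base HTTP server class. As a rough illustration of that pattern only (not the original script, and without its upload feature), here is a minimal sketch using Python 3's `http.server`, the successor of the Python 2 `BaseHTTPServer` module. The names `SketchHandler`, `make_server`, and `BODY` are hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical fixed payload; a real server would read files from disk.
BODY = b"hello from the sketch\n"


class SketchHandler(BaseHTTPRequestHandler):
    """Answer GET and HEAD with identical headers; only GET sends a body."""

    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_HEAD(self):
        # HEAD returns the headers a GET would return, with no body.
        self._send_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)

    def log_message(self, format, *args):
        # Silence per-request logging to keep the sketch quiet.
        pass


def make_server(port=0):
    """Bind to localhost; port 0 asks the OS to pick a free port."""
    return HTTPServer(("127.0.0.1", port), SketchHandler)
```

To try it, call `make_server(8000).serve_forever()` and request `http://127.0.0.1:8000/` with a browser or `curl -I` for the HEAD case.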