Jones Wan joneswan

joneswan / Spark Dataframe Cheat Sheet.py
Created April 26, 2017 14:12 — forked from evenv/Spark Dataframe Cheat Sheet.py
Cheat sheet for Spark Dataframes (using Python)
# A simple cheat sheet of Spark Dataframe syntax
# Current for Spark 1.6.1
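# import statements
from pyspark import SparkContext
from pyspark.sql import SQLContext
from pyspark.sql.types import *
from pyspark.sql.functions import *
# sqlContext is predefined in the pyspark shell; create it explicitly when running as a script
sc = SparkContext.getOrCreate()
sqlContext = SQLContext(sc)
# creating dataframes
df = sqlContext.createDataFrame([(1, 4), (2, 5), (3, 6)], ["A", "B"]) # from manual data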
joneswan / Dockerfile
Created April 26, 2017 14:12 — forked from aliciawyy/Dockerfile
config files of a python project using jupyter notebook in docker (server)
## Edited on 21 June, Alice Wang
## To build the docker image
## $ cd .
## $ docker build -t fc .
FROM debian:8.3
# Run update and install in a single layer so a cached, stale package index is never reused
RUN export DEBIAN_FRONTEND=noninteractive \
    && apt-get update --fix-missing \
    && apt-get install -y python2.7 python2.7-dev pkg-config \
    && apt-get install -y libxml2-dev libxslt-dev python-lxml