Configuring Spark 1.6.1 to work with Jupyter 4.x Notebooks on Mac OS X with Homebrew

I've looked around in a number of places and found several blog entries on setting up IPython notebooks to work with Spark. However, since most of those posts were written, both IPython and Spark have been updated. Today, IPython has become Jupyter, and Spark is nearing release 1.6.2. Most of the information needed to get things working is out there, but I thought I'd capture this point in time with a working configuration and how I set it up.

I rely completely on Homebrew to manage packages on my Mac, so Spark, Jupyter, Python, jEnv and other things are installed via Homebrew. You should be able to achieve the same thing with Anaconda, but I don't know that package manager.

Install Java

Make sure your Java installation is up to date. I use jEnv to manage Java installations on my Mac, which adds another layer that has to be set up correctly. You can download/update Java from Oracle, have Homebrew

@rocket-ron
rocket-ron / gist:bf7d7ce3e5b8b7cd9197
Created August 3, 2015 05:21
Resize (increase) Root Partition on AWS EBS Volume Linux LVM
This is a resize of the actual EBS volume as opposed to adding additional disks using LVM
1. Follow the steps here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/storage_expand_partition.html
2. Use the instructions for gdisk further down the page to set the new partition table, not gparted or fdisk.
3. Reboot the instance once the partition table is written.
4. On the instance, execute:
sudo pvresize /dev/xvda2 (or whatever the device name is)
sudo pvdisplay
rocket-ron / AWS CLI command combinations
Created August 2, 2015 00:51
Extracting total size of AWS S3 bucket space
# Use the AWS CLI S3 tools to calculate the amount of space used in a bucket from a listing of the bucket contents
# The first awk extracts the 3rd column, which is the size of each key's contents in bytes
# The second awk sums the values and outputs the total number of bytes
aws s3 ls s3://w205-project-twitter-streams/isis | awk '{ print $3 }' | awk '{s+=$1} END {print s}'
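The same sum can be sketched in Python, which makes it easier to skip rows that are not objects. This is a minimal sketch, not part of the original gist: it assumes the default `aws s3 ls` layout (date, time, size in bytes, key), and the function name `summarize_listing` and the sample listing are hypothetical, made up for illustration.

```python
from typing import Tuple


def summarize_listing(listing: str) -> Tuple[int, int]:
    """Return (key_count, total_bytes) for text in `aws s3 ls` layout."""
    count = 0
    total = 0
    for line in listing.splitlines():
        # Split into at most 4 fields so keys containing spaces stay intact.
        fields = line.split(None, 3)
        # Object rows have date, time, size, key. Prefix rows ("PRE name/")
        # have fewer fields and are skipped.
        if len(fields) == 4 and fields[2].isdigit():
            count += 1
            total += int(fields[2])
    return count, total


# Hypothetical sample listing, invented for this example.
SAMPLE = """\
                           PRE archive/
2015-08-01 12:00:00       1024 isis-0001.json
2015-08-01 12:05:00       2048 isis-0002.json
2015-08-01 12:10:00          0 isis-empty.json
"""

if __name__ == "__main__":
    keys, size = summarize_listing(SAMPLE)
    print(keys, "keys,", size, "bytes")
```

To use it on a real bucket, the listing text could be piped in from the CLI and read from standard input instead of `SAMPLE`.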
# This command uses the AWS CLI S3 ls command to count the number of keys in a bucket
aws s3 ls s3://w205-project-twitter-streams/isis | tee >(wc -l)
# or this
rocket-ron / Clock.py
Created February 24, 2015 05:59
Clock.py
# Clock
#
# An analog clock that demonstrates drawing basic
# shapes with the scene module.
from scene import *
from time import localtime
ipad = False #will be set in the setup method
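The rest of Clock.py is not shown in this preview. As a rough illustration of the arithmetic any analog clock needs, the sketch below converts a time into hand angles. It is an assumption about how such a clock could work, not the author's code: the function name `hand_angles` is hypothetical, angles are measured in degrees clockwise from 12 o'clock, and nothing here depends on the `scene` module.

```python
from time import localtime


def hand_angles(hour, minute, second):
    """Return (hour, minute, second) hand angles in degrees, clockwise from 12."""
    second_angle = second * 6.0                       # 360 degrees / 60 seconds
    minute_angle = (minute + second / 60.0) * 6.0     # 360 degrees / 60 minutes
    hour_angle = (hour % 12 + minute / 60.0) * 30.0   # 360 degrees / 12 hours
    return hour_angle, minute_angle, second_angle


if __name__ == "__main__":
    t = localtime()
    print(hand_angles(t.tm_hour, t.tm_min, t.tm_sec))
```

A drawing routine would typically convert these angles to radians and rotate each hand about the center of the clock face.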