Skip to content

Instantly share code, notes, and snippets.

View JoshRosen's full-sized avatar

Josh Rosen JoshRosen

View GitHub Profile
@subsetpark
subsetpark / gist:367f0d3fde503a1e481c
Created June 16, 2015 15:48
Building Python 2.7.10 on Ubuntu 14.04 LTS
$ sudo apt-get install -y gcc-multilib g++-multilib libffi-dev libffi6 libffi6-dbg python-crypto python-mox3 python-pil python-ply libssl-dev zlib1g-dev libbz2-dev libexpat1-dev libbluetooth-dev libgdbm-dev dpkg-dev quilt autotools-dev libreadline-dev libtinfo-dev libncursesw5-dev tk-dev blt-dev libssl-dev zlib1g-dev libbz2-dev libexpat1-dev libbluetooth-dev libsqlite3-dev libgpm2 mime-support netbase net-tools bzip2
$ wget https://www.python.org/ftp/python/2.7.10/Python-2.7.10.tgz
$ tar xvf Python-2.7.10.tgz
$ cd Python-2.7.10/
$ ./configure --prefix /usr/local/lib/python2.7.10 --enable-ipv6
$ make
$ sudo make install
@djui
djui / compact-vmdk.sh
Created March 12, 2015 13:47
Compacts VMDK images by converting it temporarily to a VDI image, compacting it, and then converting it back.
#!/bin/sh -e
if [ "$#" -ne 2 ] ; then
echo "Usage: compact-vmdk <VM_NAME> <IMAGE_PATH>"
echo
echo "Compacts VMDK images by converting it temporarily to a VDI image, compacting it,"
echo "and then converting it back."
echo
echo "Example:"
echo
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@JoshRosen
JoshRosen / README.md
Last active August 29, 2015 14:08
Scala 2.11.1 SerialVersionUID behavior change

Confusing Scala serialization puzzle

This writeup describes a tricky Scala serialization issue that we ran into while porting Spark to Scala 2.11.

First, let's create a factory that builds anonymous functions:

class FunctionFactory extends Serializable {
  // This method returns a function whose $outer scope is this class.
  // Note that the function does not reference any variables in this scope,
@kfish
kfish / apply-patch.sh
Created November 12, 2013 03:59
Apply a patch file that was produced with "git format-patch" using the patch command, and commit it using the message from the original commit.
#!/bin/bash
apply () {
filename=$1
shift
patch_args=$*
gotSubject=no
msg=""
@wpm
wpm / spark_parallel_boost.py
Last active December 3, 2018 02:56
A simple example of how to integrate the Spark parallel computing framework and the scikit-learn machine learning toolkit. This script randomly generates test and train data sets, trains an ensemble of decision trees using boosting, and applies the ensemble to the test set. The ensemble training is done in parallel.
from pyspark import SparkContext
import numpy as np
from sklearn.cross_validation import train_test_split, Bootstrap
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier
def run(sc):
@fperez
fperez / 00-Setup-IPython-PySpark.ipynb
Last active December 21, 2015 23:48
HowTo for starting an IPython Notebook server
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@jasonbartz
jasonbartz / uwsgi_on_pypy.md
Last active August 16, 2017 11:02
uwsgi and pypy

Deploying pypy on uwsgi

Instructions are kind of sporadic around the internet, so I thought I would gather them all in one place. The following example uses a Flask app as the thing deployed. You should only have to this once, and then you can pass the bin around.

Requirements

  • pypy 2.x (tested)
  • uwsgi 1.9.11+ (trunk as of this gist)

Build translated pypy binary and pypy lib.

Orien is correct, it is the fork() system call triggered by ProcessBuilder or Runtime.exec or other means of the JVM executing an external process (e.g. another JVM running ant, a git command, etc.).

There have been some posts on the Jenkins mailing lists about this: Cannot run program "git" ... error=12, Cannot allocate memory

There is a nice description of the issue on the SCons dev list: fork()+exec() vs posix_spawn()

There is a long standing JVM bug report with solutions: Use posix_spawn, not fork, on S10 to avoid swap exhaustion. But I'm not sure if this actually made it into JDK7 as the comments suggest was the plan.

In summary, on Unix-like systems, when one process (e.g. the JVM) needs to launch another process (e.g. git) a system call is made to

@garlandkr
garlandkr / redis_es_ls.md
Created September 20, 2012 01:28
Installing Redis Elasticsearch and Logstash

This will be a copy/paste doc for installing redis, elasticsearch and logstash on ubuntu 12.04

Pre-Requisites

apt-get update
apt-get upgrade
apt-get install tcl8.5 tcl8.5-dev build-essential rubygems git \
htop python-dev openjdk-7-jre-headless libcurl4-openssl-dev \
bison ctags flex gperf libevent-dev libpcre3-dev libssl-dev libreadline6-dev \
libtokyocabinet-dev libncursesw5-dev libxml2-dev libxslt1-dev libsqlite3-dev \