@GEOFBOT
GEOFBOT / flink-debug.fish
Created June 6, 2016 15:28
A little fish shell function to run an Apache Flink job that waits for a JVM debugger to connect to localhost:5005
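The body of the fish function is not shown in this listing, so the following is only a rough Python sketch of the same idea rather than the author's code. It builds the standard JDWP agent option with `suspend=y`, so the JVM waits for a debugger on port 5005, and passes it to the `flink` launcher. The helper names are hypothetical, and it assumes that `flink` is on `PATH` and that the launcher script forwards the `JVM_ARGS` environment variable to the client JVM.

```python
import os
import subprocess


def jdwp_agent_option(port=5005, suspend=True):
    """Build the JVM option that opens a JDWP debug socket on `port`."""
    return "-agentlib:jdwp=transport=dt_socket,server=y,suspend={},address={}".format(
        "y" if suspend else "n", port
    )


def debug_environment(base_env=None, port=5005):
    """Return a copy of the environment with the debug option appended to JVM_ARGS."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: the `flink` launcher script passes JVM_ARGS to the client JVM.
    env["JVM_ARGS"] = (env.get("JVM_ARGS", "") + " " + jdwp_agent_option(port)).strip()
    return env


def run_flink_job_with_debugger(flink_args, port=5005):
    """Run `flink run <flink_args>`; with suspend=y the JVM blocks until a debugger attaches."""
    # Assumption: `flink` is on PATH.
    return subprocess.call(
        ["flink", "run"] + list(flink_args), env=debug_environment(port=port)
    )
```

Calling `run_flink_job_with_debugger(["job.jar"])` (with a placeholder jar name) would start the job and pause until an IDE attaches to port 5005. Under these assumptions only the client JVM is debugged, not the TaskManagers.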
GEOFBOT / shrug.py
Last active June 9, 2016 22:49
HexChat script for ¯\_(ツ)_/¯
# portions based on slap.py https://github.com/hexchat/hexchat-addons/blob/master/python/slap/slap.py
from __future__ import print_function
import hexchat
__module_name__ = "Shrug"
__module_version__ = "1.1"
__module_description__ = "Shrugs with /SHRUG [target]"
__author__ = 'Geoffrey Mon @GEOFBOT'
GEOFBOT / .vimrc
Last active July 5, 2016 15:15
.vimrc
set nocompatible " be iMproved, required
filetype off " required
set shell=/bin/bash " use bash for Vim's shell commands (workaround for fish shell)
" set the runtime path to include Vundle and initialize
set rtp+=~/.vim/bundle/Vundle.vim
call vundle#begin()
set number
GEOFBOT / Setting up a Flink cluster.md
Last active July 17, 2016 20:06
Guide to setting up a BlueData CentOS 6.7 / AWS Ubuntu 14.04 cluster for running Flink jobs
package org.apache.flink;
import org.apache.commons.io.Charsets;
import org.apache.commons.io.FileUtils;
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple;
import org.apache.flink.api.java.tuple.Tuple1;
# modified from http://www.willmcginnis.com/2015/11/08/getting-started-with-python-and-apache-flink/
from flink.plan.Environment import get_environment
from flink.plan.Constants import INT, STRING, WriteMode
from flink.functions.GroupReduceFunction import GroupReduceFunction
class Adder(GroupReduceFunction):
    def reduce(self, iterator, collector):
        count, word = iterator.next()
        count += sum([x[0] for x in iterator])
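The preview above is cut off before the reducer emits anything, so here is a small stand-alone sketch of the same summing logic that runs without Flink. `ListCollector` and `add_counts` are hypothetical stand-ins, and the final `collect` call is an assumption about how such a reducer would typically finish, not a line taken from the gist.

```python
class ListCollector:
    """Hypothetical stand-in for Flink's collector: stores emitted values in a list."""

    def __init__(self):
        self.items = []

    def collect(self, value):
        self.items.append(value)


def add_counts(iterator, collector):
    """Sum the counts of one group of (count, word) pairs that share the same word."""
    # Take the first pair to get the starting count and the group's word.
    count, word = next(iterator)
    # Add the counts of the remaining pairs in the group.
    count += sum(x[0] for x in iterator)
    # Assumption: emit a single (total, word) pair for the group.
    collector.collect((count, word))


collector = ListCollector()
add_counts(iter([(1, "flink"), (1, "flink"), (1, "flink")]), collector)
print(collector.items)
```

The first pair is consumed separately so that the word is known before the rest of the group is summed. The group must therefore contain at least one element.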
GEOFBOT / HDFS on AWS.md
Last active August 7, 2016 16:17
Setting up HDFS on AWS

On each node:

Set up packages and install Hadoop:

#!/bin/bash

sudo yum install java-1.8.0-openjdk-devel wget git bzip2 -y
echo 'export JAVA_HOME=/usr/lib/jvm/java' >> ~/.bashrc
source ~/.bashrc
GEOFBOT / flink.service
Created July 19, 2016 15:11
systemd file for flink
GEOFBOT / barebones.py
Last active October 18, 2016 02:05
Flink file that causes issues with a modified version of Flink featuring bulk iterations in the Python API
# Barebones test file to check for issues
import math
from flink.functions.Aggregation import Sum
from flink.functions.GroupReduceFunction import GroupReduceFunction
from flink.plan.Environment import get_environment
class NormalizeVectorGroupReducer(GroupReduceFunction):
    """
GEOFBOT / EMR_flinkerations.sh
Last active January 6, 2017 22:24
Bootstrap script to update Flink on Amazon EMR to use my build with Python bulk iterations
#!/bin/bash
# Runs after installation of included Flink
set -e
cd ~
sudo rm /usr/lib/flink/lib/flink-* # make sure we don't have two versions of jars
sudo tar -xzf flinkerations-emr.tgz -C /usr/lib/
rm flinkerations-emr.tgz
# Copy over EMRFS jars to Flink lib path
sudo cp /usr/share/aws/emr/s3-dist-cp/lib/*.jar /usr/lib/flink/lib/