@ivanlei
ivanlei / mrjob_install.sh
Created July 20, 2013 16:24
Install the python package mrjob from source
sudo apt-get update
sudo apt-get install -y git-core python-pip python-dev build-essential
sudo pip install --upgrade pip
sudo pip install --upgrade virtualenv
sudo pip install boto
sudo pip install simplejson
sudo git clone https://github.com/Yelp/mrjob.git
pushd mrjob
sudo python setup.py install
popd
ivanlei / mrjob_pooling_with_multiple_iam.sh
Created July 20, 2013 16:30
Example command for running an mrjob job with job pooling and EMR instances shared between IAM users in the same AWS account.
export AWS_ACCESS_KEY_ID=XXX
export AWS_SECRET_ACCESS_KEY=XXX
python ./mr_word_freq_count.py ./wordlist \
--pool-emr-job-flows \
--runner=emr \
--visible-to-all-users \
--num-ec2-instances=1 \
--aws-region=us-west-2 --emr-endpoint=us-west-2.elasticmapreduce.amazonaws.com
ivanlei / boto keyring setup
Last active January 2, 2016 15:29
This script takes an AWS credentials CSV file containing an AWS keypair and: * Uses the keyring package to store the Secret Access Key * Outputs the proper config lines for the .boto file to use the keyring
# -*- coding: utf-8 -*-
#
# This script takes an AWS credentials CSV file containing an AWS keypair and:
# * Uses the keyring package to store the Secret Access Key
# * Outputs the proper config lines for the .boto file to use the keyring
#
import sys
import keyring
import optparse
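
The preview above stops after the imports. As a rough illustration of the two steps the description names, here is a minimal Python 3 sketch, not the author's code. The function name `boto_keyring_config`, the `store` parameter, the default keyring name `aws`, and the CSV column names `Access Key Id` and `Secret Access Key` are all assumptions made for this example. It also assumes boto reads the secret with `keyring.get_password(<keyring name>, <access key id>)`, which is how the boto config documentation describes the `keyring` option as far as I know.

```python
import csv
import io


def boto_keyring_config(csv_text, keyring_name="aws", store=None):
    """Store the secret key and return the matching .boto config lines.

    `store` is a hypothetical hook taking (service, username, password);
    it defaults to keyring.set_password from the third-party keyring package.
    """
    if store is None:
        import keyring  # imported lazily so the sketch loads without it
        store = keyring.set_password

    # Read the first data row of the credentials CSV.
    row = next(csv.DictReader(io.StringIO(csv_text)))
    access_key = row["Access Key Id"]      # assumed column name
    secret_key = row["Secret Access Key"]  # assumed column name

    # Assumption: boto looks the secret up under (keyring_name, access_key).
    store(keyring_name, access_key, secret_key)

    # Only the access key id and the keyring name go into the config file.
    return (
        "[Credentials]\n"
        "aws_access_key_id = {}\n"
        "keyring = {}\n"
    ).format(access_key, keyring_name)
```

Passing a fake `store` makes it possible to try the function without touching a real keyring; the secret never appears in the returned config text.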
ivanlei / gist:f7131796abdae2e3b087
Last active September 16, 2015 19:01
keybase.md
### Keybase proof
I hereby claim:
* I am ivanlei on github.
* I am ivanlei (https://keybase.io/ivanlei) on keybase.
* I have a public key whose fingerprint is A809 2A4A FC1D 5E1D 346F 4A5C 0827 0A86 9915 D0B8
To claim this, I am signing this object:
ivanlei / gist:f81630084d2be0f22362
Last active August 2, 2016 07:17
S3->SNS->SQS->Logstash

S3 supports SNS notifications for new objects. These SNS notifications can fan out to SQS queues.

Logstash can read from inputs including S3 and SQS, but neither of these is sufficient:

  • The S3 input performs very poorly when reading from a bucket with a high number of writes. The S3 reader ends up spending most of its time listing the bucket contents rather than reading objects.
  • The SQS input works well, both with a single Logstash process and with a cluster of multiple processes. However, the SQS input doesn't understand the format of an SNS notification object for S3 changes.
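
To make the format problem concrete, here is a minimal Python sketch (not a Logstash plugin, and not taken from the gist) of unwrapping one such SQS message. It assumes raw message delivery is off on the SNS subscription, so the SQS body is the SNS JSON envelope and the S3 event sits as a JSON string in its `Message` field. The function name `s3_objects_from_sqs_body` and the sample bucket and key are made up for illustration.

```python
import json
from urllib.parse import unquote_plus


def s3_objects_from_sqs_body(body):
    """Return (bucket, key) pairs from an SQS body wrapping an SNS S3 event."""
    # Outer layer: the SNS notification envelope.
    envelope = json.loads(body)
    # Inner layer: the S3 event, serialized as a string inside "Message".
    event = json.loads(envelope["Message"])

    pairs = []
    # Messages without "Records" (such as S3's test event) yield nothing.
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys in S3 event notifications are URL-encoded.
        pairs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return pairs


# Hypothetical sample message, shaped like an SNS-wrapped S3 event.
sample = json.dumps({
    "Type": "Notification",
    "Message": json.dumps({
        "Records": [{
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "logs/2016/app+log.gz"},
            },
        }]
    }),
})
print(s3_objects_from_sqs_body(sample))
```

A reader built this way only fetches the objects named in the notifications, which avoids the bucket listing that slows down the S3 input.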
ivanlei / crash.log
Created March 5, 2017 14:49
terraform crash
2017/03/05 09:48:24 [INFO] Terraform version: 0.8.8 403a86dc557fae52f8e39676b11e1e4356b7d1a2
2017/03/05 09:48:24 [INFO] CLI args: []string{"/Volumes/dat/Users/ivanlei/bin/terraform", "plan"}
2017/03/05 09:48:24 [DEBUG] Detected home directory from env var: /Volumes/dat/Users/ivanlei
2017/03/05 09:48:24 [DEBUG] Detected home directory from env var: /Volumes/dat/Users/ivanlei
2017/03/05 09:48:24 [DEBUG] Attempting to open CLI config file: /Volumes/dat/Users/ivanlei/.terraformrc
2017/03/05 09:48:24 [DEBUG] File doesn't exist, but doesn't need to. Ignoring.
2017/03/05 09:48:24 [DEBUG] Detected home directory from env var: /Volumes/dat/Users/ivanlei
2017/03/05 09:48:24 [DEBUG] plugin: waiting for all plugin processes to complete...
panic: runtime error: index out of range
ivanlei / crash.log
Created March 5, 2017 14:59
terraform crash
2017/03/05 09:59:20 [INFO] Terraform version: 0.8.8 403a86dc557fae52f8e39676b11e1e4356b7d1a2
2017/03/05 09:59:20 [INFO] CLI args: []string{"/Volumes/dat/Users/ivanlei/bin/terraform", "plan"}
2017/03/05 09:59:20 [DEBUG] Detected home directory from env var: /Volumes/dat/Users/ivanlei
2017/03/05 09:59:20 [DEBUG] Detected home directory from env var: /Volumes/dat/Users/ivanlei
2017/03/05 09:59:20 [DEBUG] Attempting to open CLI config file: /Volumes/dat/Users/ivanlei/.terraformrc
2017/03/05 09:59:20 [DEBUG] File doesn't exist, but doesn't need to. Ignoring.
2017/03/05 09:59:20 [DEBUG] Detected home directory from env var: /Volumes/dat/Users/ivanlei
2017/03/05 09:59:20 [DEBUG] plugin: waiting for all plugin processes to complete...
panic: runtime error: index out of range
ivanlei / debug spew
Created March 5, 2017 15:02
terraform crash debug spew
$ export TF_LOG=TRACE
$ terraform plan
2017/03/05 10:01:09 [INFO] Terraform version: 0.8.8 403a86dc557fae52f8e39676b11e1e4356b7d1a2
2017/03/05 10:01:09 [INFO] CLI args: []string{"/Volumes/dat/Users/ivanlei/bin/terraform", "plan"}
2017/03/05 10:01:09 [DEBUG] Detected home directory from env var: /Volumes/dat/Users/ivanlei
2017/03/05 10:01:09 [DEBUG] Detected home directory from env var: /Volumes/dat/Users/ivanlei
2017/03/05 10:01:09 [DEBUG] Attempting to open CLI config file: /Volumes/dat/Users/ivanlei/.terraformrc
2017/03/05 10:01:09 [DEBUG] File doesn't exist, but doesn't need to. Ignoring.
2017/03/05 10:01:09 [DEBUG] Detected home directory from env var: /Volumes/dat/Users/ivanlei
2017/03/05 10:01:09 [DEBUG] plugin: waiting for all plugin processes to complete...
ivanlei / crash spew
Created March 5, 2017 15:11
terraform crash
```shell
$ terraform plan
2017/03/05 10:11:05 [INFO] Terraform version: 0.8.8 403a86dc557fae52f8e39676b11e1e4356b7d1a2
2017/03/05 10:11:05 [INFO] CLI args: []string{"/Volumes/dat/Users/ivanlei/bin/terraform", "plan"}
2017/03/05 10:11:05 [DEBUG] Detected home directory from env var: /Volumes/dat/Users/ivanlei
2017/03/05 10:11:05 [DEBUG] Detected home directory from env var: /Volumes/dat/Users/ivanlei
2017/03/05 10:11:05 [DEBUG] Attempting to open CLI config file: /Volumes/dat/Users/ivanlei/.terraformrc
2017/03/05 10:11:05 [DEBUG] File doesn't exist, but doesn't need to. Ignoring.
2017/03/05 10:11:05 [DEBUG] Detected home directory from env var: /Volumes/dat/Users/ivanlei
2017/03/05 10:11:05 [DEBUG] plugin: waiting for all plugin processes to complete...
$ terraform plan
2017/03/05 10:12:56 [INFO] Terraform version: 0.8.8 403a86dc557fae52f8e39676b11e1e4356b7d1a2
2017/03/05 10:12:56 [INFO] CLI args: []string{"/Volumes/dat/Users/ivanlei/bin/terraform", "plan"}
2017/03/05 10:12:56 [DEBUG] Detected home directory from env var: /Volumes/dat/Users/ivanlei
2017/03/05 10:12:56 [DEBUG] Detected home directory from env var: /Volumes/dat/Users/ivanlei
2017/03/05 10:12:56 [DEBUG] Attempting to open CLI config file: /Volumes/dat/Users/ivanlei/.terraformrc
2017/03/05 10:12:56 [DEBUG] File doesn't exist, but doesn't need to. Ignoring.
2017/03/05 10:12:56 [DEBUG] Detected home directory from env var: /Volumes/dat/Users/ivanlei
2017/03/05 10:12:56 [DEBUG] plugin: waiting for all plugin processes to complete...
panic: runtime error: index out of range