Philip (flip) Kromer mrflip

mrflip / Gemfile
Last active Aug 29, 2015
Spike of a Roomie Remote API
source ''
gem 'gorillib', "~> 0.6"
gem 'pry', "~> 0.10"
gem 'multi_json', ">= 1.1"
gem 'crack'
gem 'erubis'
mrflip /
Last active Aug 29, 2015
Roomie Remote Image Remotes borken

I am trying to make a custom image remote. Roomie crashes even on a trivial modification to the DDK example.

I've tried a whole lot of things, but here I'll describe the most minimal test case I can conceive.

First, I reset my environment back to zero:

  • Reset my configuration
  • Disabled Dropbox polling and Wifi sync
  • Removed ~/Dropbox/Roomie
View gist:94b460f1db531de7e165
sudo -u hdfs hadoop fs -mkdir -p \
/tmp /tmp/mapred/system \
/user/root /user/chimpy \
$HADOOP_LOG_DIR/yarn-apps \
$HADOOP_BULK_DIR/yarn-staging/history/done_intermediate \
/var/lib/hadoop-hdfs/cache/mapred/mapred/staging
sudo -u hdfs hadoop fs -chmod -R 1777 \
/tmp /tmp/mapred/system \
View gist:e3f3999af39bb9986475
14/11/19 10:08:12 INFO listeners.LipstickPPNL: --- Init TBPPNL ---
2014-11-19 10:08:12,865 [main] INFO - build version: 0.6-SNAPSHOT, ts: 2014-11-19T09:02Z, git: 6324e130b4df844930f33cc7fb08aed94d07e76e
2014-11-19 10:08:12,865 [main] INFO - Logging error messages to: /home/chimpy/book/code/pig_1416391692863.log
2014-11-19 10:08:13,245 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/chimpy/.pigbootup not found
2014-11-19 10:08:13,740 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - is deprecated. Instead, use fs.defaultFS
2014-11-19 10:08:13,741 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://nn:8020
2014-11-19 10:08:14,215 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: jt:8021
2014-11-19 10:08:14,273 [main] INFO org.apache.hadoop.conf.Configuration.deprecati
mrflip / 2014 TED w
Last active Dec 24, 2016
Notes from the 2014 TED conference

TED 2014 Friday

Friday mid-Morning: Onward (final session)

Andrew Solomon, author

  • Reports on experience of people in extreme circumstances
  • Avoidance and Endurance
  • Take traumas and make them part of who you'll be
  • Mother of a child due to rape: I think of him (rapist) with pity -- he has a beautiful daughter he doesn't know, and I do, and so I’m the lucky one
mrflip / tuning_storm_trident.asciidoc
Last active Jul 25, 2021
Notes on Storm+Trident tuning

Tuning Storm+Trident

Tuning a dataflow system is easy:

The First Rule of Dataflow Tuning:
* Ensure each stage is always ready to accept records, and
* Deliver each processed record promptly to its destination
mrflip /
Created Jun 29, 2013
Trident Kafka State
package com.infochimps.storm.trident;
import kafka.javaapi.producer.Producer;
import kafka.javaapi.producer.ProducerData;
import kafka.message.Message;
import kafka.serializer.Encoder;
import kafka.producer.ProducerConfig;
import backtype.storm.task.IMetricsContext;
import storm.trident.operation.TridentCollector;
mrflip /
Last active Jun 23, 2021
Elasticsearch Tuning Plan

Next Steps

  • Measure time spent on index, flush, refresh, merge, query, etc. (TD - done)
  • Take hot threads snapshots under read+write, read-only, write-only (TD - done)
  • Adjust refresh time to 10s (from 1s) and see how load changes (TD)
  • Measure time of a rolling restart doing disable_flush and disable_recovery (TD)
  • Specify routing on query -- make it choose same node for each shard each time (MD)
  • GC new generation size (TD)
  • Warmers
    • measure before/after of client query time with and without warmers (MD)
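
Two of the steps above (the 10s refresh interval and sending each query to the same shard copies) can be sketched as request payloads. This is a minimal sketch, not the plan's actual tooling: it assumes the refresh change is made through the `index.refresh_interval` index setting and that "same node for each shard each time" maps to Elasticsearch's `preference` search parameter. The index name `events` and the preference string `client-42` are hypothetical.

```python
import json
from urllib.parse import urlencode


def refresh_interval_body(seconds):
    """JSON body for PUT /<index>/_settings that sets index.refresh_interval."""
    return json.dumps({"index": {"refresh_interval": f"{seconds}s"}})


def search_path(index, preference):
    """Search path with a fixed custom `preference` string.

    Assumption: reusing the same string on every request is what the plan
    means by choosing the same node for each shard each time.
    """
    return f"/{index}/_search?" + urlencode({"preference": preference})


# Hypothetical index name and preference string, for illustration only.
print(refresh_interval_body(10))
print(search_path("events", "client-42"))
```

Nothing here talks to a cluster; the functions only build what would be sent, so the before/after load comparison still has to be measured separately.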

Performance Qualification

Identify all reasons why (e.g.) Elasticsearch cannot provide acceptable performance for standard requests under the qualifying load. The "qualifying load" for each performance bound is double (or so) the worst-case scenario for that bound across all our current clients.

  • Performance
    • bandwidth (rec/s, MB/s) and latency (s) for 100B, 1kB, 100kB records
    • under read, write, read/write
    • in degraded state: a) loss of one/two servers and recovering; b) elevated packet latency + drop rate between "regions"
    • High concurrency
  • keepalive
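
The "qualifying load" definition above is simple arithmetic and can be sketched directly. This is an illustration only: the client names and rec/s figures are made up, and the factor of 2 stands in for the "double (or so)" in the text.

```python
def qualifying_load(worst_case_by_client, factor=2.0):
    """Qualifying load for one performance bound.

    Takes the worst case observed for that bound across all current clients
    and multiplies it by `factor` (2.0 for "double (or so)").
    """
    return factor * max(worst_case_by_client.values())


# Hypothetical worst-case write rates in rec/s, one entry per client.
write_rate = {"client_a": 1200, "client_b": 4500, "client_c": 800}
print(qualifying_load(write_rate))  # 2 * 4500 = 9000.0
```

The same function would be applied once per bound (rec/s, MB/s, latency, and so on), each with its own per-client worst-case figures.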
mrflip /
Last active Dec 15, 2015
Notes for 2013 spec


"Big Five" == Elasticsearch, Storm, Kafka, HBase, wukong decorators

  • Faster chef convergence (custom packages; local physical cluster)
  • Centralized log archiving
  • Performance qualification of
  • Visibility and request manipulation
  • Metarepo (deb/rpm, gem, egg, maven)