
Creating subclusters and node groups within YARN queues using node labels.

  1. Create directories in HDFS for node labels
hadoop fs -mkdir -p /yarn/node-labels
hadoop fs -chown -R yarn:yarn /yarn
hadoop fs -chmod -R 700 /yarn
hadoop fs -mkdir -p /user/yarn
hadoop fs -chown -R yarn:yarn /user/yarn
hadoop fs -chmod -R 700 /user/yarn
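The HDFS directory created above is then referenced from `yarn-site.xml` when enabling node labels. A minimal sketch (property names from the Hadoop node-labels documentation; the path matches the commands above):

```xml
<!-- yarn-site.xml: enable node labels and point the label store
     at the HDFS directory created above -->
<property>
  <name>yarn.node-labels.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.node-labels.fs-store.root-dir</name>
  <value>hdfs:///yarn/node-labels</value>
</property>
```

Labels can then be registered with `yarn rmadmin -addToClusterNodeLabels "gpu"` and mapped to hosts with `yarn rmadmin -replaceLabelsOnNode "host1=gpu"` (label and host names here are placeholders).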
# YouTube (english) : https://www.youtube.com/watch?v=FtU2_bBfSgM
# YouTube (french) : https://www.youtube.com/watch?v=VjnaVBnERDU
#
# On your laptop, connect to the Mac instance with SSH (similar to Linux instances)
#
ssh -i <your private key.pem> ec2-user@<your public ip address>
#
# On the Mac
@libin
libin / Gemfile
Created July 3, 2020 18:00 — forked from dhh/Gemfile
HEY's Gemfile
ruby '2.7.1'
gem 'rails', github: 'rails/rails'
gem 'tzinfo-data', '>= 1.2016.7' # Don't rely on OSX/Linux timezone data
# Action Text
gem 'actiontext', github: 'basecamp/actiontext', ref: 'okra'
gem 'okra', github: 'basecamp/okra'
# Drivers
@libin
libin / web-servers.md
Created March 14, 2020 04:11 — forked from willurd/web-servers.md
Big list of http static server one-liners

Each of these commands will run an ad hoc http static server in your current (or specified) directory, available at http://localhost:8000. Use this power wisely.

Discussion on reddit.

Python 2.x

$ python -m SimpleHTTPServer 8000
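In Python 3 the module was renamed to `http.server`, so the equivalent one-liner (same port, serving the current directory) is:

```shell
python3 -m http.server 8000
```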
In a Hadoop cluster, if you would like to get a count of lines in some files, one easy way is to do the following:
hadoop fs -cat inputdir/* | wc -l
However, this streams the content from all machines to the single machine that performs the counting.
It would be nice if "hadoop fs" had a subcommand for this, say "hadoop fs -wc -l", but it does not.
An alternative is to use Hadoop streaming to parallelize the line counting and a single reducer to sum the per-node results. Something like the following:
hadoop jar ${HADOOP_HOME}/hadoop-streaming.jar \
-Dmapred.reduce.tasks=1 \
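The snippet above is cut off; a complete invocation might look like the following sketch (`inputdir`/`outputdir` are placeholders; each mapper emits its split's line count and the single reducer sums them):

```shell
# Sketch: count lines in parallel with Hadoop streaming.
# Mappers run `wc -l` over their split; the lone reducer sums the counts.
hadoop jar ${HADOOP_HOME}/hadoop-streaming.jar \
  -Dmapred.reduce.tasks=1 \
  -input inputdir \
  -output outputdir \
  -mapper 'wc -l' \
  -reducer "awk '{s += \$1} END {print s}'"
```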
@libin
libin / gist:23312de56c3bdf1b0b94a0c122673537
Created August 22, 2016 22:33 — forked from kwk/gist:1167959
How to install NodeJS and NPM on a host without internet access and without compile tools
# On build host (has internet access): Download and install NodeJS and NPM
wget http://nodejs.org/dist/node-v0.4.10.tar.gz
tar xvzf node-v0.4.10.tar.gz
cd node-v0.4.10
./configure
make
sudo make install
wget http://npmjs.org/install.sh
sudo sh ./install.sh
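The gist stops at the build host; to get the artifacts onto the machine without internet access, one approach (hypothetical paths, and `target-host` is a placeholder) is to bundle the installed files and copy them over:

```shell
# On the build host: bundle the node/npm files installed under /usr/local
tar czf node-dist.tar.gz -C /usr/local bin/node bin/npm lib/node lib/node_modules

# Copy to the offline host (scp here, but any transfer channel works)
scp node-dist.tar.gz user@target-host:

# On the offline host: unpack into /usr/local
sudo tar xzf node-dist.tar.gz -C /usr/local
```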
@libin
libin / benchmark-commands.txt
Created July 7, 2016 04:03 — forked from jkreps/benchmark-commands.txt
Kafka Benchmark Commands
Producer
Setup
bin/kafka-topics.sh --zookeeper esv4-hcl197.grid.linkedin.com:2181 --create --topic test-rep-one --partitions 6 --replication-factor 1
bin/kafka-topics.sh --zookeeper esv4-hcl197.grid.linkedin.com:2181 --create --topic test --partitions 6 --replication-factor 3
Single thread, no replication
bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 50000000 100 -1 acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196
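The `ProducerPerformance` class path above is from an old Kafka release; on current distributions the same benchmark is usually run through the packaged wrapper script (flag names per recent Kafka releases; the broker address is a placeholder):

```shell
bin/kafka-producer-perf-test.sh --topic test \
  --num-records 50000000 --record-size 100 --throughput -1 \
  --producer-props bootstrap.servers=localhost:9092 acks=1 batch.size=8192
```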
@libin
libin / libevent-2.0.22-stable.sh
Last active May 17, 2016 20:34 — forked from solar/libevent-2.0.20-stable.sh
Install libevent and tmux on CentOS/RH 6.3
#!/bin/sh
curl -sL 'https://github.com/downloads/libevent/libevent/libevent-2.0.22-stable.tar.gz' | tar zx
cd libevent-2.0.22-stable/
./configure --prefix=/usr/local/libevent/2.0.22-stable
make
sudo make install
sudo alternatives --install /usr/local/lib64/libevent libevent /usr/local/libevent/2.0.22-stable/lib 20018 \
--slave /usr/local/include/libevent libevent-include /usr/local/libevent/2.0.22-stable/include \
--slave /usr/local/bin/event_rpcgen.py event_rpcgen /usr/local/libevent/2.0.22-stable/bin/event_rpcgen.py
@libin
libin / gist:1c480ba06b5ae03c23baed22bac28854
Created May 5, 2016 22:15 — forked from unnitallman/gist:944011
sqlite with activerecord outside rails
require 'active_record'
ActiveRecord::Base.logger = Logger.new(STDERR)
# colorize_logging moved to ActiveSupport::LogSubscriber in Rails 3+
ActiveSupport::LogSubscriber.colorize_logging = false
ActiveRecord::Base.establish_connection(
  :adapter => "sqlite3",
  :database => ":memory:"  # :dbfile was renamed to :database
)