John Benninghoff jbenninghoff

  • Ventura, CA, United States
@jbenninghoff
jbenninghoff / 0_reuse_code.js
Created July 31, 2014 06:49
Here are some things you can do with Gists in GistBox.
// Use Gists to store code you would like to remember later on
console.log(window); // log the "window" object to the console
jbenninghoff / gist:13982fe5468c591c43df
Created January 22, 2015 20:51
Another wordcount in pig
hduser@master:~$ cat wordcount.pig
A = load '/user/jbenninghoff/somefile.txt';
B = foreach A generate flatten(TOKENIZE((chararray)$0)) as word;
C = filter B by word matches '\\w+';
D = group C by word;
E = foreach D generate COUNT(C), group;
store E into '/user/jbenninghoff/somefileWordcount';
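For a quick local sanity check of what the Pig script computes, the same count can be sketched with standard shell tools. This is an approximation, not the author's method: the `wordcount` function name is hypothetical, the delimiter set (space, double quote, comma, parentheses, asterisk) is assumed to mirror the TOKENIZE defaults, and `[A-Za-z0-9_]+` stands in for the `\\w+` filter.

```bash
#!/bin/bash
# Hypothetical helper: approximates wordcount.pig on stdin.
wordcount() {
  awk -F '[ ",()*]+' '
    # keep only fields made up entirely of word characters (the \w+ filter)
    { for (i = 1; i <= NF; i++) if ($i ~ /^[A-Za-z0-9_]+$/) count[$i]++ }
    # emit "count<TAB>word", the same column order as relation E
    END { for (w in count) printf "%d\t%s\n", count[w], w }
  ' | sort -k2
}

printf 'the cat, the hat\n(the) end-game\n' | wordcount
```

A token such as `end-game` is dropped because the filter requires the whole token to match, as `matches` does in Pig.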
jbenninghoff / text
Last active August 29, 2015 14:16
clush2ansible-hosts
root@cent01 redhat 4C 02:43pm# sed -n '/^#/d;/@/d;s/\(^.*\): \(.*$\)/[\1]\n\2\n/p' /etc/clustershell/groups | sed '/\[.*\]/s/-/:/;/.* .*/s/ /\n/'
[all]
cent[01:05]
[zk]
cent[01:03]
[cldb]
cent01
cent02
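To try the conversion without touching /etc/clustershell/groups, the pipeline can be wrapped in a function that reads stdin. This is a sketch assuming GNU sed (for `\n` in the replacement). The `groups_to_inventory` name and the sample input are hypothetical, since the gist shows only the output.

```bash
#!/bin/bash
# Hypothetical wrapper: same sed expressions, reading stdin instead of the groups file.
groups_to_inventory() {
  sed -n '/^#/d;/@/d;s/\(^.*\): \(.*$\)/[\1]\n\2\n/p' |
    sed '/\[.*\]/s/-/:/;/.* .*/s/ /\n/'
}

# Assumed sample input; comment lines and lines containing @ are dropped.
printf '%s\n' \
  '# clustershell groups' \
  'all: cent[01-05]' \
  'zk: cent[01-03]' \
  'cldb: cent01 cent02' \
  'adm: @zk' |
  groups_to_inventory
```

The first sed emits a blank line after each group. Because `s/ /\n/` has no `g` flag, only the first space on a line is split, so a group listing three or more space-separated nodes would need `s/ /\n/g`.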
jbenninghoff / runSparkTeraSort.sh
Last active March 26, 2019 19:34
Spark TeraSort launch example
#!/bin/bash
/opt/mapr/spark/spark-1.2.1/bin/spark-submit --master yarn-client \
--class org.apache.spark.examples.terasort.TeraGen \
--name 'TeraGen' \
--conf 'mapreduce.terasort.num.partitions=5' \
--executor-cores 30 \
--executor-memory 7G \
--num-executors 9 \
terasort-project_2.10-1.0.jar 50G /user/$USER/spark-terasort
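The resource flags above imply a total footprint that is worth checking against the cluster before launching. A rough sketch of that arithmetic follows. The variable names are hypothetical, and the memory figure counts only executor heap, not any per-executor overhead YARN adds on top.

```bash
#!/bin/bash
# Values copied from the spark-submit flags above.
num_executors=9
executor_cores=30
executor_memory_gb=7

# Tasks that can run at once, and heap requested across all executors.
task_slots=$((num_executors * executor_cores))
total_heap_gb=$((num_executors * executor_memory_gb))

echo "concurrent task slots: $task_slots"
echo "total executor heap: ${total_heap_gb}G"
```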
jbenninghoff / TestHBase.java
Created August 18, 2015 19:00
HBase Test Case
/*
* Compile and run with:
* javac -cp $(hbase classpath) TestHBase.java
* java -cp .:$(hbase classpath) TestHBase
*/
import java.net.*;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
jbenninghoff / mesos-install.sh
Created December 2, 2015 00:00
Mesos install steps
#!/bin/bash
echo 'Script not ready for execution. Copy and paste line by line into a shell instead'
echo 'Assumes clush installed and /etc/hosts propagated to all nodes'
exit 1
#Configure edge node as MapR client
vi /etc/yum.repos.d/maprtech.repo # Ideally an rpm would install and enable this repo, as the EPEL rpm does
yum clean all
#Ensure iptables (firewall) is off and disabled everywhere
jbenninghoff / ycsbtest.sh
Created December 3, 2015 21:18
YCSB test run script
#!/bin/bash
# jbenninghoff 2013-Sep-13 vi: set ai et sw=3 tabstop=3:
# Assumes MapR YCSB branch to handle large tables: https://github.com/mapr/YCSB
# Assumes MapR HBase client software installed. Can be an edge/gateway node
export HBASE_CLASSPATH=core/lib/core-0.1.4.jar:hbase-binding/lib/hbase-binding-0.1.4.jar
table=/benchmarks/usertable #YCSB uses table named 'usertable' by default
thrds=4
count=$((100*1000*1000)) #table row count
jbenninghoff / sh
Last active July 13, 2017 01:59
Bash Idioms template
#!/bin/bash
#jbenninghoff 2015-Dec-28 vi: set ai et sw=3 tabstop=3 retab:
: << '--BLOCK-COMMENT--'
Bash idioms template
Save as ~/.vim/templates/sh
Above requires vim templates plugin: https://github.com/ap/vim-templates
Useful site for lots of Bash info: http://wiki.bash-hackers.org/
--BLOCK-COMMENT--
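The block-comment idiom above relies on `:` ignoring its input and on the quoted delimiter suppressing expansion. A small sketch of why the quotes matter, using a throwaway temp directory (the `tmp` and `created` names are illustrative):

```bash
#!/bin/bash
tmp=$(mktemp -d)

# Quoted delimiter: the body is skipped verbatim, nothing is expanded.
: << '--BLOCK-COMMENT--'
$(touch "$tmp/quoted")
--BLOCK-COMMENT--

# Unquoted delimiter: the body is still discarded, but expansions run first.
: << UNQUOTED
$(touch "$tmp/unquoted")
UNQUOTED

created=$(ls "$tmp")   # lists only the file from the unquoted block
echo "created: $created"
rm -rf "$tmp"
```

Quoting the delimiter therefore makes it safe to paste arbitrary text, including `$(...)` and backticks, into the comment.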
jbenninghoff / LVM mods
Last active February 10, 2023 23:16
Linux LVM modifications for MapR
#!/bin/bash
umount /home
lsblk -P /dev/sdb | grep -o MOUNTPOINT.*
lvremove -f vg_$(hostname -s|tr A-Z a-z)/lv_home
parted /dev/sdb -- rm 1
grep home /etc/fstab
sed -i.bak '/home/d' /etc/fstab
vgreduce -f vg_$(hostname -s|tr A-Z a-z) --removemissing
vgreduce -f vg_${HOSTNAME,,} --removemissing # alternative to the previous line, using bash 4+ lowercase expansion
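The two `vgreduce` lines derive the volume group name in different ways, and they agree only when `$HOSTNAME` holds the short name. A sketch of the difference, using a hypothetical fully qualified name in place of a real host (requires bash 4 or later for `${var,,}`):

```bash
#!/bin/bash
fqdn='Cent01.Example.COM'   # hypothetical value standing in for $HOSTNAME
short=${fqdn%%.*}           # what `hostname -s` would report

via_tr=$(printf '%s' "$short" | tr 'A-Z' 'a-z')   # form used with hostname -s
via_expansion=${short,,}                          # bash 4+ lowercase expansion
whole_lower=${fqdn,,}                             # what ${HOSTNAME,,} gives for an FQDN

echo "vg_$via_tr vg_$via_expansion vg_$whole_lower"
```

If `$HOSTNAME` is fully qualified on a node, `${HOSTNAME%%.*}` can be lowercased instead to match the `hostname -s` form.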
jbenninghoff / fixes-via-clush.txt
Last active November 22, 2016 19:22
Fix list and clush fixes
Fix list from cluster-audit.sh findings:
Push mapr repos from node1 to all others.
yum install dstat jq nmap nc tmux tuned vim xml2 zsh
Disable SELinux in /etc/selinux/config and run setenforce Permissive
chkconfig iptables off
echo 'vm.swappiness = 1' >> /etc/sysctl.conf
Exclude /tmp/hadoop-mapr/nm-local-dir from tmpwatch
Set the new hostname in /etc/sysconfig/network
Add all hosts to /etc/hosts
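Some of these fixes append a line to a system file, and rerunning a plain `>>` append across the cluster leaves duplicates. A minimal sketch of an append-if-missing helper follows. The `append_once` name is hypothetical, and a temp file stands in for /etc/sysctl.conf.

```bash
#!/bin/bash
# Hypothetical helper: append a line only if the file lacks an exact match.
append_once() {
  local line=$1 file=$2
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

conf=$(mktemp)   # stand-in for /etc/sysctl.conf
append_once 'vm.swappiness = 1' "$conf"
append_once 'vm.swappiness = 1' "$conf"   # second call is a no-op

matches=$(grep -c 'vm.swappiness' "$conf")
echo "lines written: $matches"
rm -f "$conf"
```

The match is on the exact line, so an existing entry with a different value or spacing would not be detected and would need separate handling.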