-- Invoke with two arguments, the input file and the output file: -input /bps/gen -output /bps/analytics
-- FYI...
-- If you run into errors, you can see them in
-- ./target/failsafe-reports/TEST-org.bigtop.bigpetstore.integration.BigPetStorePigIT.xml
-- First, we load data in from a file, as tuples.
-- In Pig, relations are like tables in a relational database,
-- so each relation is just a bunch of tuples.
-- in this case csvdata will be a relation,
@mattf
mattf / gist:10578722
Last active August 29, 2015 13:59
sahara bigpetstore script
create node group
- master
- namenode, oozie, resourcemanager, historyserver
create node group
- worker
- datanode, nodemanager
create cluster
- master: 1
mattf / kubelet_hostname
Created February 19, 2015 15:04
fake hostname for running kubelet with asymmetric name resolution
#!/bin/sh
# mv /bin/hostname /bin/hostname.real
# install $0 as /bin/hostname (a+rx)
TARGET="kubelet"
IFACE=eth0
PARENT=$(ps -oucmd= $PPID)
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <stdlib.h>
#include <deque>
#define MAX_LINE_LEN 1024
struct Poller
mattf / gist:1501447
Created December 20, 2011 12:42
Cumin quick start
How do you install, set up, and run Cumin on a head node?
0) yum install qpidd cumin sesame
1) echo cumin | sudo -u qpidd /usr/sbin/saslpasswd2 -f /var/lib/qpidd/qpidd.sasldb -u QPID cumin
2) sed -i 's,# brokers:.*,brokers: cumin/cumin@localhost:5672,' /etc/cumin/cumin.conf
3) (probably) sed -i 's/# host:.*/host: 0.0.0.0/' /etc/cumin/cumin.conf
4) (hopefully) sed -i 's/# persona:.*/persona: grid/' /etc/cumin/cumin.conf
5) service qpidd start
6) service sesame start (or systemctl start sesame.service)
7) service cumin start (follow instructions, return here when done)
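The three config edits in steps 2 to 4 can be grouped into one function so they can be tried against a copy of the file first. This is a minimal sketch, not part of Cumin: `configure_cumin` is a hypothetical helper name, and it assumes GNU `sed` (for `-i`) and that the commented-out `brokers`, `host` and `persona` defaults are present in the file.

```shell
#!/bin/sh
# Sketch: apply the cumin.conf edits from steps 2-4 to the file given as $1.
# configure_cumin is a hypothetical name; the sed expressions are the ones
# from the quick start, unchanged.
configure_cumin() {
    conf="$1"
    sed -i \
        -e 's,# brokers:.*,brokers: cumin/cumin@localhost:5672,' \
        -e 's/# host:.*/host: 0.0.0.0/' \
        -e 's/# persona:.*/persona: grid/' \
        "$conf"
}
```

For example, `cp /etc/cumin/cumin.conf /tmp/cumin.conf && configure_cumin /tmp/cumin.conf` lets you inspect the result before running it on the real file.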
mattf / newpgid.c
Created December 28, 2011 18:25
newpgid tool
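#include <unistd.h>

int
main(int argc, char *argv[])
{
	if (argc < 2)
		return 1;	/* no command given */
	setpgid(0, 0);	/* become leader of a new process group */
	execvp(argv[1], &argv[1]);	/* replace this process with the command */
	return 1;	/* reached only if execvp fails */
}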
mattf / 49facter.config
Created December 28, 2011 18:33
Custom Startd attributes from facter
FACTER = /usr/libexec/condor/facter.sh
STARTD_CRON_JOBLIST = $(STARTD_CRON_JOBLIST) FACTER
STARTD_CRON_FACTER_EXECUTABLE = $(FACTER)
STARTD_CRON_FACTER_PERIOD = 300
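The config above runs `/usr/libexec/condor/facter.sh` every 300 seconds, but the script itself is not shown here. As a rough sketch of what such a hook could do, the function below turns `name => value` lines (assumed here to be the output format of `facter`) into `name = "value"` attribute lines. The name `facts_to_classad` and the choice to emit every value as a quoted string are assumptions for illustration, not the author's implementation.

```shell
#!/bin/sh
# Hypothetical sketch of a facter.sh-style hook; the real script is not shown.
# Reads `name => value` lines on stdin and prints `name = "value"` lines.
facts_to_classad() {
    # Skip values containing a double quote (no escaping is attempted), then
    # rewrite lines with a simple identifier before " => "; all other lines
    # are dropped because of -n and the trailing p flag.
    sed -n \
        -e '/"/d' \
        -e 's/^\([A-Za-z_][A-Za-z0-9_]*\) => \(.*\)$/\1 = "\2"/p'
}
```

A wrapper script would then be little more than `facter | facts_to_classad`.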
mattf / memcached.job
Created December 28, 2011 18:34
memcached managed from Condor
cmd = memcached.sh
args = -m $$(Memory)
log = memcached.log
kill_sig = SIGTERM
# Want chirp functionality
+WantIOProxy = TRUE
mattf / condor_ec2_q.sh
Created December 28, 2011 18:38
EC2 details from condor_q
#!/bin/sh
# NOTE:
# . Requires condor_q >= 7.5.2, old classads do not
# have %
# . When running, jobs show RUN_TIME of their current
# run, not accumulated, which would require adding
# in RemoteWallClockTime
# . See condor_utils/condor_q.cpp:encode_status for
# JobStatus map
mattf / condor_ec2_link.sh
Created December 28, 2011 18:37
Import EC2 instance into Schedd queue