@smrmkt
smrmkt / kill-hadoop-slave-processes.php
Last active December 11, 2015 09:58
Kill Hadoop DataNode/TaskTracker processes over ssh. With this script you can kill zombie Hadoop processes on all slave nodes.
<?php
$hosts = array('slave_host01', 'slave_host02', 'slave_host03');
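foreach ($hosts as $host) {
    // exec() appends to its output array, so reset it for each host
    $pids = array();
    // list JVM processes on the slave and keep only the PIDs of DataNode/TaskTracker
    exec("ssh $host /usr/java/default/bin/jps | /bin/grep 'DataNode\\|TaskTracker' | /bin/sed 's/\\([0-9]\\) .*/\\1/'", $pids);
    foreach ($pids as $pid) {
        exec("ssh $host kill $pid");
    }
}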
@smrmkt
smrmkt / confidence and predict interval
Last active December 20, 2015 19:08
Calculate confidence/prediction intervals and plot them with the result of a regression
# import data
data <- read.delim("testadata.txt")
# execute regression and plot confidence/predict interval
predict_continue <- function(target_method, target_cpa, interval) {
d <- subset(data, method==target_method)
# regression using log() transformation
d.lm <- lm(continue~log(month)+cpa, data=d)
summary(d.lm)
@smrmkt
smrmkt / chi-square test snippet
Last active December 20, 2015 19:19
Chi-square test and multiple comparisons using a Bonferroni-type adjustment (the call below passes "holm") with 3 groups (class A, class B, class C) and 2 levels (passed, failed). The original prop test method is provided by Prof. Aoki.
source("http://aoki2.si.gunma-u.ac.jp/R/src/pairwise.prop2.test.R", encoding="euc-jp")
x <- matrix(c(18, 82, 40, 76, 32, 52), ncol=2, byrow=T)
x
chisq.test(x)
fisher.test(x)
pairwise.prop2.test(x, p.adjust.method="holm", test.function=chisq.test)
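To make the overall test concrete, here is a minimal Python sketch of Pearson's chi-square statistic for the same 3x2 table. It is not the code behind `chisq.test` or Prof. Aoki's function; the name `chi_square_statistic` is hypothetical, and the p-value shortcut relies on the fact that a chi-square distribution with 2 degrees of freedom has survival function exp(-x/2), so it only applies to this table shape.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # expected count under independence of group and outcome
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# same counts as the matrix x above: one row per class, two outcome columns
x = [[18, 82], [40, 76], [32, 52]]
stat = chi_square_statistic(x)
df = (len(x) - 1) * (len(x[0]) - 1)
# closed form that is valid only because df == 2 here
p_value = math.exp(-stat / 2)
print(stat, df, p_value)
```

The pairwise comparisons and the p-value adjustment are left out of this sketch.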
@smrmkt
smrmkt / perceptron.R
Last active December 27, 2015 13:39
simple perceptron R sample
x <- matrix(c(1, 1, 1, 1, 1, 1, 3, 7, 1, 5, 4, 2), 6, 2) # feature vectors
l <- c(-1, 1, -1, 1, 1, -1) # labels
w <- c(0, 0) # weight vector
r <- 0.5 # learning rate
# update rule for the weight vector
update <- function(x, l, w) {
if (sign(x %*% w) == sign(l)) {
return(w)
}
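The preview above cuts off before the actual weight update. As a rough illustration, the following Python sketch applies the classic perceptron rule (w + r * l * x on a misclassified sample) to the same data. That rule is an assumption here, since the rest of the R function is not visible, and `train` and `max_epochs` are hypothetical names.

```python
def sign(v):
    """Return -1, 0 or 1, like R's sign()."""
    return (v > 0) - (v < 0)


def update(x, l, w, r):
    """Return w unchanged if x is classified correctly, else the corrected weights."""
    activation = sum(xi * wi for xi, wi in zip(x, w))
    if sign(activation) == sign(l):
        return w
    # assumed rule: move the weights toward the misclassified sample
    return [wi + r * l * xi for xi, wi in zip(x, w)]


def train(xs, ls, w, r, max_epochs=10000):
    """Sweep over the data until one full pass makes no update."""
    for _ in range(max_epochs):
        changed = False
        for x, l in zip(xs, ls):
            new_w = update(x, l, w, r)
            if new_w is not w:
                changed = True
                w = new_w
        if not changed:
            break
    return w


# rows of the R matrix: a constant 1 (bias term) and one feature value
xs = [[1, 3], [1, 7], [1, 1], [1, 5], [1, 4], [1, 2]]
ls = [-1, 1, -1, 1, 1, -1]
w = train(xs, ls, [0.0, 0.0], 0.5)
print(w)
```

Because the two classes are linearly separable (feature values up to 3 are negative, 4 and above are positive), the loop stops once every sample is classified correctly.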
@smrmkt
smrmkt / perceptron.graph.R
Last active December 28, 2015 13:49
simple perceptron and graph sample using R
# parameters
x <- matrix(c(1, 1, 1, 1, 1, 1, 3, 7, 1, 5, 4, 2), 6, 2) # feature vectors
l <- c(-1, 1, -1, 1, 1, -1) # labels
w <- c(0, 0) # weight vector
r <- 0.5 # learning rate
# update rule for the weight vector
update <- function(x, l, w) {
if (sign(x %*% w) == sign(l)) {
return(w)
team team_no number era game win lose winning_rate WHIP DIPS
giants 1 19 3.12 27 13 6 0.684 1.15 2.81
giants 1 26 3.31 25 13 6 0.684 1.27 3.65
giants 1 15 3.13 34 5 10 0.333 1.14 3.64
giants 1 18 3.35 24 11 6 0.647 1.12 3.87
tigers 2 54 2.89 30 12 8 0.6 1.17 3.03
tigers 2 14 2.69 25 11 7 0.611 1.08 3.67
tigers 2 55 2.74 26 8 12 0.4 1.29 3.46
curp 3 18 2.1 26 15 7 0.682 0.96 2.98
curp 3 42 3.23 28 11 9 0.55 1.13 3.97
team team_no OPS DER DEF
giants 1 0.728 0.706 0.988
tigers 2 0.686 0.704 0.989
curp 3 0.688 0.697 0.981
dragons 4 0.676 0.688 0.986
baysters 5 0.717 0.675 0.983
swallows 6 0.705 0.683 0.988
goldeneagles 7 0.719 0.681 0.986
lions 8 0.696 0.693 0.988
marines 9 0.708 0.695 0.986
@smrmkt
smrmkt / bwt.py
Last active August 29, 2015 14:12
Burrows Wheeler Transform
#!/usr/bin/env python
#-*-coding:utf-8-*-
import argparse
# args
parser = argparse.ArgumentParser()
parser.add_argument('text')
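The rest of the script is not shown in this preview. For orientation, here is a minimal sketch of the transform itself using the naive sorted-rotations approach. It is not necessarily how bwt.py implements it; the `$` sentinel and the function names `bwt` and `inverse_bwt` are assumptions made for this example.

```python
def bwt(text, sentinel="$"):
    """Burrows-Wheeler transform via sorted rotations (quadratic memory, fine for short inputs)."""
    if sentinel in text:
        raise ValueError("text must not contain the sentinel character")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # the transform is the last column of the sorted rotation table
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column, sentinel="$"):
    """Rebuild the original text by repeatedly prepending the last column and sorting."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(c + row for c, row in zip(last_column, table))
    for row in table:
        if row.endswith(sentinel):
            return row[:-1]
    raise ValueError("sentinel not found in input")


encoded = bwt("banana")
print(encoded)               # annb$aa
print(inverse_bwt(encoded))  # banana
```

The sentinel sorts before the letters in ASCII, which is what makes the transform invertible without storing the original row index.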
@smrmkt
smrmkt / Quarterly Earnings per Johnson & Johnson Share
Last active August 29, 2015 14:14
two-piece linear regression model
data(JohnsonJohnson)
p=40
# y = α + βt + γ(t > p)(t - p) + ε
t = c(1:84)
JJ2 = as.data.frame(cbind(JohnsonJohnson, t, (t>p)*(t-p)))
colnames(JJ2) = c('earnings', 't', 't2')
JJ2.lm <- lm(earnings~t+t2, data=JJ2)
summary(JJ2.lm)
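The key step in this model is the extra regressor `(t>p)*(t-p)`, which is zero up to the breakpoint and grows linearly after it, so its coefficient is the change in slope. The Python sketch below fits the same three-column design by ordinary least squares on synthetic, noise-free data. The data, the coefficients 1.0, 0.5 and 2.0, and the helper names `hinge`, `solve` and `fit_two_piece` are all made up for illustration, and nothing here reproduces the JohnsonJohnson fit.

```python
def hinge(t, p):
    """The t2 regressor: 0 up to the breakpoint p, then t - p."""
    return max(t - p, 0)


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_two_piece(ts, ys, p):
    """Least-squares fit of y = alpha + beta*t + gamma*hinge(t, p) via the normal equations."""
    rows = [[1.0, float(t), float(hinge(t, p))] for t in ts]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


p = 40
ts = list(range(1, 85))  # 84 time points, matching the length of t above
# synthetic series with known coefficients and no noise (illustration only)
ys = [1.0 + 0.5 * t + 2.0 * hinge(t, p) for t in ts]
alpha, beta, gamma = fit_two_piece(ts, ys, p)
print(alpha, beta, gamma)
```

After the breakpoint the fitted slope is beta + gamma, which is how the `t2` coefficient in the `lm` summary can be read.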
@smrmkt
smrmkt / stacking
Last active August 29, 2015 14:17
library(randomForest)
# load data
data = read.delim("data/sample.tsv", sep="\t")
# create data for k-fold cross validation
cv = function(d, k) {
n = sample(nrow(d), nrow(d))
d.randomized = d[n,] # randomize data
n.residual = k-nrow(d)%%k
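The function is cut off here, so how the folds are finally assembled is not visible. As a rough sketch of the general idea (shuffle the row indices, then deal them into k folds), something like the following would work in Python. The name `k_fold_indices` and the fixed seed are assumptions, and this version lets fold sizes differ by one row, which may not match how the R function handles the remainder.

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices and deal them into k folds whose sizes differ by at most one."""
    order = list(range(n_rows))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


folds = k_fold_indices(10, 3)
for i, test_idx in enumerate(folds):
    # every row outside the held-out fold goes into the training set
    train_idx = [j for other in folds if other is not test_idx for j in other]
    print(i, sorted(test_idx), len(train_idx))
```

For stacking, the held-out fold is the part each base model predicts on, so that every row ends up with an out-of-fold prediction.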