set_var <- function(k, v) {
  .Internal(Sys.setenv(k, v))
}
# currently just imports env vars
load_bash <- function(bash_file) {
  rl <- readLines(bash_file)
  # just find statements that begin with `export`
  raw.exports <- grep("^export", rl)
  exports <- gsub("\"", "", gsub("^export ", "", rl[raw.exports]))
require(graphics)
loc <- cmdscale(eurodist)
x <- loc[, 1]
y <- -loc[, 2] # reflect so North is at the top
## note asp = 1, to ensure Euclidean distances are represented correctly
plot(x, y, type = "n", xlab = "", ylab = "", asp = 1, axes = FALSE, main = "cmdscale(eurodist)")
text(x, y, rownames(loc), cex = 0.6)
14/06/25 16:40:07 INFO mapreduce.Job: map 97% reduce 0%
14/06/25 16:40:24 INFO mapreduce.Job: map 97% reduce 3%
14/06/25 16:40:27 INFO mapreduce.Job: map 97% reduce 5%
14/06/25 16:40:27 INFO mapreduce.Job: Task Id : attempt_1402677141678_0008_r_000000_0, Status : FAILED
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:121)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:380)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
def surprise(mapcount):
    """
    Calculate the surprise value for a given mapcount.
    The more uneven the distribution of values,
    the higher the surprise value.
    For example, a good field to use for a coverage score
    might have a surprise value less than 0.5 or 0.6.
    """
    values = mapcount.values()
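The docstring above describes the surprise value only qualitatively, and the preview cuts off before the formula. As a minimal sketch, and not the author's implementation, one way to get a score that rises with unevenness is one minus the normalized Shannon entropy of the counts. The function name `surprise_sketch`, the entropy-based formula, and the convention of returning 0.0 for degenerate input are all assumptions made for illustration.

```python
import math


def surprise_sketch(mapcount):
    """Hypothetical stand-in: 1 - normalized Shannon entropy of the counts.

    Returns 0.0 for a perfectly even distribution and 1.0 when a single
    key holds all of the mass. This is an assumed formula, not the one
    used by the original surprise() function.
    """
    counts = list(mapcount.values())
    n = len(counts)
    total = sum(counts)
    if n < 2 or total <= 0:
        # Unevenness is undefined here; 0.0 is an arbitrary convention.
        return 0.0
    # Shannon entropy of the empirical distribution (zero counts add nothing).
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    # Divide by the maximum possible entropy, log(n), to land in [0, 1].
    return 1.0 - entropy / math.log(n)
```

Under this assumed definition, `surprise_sketch({"a": 5, "b": 5})` is at the even end of the scale and `surprise_sketch({"a": 10, "b": 0})` is at the uneven end, so a threshold such as the 0.5 or 0.6 mentioned in the docstring would separate fields by how concentrated their values are.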
## Regression with correlated errors
tolley.shuffle <- function(model, k = 1e-5) {
  # inherits() stays a length-one test even when class(model) has several entries
  if (!inherits(model, "lm"))
    stop("Object 'model' must be of class 'lm'.")
  uncorrelate <- function(model) {
    r <- residuals(model)
    n <- length(r)
    y <- model$model[, 1]
Exception in thread "main" java.lang.RuntimeException: java.io.IOException: WritableName can't load class: org.apache.hadoop.hbase.io.ImmutableBytesWritable
    at org.apache.hadoop.io.SequenceFile$Reader.getValueClass(SequenceFile.java:2030)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1960)
    at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1810)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1759)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1773)
    at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:49)
    at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:64)
    at org.apache.hadoop.streaming.AutoInputFormat.getRecordReader(AutoInputFormat.java:67)
    at org.apache.hadoop.streaming.DumpTypedBytes.dumpTypedBytes(DumpTypedBytes.java:115)
library(rmr2)
groups <- rbinom(32, n = 50, prob = 0.4)
print(groups)
groups.dfs <- to.dfs(groups)
from.dfs(
  mapreduce(
    input = groups.dfs,
<?xml version="1.0"?>
<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.dmg.org/PMML-4_1 http://www.dmg.org/v4-1/pmml-4-1.xsd">
  <Header copyright="Copyright (c) 2014 lanenga" description="Linear Regression Model">
    <Extension name="user" value="lanenga" extender="Rattle/PMML"/>
    <Application name="Rattle/PMML" version="1.4"/>
    <Timestamp>2014-01-07 15:33:34</Timestamp>
  </Header>
  <DataDictionary numberOfFields="4">
    <DataField name="sepal_width" optype="continuous" dataType="double"/>
    <DataField name="sepal_length" optype="continuous" dataType="double"/>