Created September 7, 2012 05:07
Hive streaming reducer function in R
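#!/usr/bin/env Rscript
# Hive streaming reducer: reads tab-separated "key<TAB>rating" lines from stdin
# and writes "key<TAB>mean rating" for each run of consecutive identical keys.
# Assumes the input arrives grouped by key.
con <- file('stdin', open = 'r')
ratings <- numeric(0)
lastKey <- ""
while (length(line <- readLines(con, n = 1, warn = FALSE)) > 0) {
  fields <- unlist(strsplit(line, '\t'))
  key <- fields[[1]]
  user.rating <- as.numeric(fields[[2]])
  if (!identical(lastKey, "") && !identical(lastKey, key)) {
    # Key changed: emit the finished group and start a new one.
    # sep = '' keeps cat() from padding the tab with spaces.
    cat(lastKey, '\t', mean(ratings), '\n', sep = '')
    ratings <- user.rating
  } else {
    ratings <- c(ratings, user.rating)
  }
  lastKey <- key
}
# Emit the final group, unless the input was empty.
if (length(ratings) > 0) {
  cat(lastKey, '\t', mean(ratings), '\n', sep = '')
}
close(con)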
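
For checking the reducer's output on a small sample without piping through stdin, the same per-key mean can be sketched in memory. This is an illustration only: `reduce_means` is a hypothetical helper that is not part of the gist, and it assumes the input lines are already grouped by key, as the reducer does.

```r
# Hypothetical helper (not part of the original gist): computes the mean
# rating per key from a character vector of "key<TAB>rating" lines.
reduce_means <- function(lines) {
  fields <- strsplit(lines, "\t", fixed = TRUE)
  keys <- vapply(fields, `[[`, character(1), 1)
  ratings <- as.numeric(vapply(fields, `[[`, character(1), 2))
  # Levels follow first appearance, so output order matches input order.
  groups <- factor(keys, levels = unique(keys))
  means <- tapply(ratings, groups, mean)
  # One "key<TAB>mean" string per key, with no padding around the tab.
  paste(names(means), as.vector(means), sep = "\t")
}

sample_lines <- c("a\t3", "a\t5", "b\t2")
writeLines(reduce_means(sample_lines))
```

Unlike the streaming loop, this version holds all ratings in memory at once, so it is only suitable for small test inputs.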