  • Run brew install mvnvm (this is just to install Maven on a Mac).
  • Create an Eclipse Maven project locally (File -> New -> Project -> Maven Project). During setup, keep clicking Next until you are prompted for the project coordinates, then set group id = com.javamakeuse.hadoop.poc and artifact id = Homeworkx (the name can be whatever you want, e.g. Homework1).
  • Copy the pom.xml from wolf and use it to replace the local pom.xml (you'll see it in the panel on the left in Eclipse).
  • Go to src/main/java and create a new class (e.g. Exercise1) to do your coding.
  • When the coding is done, navigate to the directory where the Maven project is stored (e.g. mine is /Users/ethen/Documents/workspace/Homework1) and run mvn package to create the jar file.
  • After that, copy mr-app-1.0-SNAPSHOT.jar from the target folder to wolf.
  • Then ssh to wolf and run the job there using hadoop jar. For instance, for the wordcount example I had a folder called wordcount on Hadoop and I want the output
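
The build, copy, and submit steps above could be scripted roughly as follows. This is only a minimal sketch under stated assumptions: the login your_user@wolf, the main class Exercise1, and the output folder wordcount_output are hypothetical placeholders that the notes do not specify, and the jar is assumed to land in your home directory on wolf. By default the script only prints the commands it would run; set DRY_RUN=0 to actually execute them.

```shell
#!/bin/sh
# Sketch of the build-and-submit workflow from the notes above.
# The jar name and example project path come from the notes;
# everything marked "hypothetical" is an assumption to replace.
PROJECT_DIR="${PROJECT_DIR:-$HOME/Documents/workspace/Homework1}"
JAR_NAME="mr-app-1.0-SNAPSHOT.jar"
REMOTE="${REMOTE:-your_user@wolf}"            # hypothetical login on wolf
MAIN_CLASS="${MAIN_CLASS:-Exercise1}"         # hypothetical main class name
INPUT_DIR="${INPUT_DIR:-wordcount}"           # input folder on Hadoop
OUTPUT_DIR="${OUTPUT_DIR:-wordcount_output}"  # hypothetical output folder

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

build_and_submit() {
    # 1. Build the jar (it ends up in $PROJECT_DIR/target).
    run mvn -f "$PROJECT_DIR/pom.xml" package
    # 2. Copy the jar to the home directory on wolf.
    run scp "$PROJECT_DIR/target/$JAR_NAME" "$REMOTE:"
    # 3. Run the job on wolf.
    run ssh "$REMOTE" "hadoop jar $JAR_NAME $MAIN_CLASS $INPUT_DIR $OUTPUT_DIR"
}

build_and_submit
```

Running the script as-is is a dry run, which is a cheap way to check the paths and names before touching the cluster. Note that a Hadoop job typically fails if its output folder already exists, so pick a fresh output name for each run.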
library(yaImpute)
library(caret)
heart <- read.csv("heart.csv")
heart <- heart[-1]  # drop the first column
heart$cost <- log10(heart$cost)  # work with log10 of cost
# n is the sample size; K is the number of parts; returns a K-length list of indices for each part
CVInd <- function(n, K) {
  m <- floor(n / K)  # approximate size of each part
  r <- n - m * K  # remainder left over after K parts of size m