- `brew install mvnvm` (only needed to install Maven on a Mac)
- Create an Eclipse Maven project locally (File -> New -> Project -> Maven Project). During setup, keep clicking Next until you reach the screen that prompts you to set the group id = `com.javamakeuse.hadoop.poc` and the artifact id = `Homeworkx` (the name is whatever you want, e.g. `Homework1`)
- Copy the pom.xml from wolf and use it to replace the local pom.xml (you'll see it on the left in Eclipse)
- Go to src/main/java and create a new class (e.g. Exercise1) to do your coding
- After you're done coding, navigate to where the Maven project is stored (e.g. mine is stored under /Users/ethen/Documents/workspace/Homework1) and type `mvn package` to create the jar file
- After that, copy the `mr-app-1.0-SNAPSHOT.jar` inside the target folder to wolf
- Then ssh to wolf and run the job on wolf using `hadoop jar`, e.g. for the wordcount example I had a folder called wordcount on hadoop and I want the output
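
The build, copy, and run steps above can be strung together roughly like this. It's only a sketch: the login `user@wolf`, the main class `Exercise1`, and the output folder `wordcount_output` are placeholders I made up, so swap in your own. Whether you need to pass a main class at all depends on what the pom.xml from wolf puts in the jar manifest. By default the script only prints the commands (dry run); set `DRY_RUN=0` to actually execute them.

```shell
#!/bin/sh
# Sketch of the build -> copy -> submit sequence from the steps above.
# ASSUMPTIONS (placeholders, not from the original notes): the login
# "user@wolf", the main class "Exercise1", and the output folder name.

PROJECT_DIR="${PROJECT_DIR:-/Users/ethen/Documents/workspace/Homework1}"
JAR_NAME="mr-app-1.0-SNAPSHOT.jar"
REMOTE="${REMOTE:-user@wolf}"                 # hypothetical login
MAIN_CLASS="${MAIN_CLASS:-Exercise1}"         # hypothetical main class
INPUT_DIR="${INPUT_DIR:-wordcount}"           # input folder on hadoop
OUTPUT_DIR="${OUTPUT_DIR:-wordcount_output}"  # hypothetical output folder
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

build_and_submit() {
  # 1. Build the jar; it ends up in the project's target folder.
  run mvn -f "$PROJECT_DIR/pom.xml" package &&
  # 2. Copy the jar to the home directory on wolf.
  run scp "$PROJECT_DIR/target/$JAR_NAME" "$REMOTE:" &&
  # 3. Submit the job on wolf.
  run ssh "$REMOTE" hadoop jar "$JAR_NAME" "$MAIN_CLASS" "$INPUT_DIR" "$OUTPUT_DIR"
}

build_and_submit
```

Keep in mind that a MapReduce job normally refuses to start if the output folder already exists, so pick a fresh name or remove the old folder before rerunning.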
library(yaImpute)
library(caret)
heart <- read.csv("heart.csv")
heart <- heart[-1]  # drop the first column
heart$cost <- log10(heart$cost)  # work with log10 of cost
CVInd <- function(n, K) {  # n is sample size; K is number of parts; returns K-length list of indices for each part
  m <- floor(n / K)  # approximate size of each part
  r <- n - m * K  # remainder left over after K parts of size m