@mwinkle
Created February 25, 2015 05:44
Deploying Giraph on an HDInsight Linux Cluster
Special thanks to http://giraph.apache.org/quick_start.html, and thanks to http://stackoverflow.com/a/27003213/500945 for the last tip.
sudo apt-get install openjdk-7-jdk
sudo apt-get install git
sudo apt-get install maven
git clone https://github.com/apache/giraph.git
cd giraph
mvn -Phadoop_2 -fae -DskipTests -Dhadoop=non_secure clean package
# the examples jar is built under giraph-examples/target
cd giraph-examples/target
# the sample graph file must be in cluster storage at /tmp/tiny_graph.txt before running the job
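One way to stage the sample file is sketched below. The graph contents are an assumption: they are the tiny_graph.txt example from the Giraph quick start guide linked above, not something listed in this gist. The upload is guarded so the snippet does nothing harmful on a machine without the hadoop CLI.

```shell
# Write the sample graph locally. Each line is a JSON array:
# [vertex_id, vertex_value, [[target_id, edge_weight], ...]]
# (contents assumed from the Giraph quick start guide)
cat > tiny_graph.txt <<'EOF'
[0,0,[[1,1],[3,3]]]
[1,0,[[0,1],[2,2],[3,1]]]
[2,0,[[1,2],[4,4]]]
[3,0,[[0,3],[1,1],[4,4]]]
[4,0,[[3,4],[2,4]]]
EOF

# Upload to the path the job reads with -vip; skipped if hadoop is not installed.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -put -f tiny_graph.txt /tmp/tiny_graph.txt
fi
```

The -f flag overwrites an existing copy, so the step can be repeated safely.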
hadoop jar ./giraph-examples-1.2.0-SNAPSHOT-for-hadoop-2.5.1-jar-with-dependencies.jar \
  org.apache.giraph.GiraphRunner -libjars . \
  org.apache.giraph.examples.SimpleShortestPathsComputation \
  -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
  -vip /tmp/tiny_graph.txt \
  -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  -op /user/hduser/output/shortestpaths \
  -w 1 \
  -ca giraph.SplitMasterWorker=false
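# print the result: each line is a vertex ID followed by its shortest-path distance from the source vertex (vertex 1 here, distance 0.0)
hadoop fs -cat /user/hduser/output/shortestpaths/part-m-00000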
0 1.0
2 2.0
1 0.0
3 1.0
4 5.0
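The output lines are not ordered by vertex ID. A minimal sketch for sorting them is shown below; it uses the sample output above as a here-document so it runs anywhere. The variable name sorted is only illustrative.

```shell
# Sort the result lines numerically by vertex ID (the first column).
# On the cluster, the here-document could be replaced by piping in:
#   hadoop fs -cat /user/hduser/output/shortestpaths/part-m-00000
sorted=$(sort -n <<'EOF'
0 1.0
2 2.0
1 0.0
3 1.0
4 5.0
EOF
)
printf '%s\n' "$sorted"
```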