@EscVector
Forked from jeongho/hadoop-benchmark
Created January 15, 2016 08:18
Hadoop benchmark
http://answers.oreilly.com/topic/460-how-to-benchmark-a-hadoop-cluster/
http://www.michael-noll.com/blog/2011/04/09/benchmarking-and-stress-testing-an-hadoop-cluster-with-terasort-testdfsio-nnbench-mrbench/
## MR pi (arguments: number of maps, samples per map)
sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-examples.jar pi 10 100
## terasort
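## Pick one input size. teragen refuses to write into an existing output directory, so remove /user/cloudera/terasort-input before regenerating.
## 100 bytes * 1,000 rows = 100,000 bytes (smoke test)
sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-examples.jar teragen 1000 /user/cloudera/terasort-input
## 100 bytes * 100,000,000 rows = 10,000,000,000 bytes (10 GB)
sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-examples.jar teragen 100000000 /user/cloudera/terasort-input
## 100 bytes * 10,000,000,000 rows = 1,000,000,000,000 bytes (1 TB)
sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-examples.jar teragen 10000000000 /user/cloudera/terasort-input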
sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-examples.jar terasort /user/cloudera/terasort-input /user/cloudera/terasort-output
sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-examples.jar teravalidate /user/cloudera/terasort-output /user/cloudera/terasort-validate
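The teragen row count can be derived from a target data size instead of being typed by hand. The following is only a sketch: `ROW_BYTES` and `rows_for_bytes` are hypothetical helper names (not part of Hadoop), and the 100-byte row size is taken from the size note in this section.

```shell
# Sketch: derive the teragen row count from a target size in bytes.
# Assumption: each teragen row is 100 bytes, as in the size note above.
ROW_BYTES=100

rows_for_bytes() {
  # $1 = target data size in bytes; prints the number of rows to request
  echo $(( $1 / ROW_BYTES ))
}

# Row count for 1 TB (decimal) of input data
rows_for_bytes 1000000000000
```

The printed value can be passed as the first argument to `teragen`, for example `teragen "$(rows_for_bytes 1000000000000)" /user/cloudera/terasort-input`.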
## TestDFSIO
Usage: TestDFSIO [genericOptions] -read | -write | -append | -clean [-nrFiles N] [-fileSize Size[B|KB|MB|GB|TB]] [-resFile resultFileName] [-bufferSize Bytes] [-rootDir]
sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-test.jar TestDFSIO -write -nrFiles 10 -fileSize 1000MB -resFile TestDFSIO_results_10_1000MB_blocksize_128MB.log
sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-test.jar TestDFSIO -read -nrFiles 10 -fileSize 1000MB -resFile TestDFSIO_results_10_1000MB_blocksize_128MB.log
sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-test.jar TestDFSIO -clean
sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-test.jar TestDFSIO -Ddfs.block.size=52428800 -write -nrFiles 10 -fileSize 1000MB -resFile TestDFSIO_results_10_1000MB_blocksize_50MB.log
sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-test.jar TestDFSIO -Ddfs.block.size=52428800 -read -nrFiles 10 -fileSize 1000MB -resFile TestDFSIO_results_10_1000MB_blocksize_50MB.log
sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-test.jar TestDFSIO -clean
sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-test.jar TestDFSIO -Ddfs.block.size=10485760 -write -nrFiles 10 -fileSize 1000MB -resFile TestDFSIO_results_10_1000MB_blocksize_10MB.log
sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-test.jar TestDFSIO -Ddfs.block.size=10485760 -read -nrFiles 10 -fileSize 1000MB -resFile TestDFSIO_results_10_1000MB_blocksize_10MB.log
sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-test.jar TestDFSIO -clean
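The write/read/clean sequence above repeats once per block size, so it could be generated by a loop. This is a sketch under assumptions, not a tested benchmark driver: `JAR`, `mb_to_bytes`, and `dfsio_cmds` are hypothetical names introduced here, and the commands are only printed (dry run) so they can be reviewed before being executed.

```shell
# Sketch: print (not run) the TestDFSIO write/read/clean sequence per block size.
# JAR, mb_to_bytes and dfsio_cmds are illustrative names, not Hadoop tooling.
JAR=/opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-test.jar

mb_to_bytes() {
  # $1 = size in MB (binary); prints the size in bytes
  echo $(( $1 * 1024 * 1024 ))
}

dfsio_cmds() {
  # $1 = block size in MB; prints one write, one read and one clean command
  bytes=$(mb_to_bytes "$1")
  res="TestDFSIO_results_10_1000MB_blocksize_${1}MB.log"
  for mode in -write -read; do
    echo sudo -u hdfs hadoop jar "$JAR" TestDFSIO -Ddfs.block.size="$bytes" "$mode" -nrFiles 10 -fileSize 1000MB -resFile "$res"
  done
  echo sudo -u hdfs hadoop jar "$JAR" TestDFSIO -clean
}

# Dry run for the 50 MB and 10 MB block sizes used above
for mb in 50 10; do
  dfsio_cmds "$mb"
done
```

To actually run the benchmarks, remove the `echo` in front of each `sudo` inside `dfsio_cmds`.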