Last active
January 28, 2019 00:01
# TeraGen generates random data that can be used as input for the sort benchmark.
# It takes the number of 100-byte rows and the output directory as arguments.
# To generate 325 MB of data (3407872 rows x 100 bytes):
hadoop jar \
/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
teragen 3407872 /user/itversity/teragen
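The row count passed to teragen follows from the target size: each row is 100 bytes, so 325 MB (taken here as 325 x 1024 x 1024 bytes) works out to 3407872 rows. A minimal sketch of that arithmetic; `rows_for_mib` is a hypothetical helper name, not part of Hadoop:

```shell
# Hypothetical helper: number of 100-byte TeraGen rows for a target size in MiB.
# Integer division, so any fractional row is dropped.
rows_for_mib() {
  echo $(( $1 * 1024 * 1024 / 100 ))
}

rows_for_mib 325   # prints 3407872
```

The result could be passed straight to teragen as the row count, for example `teragen "$(rows_for_mib 325)" /user/itversity/teragen`.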
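# To generate 325 MB of data with an HDFS block size of 64 MB (67108864 bytes).
# The output directory must not already contain data, so remove or rename
# /user/itversity/teragen first if an earlier run created it.
hadoop jar \
/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
teragen \
-D dfs.blocksize=67108864 3407872 \
/user/itversity/teragen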
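dfs.blocksize is given in bytes: 64 MB is 64 x 1024 x 1024 = 67108864. The sketch below shows that conversion and a rough block count, treating the output as one 325 MB file for simplicity. TeraGen actually writes one part file per map task, so the real count can differ. Both helper names are hypothetical:

```shell
# Hypothetical helpers, not part of Hadoop.

# Convert a size in MiB to bytes.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

# Ceiling division: blocks needed to hold $1 bytes with a block size of $2 bytes.
blocks_needed() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

mib_to_bytes 64                                            # prints 67108864
blocks_needed "$(mib_to_bytes 325)" "$(mib_to_bytes 64)"   # prints 6
```

On a cluster, `hdfs fsck /user/itversity/teragen -files -blocks` should show the block layout that was actually written.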
# TeraSort runs a MapReduce job that sorts the generated data.
# It takes the source and destination paths as arguments.
hadoop jar \
/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
terasort /user/itversity/teragen /user/itversity/terasort
# TeraValidate checks that the output of TeraSort is globally sorted.
# It takes two arguments: the source directory (the TeraSort output directory)
# and the destination directory for the validation report.
hadoop jar \
/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
teravalidate /user/itversity/terasort /user/itversity/teravalidate
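To act on the validation result in a script, the report written to the destination directory can be inspected. The sketch below assumes the report is tab-separated text in which problems appear as lines starting with `error` and a clean run contains only a `checksum` line. That layout is an assumption about the Hadoop version in use and should be checked against a real report. `report_is_sorted` is a hypothetical helper name:

```shell
# Hypothetical helper: reads a TeraValidate report on stdin.
# Returns 0 if the report is non-empty and has no line starting with "error"
# (assumed report layout), 2 if the report is empty, 1 otherwise.
report_is_sorted() {
  report=$(cat)
  [ -n "$report" ] || return 2
  ! printf '%s\n' "$report" | grep -q '^error'
}

# On a cluster, something like this would feed it the report:
#   hdfs dfs -cat '/user/itversity/teravalidate/part-*' | report_is_sorted \
#     && echo "output is globally sorted"
```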
# TestDFSIO is a read and write benchmark for HDFS. It is used to stress test
# HDFS and to discover performance bottlenecks in the network.
# The default output directory is /benchmarks/TestDFSIO.
# Become the hdfs superuser
sudo su - hdfs
# Write test: generates 3 output files of 100 MB each
hadoop jar \
/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar \
TestDFSIO -write -nrFiles 3 -size 100MB
# Read test: reads back the 3 files of 100 MB each produced by the write test
hadoop jar \
/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar \
TestDFSIO -read -nrFiles 3 -fileSize 100MB
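TestDFSIO appends a summary of each run to a local results file, by default TestDFSIO_results.log in the directory the command was started from. The sketch below pulls the throughput figures out of that file. It assumes the summary contains lines of the form `Throughput mb/sec: <value>`, which should be confirmed against the log produced by the installed version. `dfsio_throughput` is a hypothetical helper name:

```shell
# Hypothetical helper: print every "Throughput mb/sec" value found in a
# TestDFSIO results log read from stdin (assumed log layout).
dfsio_throughput() {
  awk -F': *' '/Throughput mb\/sec/ { print $2 }'
}

# On the machine where the benchmark was started:
#   dfsio_throughput < TestDFSIO_results.log
# To delete the benchmark data under /benchmarks/TestDFSIO afterwards:
#   hadoop jar \
#   /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar \
#   TestDFSIO -clean
```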