terasort
Created May 24, 2017 15:46
Run against the normal package, not the beta one.
DATA=10 GB (100,000,000 rows of 100 bytes, per the counters below)
Teragen:
17/05/24 15:30:57 INFO mapreduce.Job: Counters: 21
  File System Counters
    FILE: Number of bytes read=276327
    FILE: Number of bytes written=565835
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=0
    HDFS: Number of bytes written=10000000000
    HDFS: Number of read operations=4
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=3
  Map-Reduce Framework
    Map input records=100000000
    Map output records=100000000
    Input split bytes=83
    Spilled Records=0
    Failed Shuffles=0
    Merged Map outputs=0
    GC time elapsed (ms)=505
    Total committed heap usage (bytes)=278921216
  org.apache.hadoop.examples.terasort.TeraGen$Counters
    CHECKSUM=214760662691937609
  File Input Format Counters
    Bytes Read=0
  File Output Format Counters
    Bytes Written=10000000000
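The 10,000,000,000 bytes written line up with the 100,000,000 map output records: TeraGen emits fixed 100-byte rows, and the row count is the size argument it takes. A minimal sketch of the size-to-rows conversion:

```python
# TeraGen writes fixed-size 100-byte rows, so a target data size
# maps directly to the row count passed to the teragen job.
ROW_BYTES = 100

def teragen_rows(target_bytes: int) -> int:
    """Rows teragen must generate to produce target_bytes of input."""
    return target_bytes // ROW_BYTES

# 10 GB, matching "HDFS: Number of bytes written=10000000000" above
print(teragen_rows(10_000_000_000))  # -> 100000000
```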
Terasort, last few lines of output plus summary:
duration: ~9 min
17/05/24 15:45:53 INFO mapreduce.Job: Counters: 35
  File System Counters
    FILE: Number of bytes read=282857735960
    FILE: Number of bytes written=424889341747
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=182074929179
    HDFS: Number of bytes written=44922543979
    HDFS: Number of read operations=1359
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=126
  Map-Reduce Framework
    Map input records=100000000
    Map output records=100000000
    Map output bytes=10200000000
    Map output materialized bytes=10400000912
    Input split bytes=1919
    Combine input records=0
    Combine output records=0
    Reduce input groups=100000000
    Reduce shuffle bytes=10400000912
    Reduce input records=100000000
    Reduce output records=100000000
    Spilled Records=300000000
    Shuffled Maps =152
    Failed Shuffles=0
    Merged Map outputs=152
    GC time elapsed (ms)=3888
    Total committed heap usage (bytes)=13784580096
  Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
  File Input Format Counters
    Bytes Read=10000000000
  File Output Format Counters
    Bytes Written=10000000000
17/05/24 15:45:53 INFO terasort.TeraSort: done
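From the stated ~9 min duration and the 10,000,000,000 input bytes read, the end-to-end sort throughput works out to roughly 18-19 MB/s; a quick back-of-the-envelope check:

```python
# Rough throughput from the counters above; the exact duration is
# only stated as "~9 min", so this is an approximation.
input_bytes = 10_000_000_000   # "File Input Format Counters / Bytes Read"
duration_s = 9 * 60            # stated "duration: ~9 min"

throughput_mb_s = input_bytes / duration_s / 1_000_000
print(f"{throughput_mb_s:.1f} MB/s")  # ~18.5 MB/s
```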
Commands:
# Clean up output from any previous run
./bin/hdfs dfs -rm -r -f hdfs://hdfs/teraInputTB hdfs://hdfs/teraOutputTB hdfs://hdfs/teraValidateTB

# TeraGen: generate $DATA_RECORDS 100-byte rows into teraInputTB
./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0-cdh5.9.1.jar teragen \
  -Ddfs.block.size=536870912 \
  -Dmapred.map.tasks=16 \
  -Dmapred.reduce.tasks=8 \
  -Dmapred.map.tasks.speculative.execution=true \
  -Dmapred.compress.map.output=true \
  $DATA_RECORDS hdfs://hdfs/teraInputTB

# TeraSort: sort teraInputTB into teraOutputTB (512 MB HDFS blocks)
./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0-cdh5.9.1.jar terasort \
  -Ddfs.block.size=536870912 \
  -Dio.file.buffer.size=32768 \
  -Dmapred.map.tasks=16 \
  -Dmapred.reduce.tasks=8 \
  -Dio.sort.factor=48 \
  -Dio.sort.record.percent=0.138 \
  hdfs://hdfs/teraInputTB hdfs://hdfs/teraOutputTB
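The cleanup command also removes hdfs://hdfs/teraValidateTB, which suggests a validation step follows; a sketch of the corresponding teravalidate invocation (same examples jar, paths assumed from the commands above, needs a running cluster):

```shell
# TeraValidate: checks that teraOutputTB is globally sorted and that its
# checksum matches TeraGen's; writes its report to teraValidateTB.
./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0-cdh5.9.1.jar teravalidate \
  hdfs://hdfs/teraOutputTB hdfs://hdfs/teraValidateTB
```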