Created September 7, 2014 20:18
Running the Hadoop grep example
hadoop fs -put /etc/hadoop/conf/*.xml input
[bcampbell@localhost ~]$ hadoop fs -ls input
Found 7 items
-rw-r--r-- 1 bcampbell supergroup 507105 2014-09-07 15:55 input/Milton_ParadiseLost.txt
-rw-r--r-- 1 bcampbell supergroup 246679 2014-09-07 15:55 input/WilliamYeats.txt
-rw-r--r-- 1 bcampbell supergroup 2133 2014-09-07 15:58 input/core-site.xml
-rw-r--r-- 1 bcampbell supergroup 2324 2014-09-07 15:58 input/hdfs-site.xml
-rw-r--r-- 1 bcampbell supergroup 246679 2014-09-07 15:56 input/inputWC
-rw-r--r-- 1 bcampbell supergroup 1549 2014-09-07 15:58 input/mapred-site.xml
-rw-r--r-- 1 bcampbell supergroup 2375 2014-09-07 15:58 input/yarn-site.xml
[bcampbell@localhost ~]$ hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar grep input output23 'dfs[a-z.]+'
14/09/07 16:00:07 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/09/07 16:00:07 WARN mapreduce.JobSubmitter: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
14/09/07 16:00:07 INFO input.FileInputFormat: Total input paths to process : 7
14/09/07 16:00:08 INFO mapreduce.JobSubmitter: number of splits:7
14/09/07 16:00:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1410054700839_0002
14/09/07 16:00:09 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources.
14/09/07 16:00:09 INFO impl.YarnClientImpl: Submitted application application_1410054700839_0002
14/09/07 16:00:09 INFO mapreduce.Job: The url to track the job: http://localhost.localdomain:8088/proxy/application_1410054700839_0002/
14/09/07 16:00:09 INFO mapreduce.Job: Running job: job_1410054700839_0002
14/09/07 16:00:18 INFO mapreduce.Job: Job job_1410054700839_0002 running in uber mode : false
14/09/07 16:00:18 INFO mapreduce.Job: map 0% reduce 0%
14/09/07 16:00:23 INFO mapreduce.Job: map 29% reduce 0%
14/09/07 16:00:24 INFO mapreduce.Job: map 43% reduce 0%
14/09/07 16:00:25 INFO mapreduce.Job: map 57% reduce 0%
14/09/07 16:00:26 INFO mapreduce.Job: map 100% reduce 0%
14/09/07 16:00:30 INFO mapreduce.Job: map 100% reduce 100%
14/09/07 16:00:30 INFO mapreduce.Job: Job job_1410054700839_0002 completed successfully
14/09/07 16:00:30 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=330
        FILE: Number of bytes written=740425
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=1009700
        HDFS: Number of bytes written=470
        HDFS: Number of read operations=24
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=7
        Launched reduce tasks=1
        Data-local map tasks=7
        Total time spent by all maps in occupied slots (ms)=20069
        Total time spent by all reduces in occupied slots (ms)=3482
        Total time spent by all map tasks (ms)=20069
        Total time spent by all reduce tasks (ms)=3482
        Total vcore-seconds taken by all map tasks=20069
        Total vcore-seconds taken by all reduce tasks=3482
        Total megabyte-seconds taken by all map tasks=20550656
        Total megabyte-seconds taken by all reduce tasks=3565568
    Map-Reduce Framework
        Map input records=27113
        Map output records=10
        Map output bytes=304
        Map output materialized bytes=366
        Input split bytes=856
        Combine input records=10
        Combine output records=10
        Reduce input groups=10
        Reduce shuffle bytes=366
        Reduce input records=10
        Reduce output records=10
        Spilled Records=20
        Shuffled Maps =7
        Failed Shuffles=0
        Merged Map outputs=7
        GC time elapsed (ms)=323
        CPU time spent (ms)=6260
        Physical memory (bytes) snapshot=2039488512
        Virtual memory (bytes) snapshot=5680246784
        Total committed heap usage (bytes)=1610612736
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=1008844
    File Output Format Counters
        Bytes Written=470
14/09/07 16:00:30 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/09/07 16:00:30 WARN mapreduce.JobSubmitter: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
14/09/07 16:00:30 INFO input.FileInputFormat: Total input paths to process : 1
14/09/07 16:00:30 INFO mapreduce.JobSubmitter: number of splits:1
14/09/07 16:00:30 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1410054700839_0003
14/09/07 16:00:30 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources.
14/09/07 16:00:30 INFO impl.YarnClientImpl: Submitted application application_1410054700839_0003
14/09/07 16:00:30 INFO mapreduce.Job: The url to track the job: http://localhost.localdomain:8088/proxy/application_1410054700839_0003/
14/09/07 16:00:30 INFO mapreduce.Job: Running job: job_1410054700839_0003
14/09/07 16:00:37 INFO mapreduce.Job: Job job_1410054700839_0003 running in uber mode : false
14/09/07 16:00:37 INFO mapreduce.Job: map 0% reduce 0%
14/09/07 16:00:43 INFO mapreduce.Job: map 100% reduce 0%
14/09/07 16:00:49 INFO mapreduce.Job: map 100% reduce 100%
14/09/07 16:00:50 INFO mapreduce.Job: Job job_1410054700839_0003 completed successfully
14/09/07 16:00:50 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=330
        FILE: Number of bytes written=184533
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=605
        HDFS: Number of bytes written=244
        HDFS: Number of read operations=7
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=3171
        Total time spent by all reduces in occupied slots (ms)=3435
        Total time spent by all map tasks (ms)=3171
        Total time spent by all reduce tasks (ms)=3435
        Total vcore-seconds taken by all map tasks=3171
        Total vcore-seconds taken by all reduce tasks=3435
        Total megabyte-seconds taken by all map tasks=3247104
        Total megabyte-seconds taken by all reduce tasks=3517440
    Map-Reduce Framework
        Map input records=10
        Map output records=10
        Map output bytes=304
        Map output materialized bytes=330
        Input split bytes=135
        Combine input records=0
        Combine output records=0
        Reduce input groups=1
        Reduce shuffle bytes=330
        Reduce input records=10
        Reduce output records=10
        Spilled Records=20
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=58
        CPU time spent (ms)=2140
        Physical memory (bytes) snapshot=431476736
        Virtual memory (bytes) snapshot=1437347840
        Total committed heap usage (bytes)=402653184
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=470
    File Output Format Counters
        Bytes Written=244
[bcampbell@localhost ~]$ hadoop fs -ls output23
Found 2 items
-rw-r--r-- 1 bcampbell supergroup 0 2014-09-07 16:00 output23/_SUCCESS
-rw-r--r-- 1 bcampbell supergroup 244 2014-09-07 16:00 output23/part-r-00000
[bcampbell@localhost ~]$ hadoop fs -cat output23/part-r-00000 | head
1 dfs.safemode.min.datanodes
1 dfs.safemode.extension
1 dfs.replication
1 dfs.namenode.name.dir
1 dfs.namenode.checkpoint.dir
1 dfs.domain.socket.path
1 dfs.datanode.hdfs
1 dfs.datanode.data.dir
1 dfs.client.read.shortcircuit
1 dfs.client.file
[bcampbell@localhost ~]$
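
The run above launches two jobs: the first counts every match of the regular expression across the input files (7 splits, 10 distinct matches), and the second sorts those counts in descending order into a single output file. For a quick sanity check without a cluster, roughly the same result can be sketched with standard text tools on local copies of the input. This is only an approximation under stated assumptions: `count_matches` is a hypothetical helper name, it relies on the `-o` and `-h` options of GNU or BSD grep, and the reverse-alphabetical tie-break is chosen here only to mirror the listing above; the order of equal counts in the Hadoop output should not be relied on.

```shell
# Hypothetical helper (not part of Hadoop): count_matches REGEX FILE...
# Prints "count<TAB>match" lines, highest count first.
count_matches() {
    regex=$1
    shift
    # -o prints only the matched text, -h suppresses file name prefixes,
    # -E enables extended regular expressions such as 'dfs[a-z.]+'.
    LC_ALL=C grep -ohE -- "$regex" "$@" |
        LC_ALL=C sort |                    # group identical matches together
        uniq -c |                          # count each distinct match
        LC_ALL=C sort -k1,1nr -k2,2r |     # count descending, ties reverse-alphabetical (assumption)
        awk '{ print $1 "\t" $2 }'         # drop the padding that uniq -c adds
}
```

For example, `count_matches 'dfs[a-z.]+' /etc/hadoop/conf/*.xml` scans the same local configuration files that were uploaded with `hadoop fs -put`; the text files in `input` would have to be passed as well to cover everything the job read.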