@callingmedic911
Last active September 12, 2023 04:34
hadoop_2_node.txt
adpa2403@cluster-a3e1-m:~/lab-2-convert-wordcount-to-urlcount-callingmedic911$ hdfs dfsadmin -report
Configured Capacity: 210853076992 (196.37 GB)
Present Capacity: 172264705966 (160.43 GB)
DFS Remaining: 172263632896 (160.43 GB)
DFS Used: 1073070 (1.02 MB)
DFS Used%: 0.00%
Replicated Blocks:
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
Erasure Coded Block Groups:
Low redundancy block groups: 0
Block groups with corrupt internal blocks: 0
Missing block groups: 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (2):
Name: 10.128.0.6:9866 (cluster-a02d-w-0.c.pro-visitor-398803.internal)
Hostname: cluster-a02d-w-0.c.pro-visitor-398803.internal
Decommission Status : Normal
Configured Capacity: 105426538496 (98.19 GB)
DFS Used: 536535 (523.96 KB)
Non DFS Used: 14881959977 (13.86 GB)
DFS Remaining: 86131822592 (80.22 GB)
DFS Used%: 0.00%
DFS Remaining%: 81.70%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 0
Last contact: Tue Sep 12 04:19:36 UTC 2023
Last Block Report: Tue Sep 12 04:16:21 UTC 2023
Num of Blocks: 5
Name: 10.128.0.7:9866 (cluster-a02d-w-1.c.pro-visitor-398803.internal)
Hostname: cluster-a02d-w-1.c.pro-visitor-398803.internal
Decommission Status : Normal
Configured Capacity: 105426538496 (98.19 GB)
DFS Used: 536535 (523.96 KB)
Non DFS Used: 14881972265 (13.86 GB)
DFS Remaining: 86131810304 (80.22 GB)
DFS Used%: 0.00%
DFS Remaining%: 81.70%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 0
Last contact: Tue Sep 12 04:19:36 UTC 2023
Last Block Report: Tue Sep 12 04:16:21 UTC 2023
Num of Blocks: 5
adpa2403@cluster-a3e1-m:~/lab-2-convert-wordcount-to-urlcount-callingmedic911$ make urlcount
javac -classpath /etc/hadoop/conf:/usr/lib/hadoop/lib/*:/usr/lib/hadoop/.//*:/usr/lib/hadoop-hdfs/./:/usr/lib/hadoop-hdfs/lib/*:/usr/lib/hadoop-hdfs/.//*:/usr/lib/hadoop-mapreduce/.//*:/usr/lib/hadoop-yarn/lib/*:/usr/lib/hadoop-yarn/.//*:/usr/local/share/google/dataproc/lib/conscrypt.jar:/usr/local/share/google/dataproc/lib/dataproc-spark-plugins-1.0.0.jar:/usr/local/share/google/dataproc/lib/dataproc-spark-plugins.jar:/usr/local/share/google/dataproc/lib/gcs-connector-hadoop3-2.2.15.jar:/usr/local/share/google/dataproc/lib/gcs-connector.jar:/usr/local/share/google/dataproc/lib/spark-bigquery-connector.jar:/usr/local/share/google/dataproc/lib/spark-bigquery-with-dependencies_2.12-0.27.1.jar:/usr/local/share/google/dataproc/lib/spark-metrics-listener-dataproc-2.1-1.6.0.jar:/usr/local/share/google/dataproc/lib/spark-metrics-listener.jar -d ./ UrlCount.java
jar cf UrlCount.jar UrlCount*.class
rm -f UrlCount*.class
rm -rf output
hadoop jar UrlCount.jar UrlCount input output
2023-09-12 04:18:07,396 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at cluster-a02d-m.c.pro-visitor-398803.internal./10.128.0.5:8032
2023-09-12 04:18:07,638 INFO client.AHSProxy: Connecting to Application History server at cluster-a02d-m.c.pro-visitor-398803.internal./10.128.0.5:10200
2023-09-12 04:18:08,080 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2023-09-12 04:18:08,105 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/adpa2403/.staging/job_1694492157007_0001
2023-09-12 04:18:08,761 INFO input.FileInputFormat: Total input files to process : 2
2023-09-12 04:18:08,929 INFO mapreduce.JobSubmitter: number of splits:2
2023-09-12 04:18:09,432 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1694492157007_0001
2023-09-12 04:18:09,432 INFO mapreduce.JobSubmitter: Executing with tokens: []
2023-09-12 04:18:09,778 INFO conf.Configuration: resource-types.xml not found
2023-09-12 04:18:09,780 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2023-09-12 04:18:10,417 INFO impl.YarnClientImpl: Submitted application application_1694492157007_0001
2023-09-12 04:18:10,698 INFO mapreduce.Job: The url to track the job: http://cluster-a02d-m.c.pro-visitor-398803.internal.:8088/proxy/application_1694492157007_0001/
2023-09-12 04:18:10,707 INFO mapreduce.Job: Running job: job_1694492157007_0001
2023-09-12 04:18:29,973 INFO mapreduce.Job: Job job_1694492157007_0001 running in uber mode : false
2023-09-12 04:18:29,974 INFO mapreduce.Job: map 0% reduce 0%
2023-09-12 04:18:48,125 INFO mapreduce.Job: map 100% reduce 0%
2023-09-12 04:18:57,205 INFO mapreduce.Job: map 100% reduce 33%
2023-09-12 04:19:03,247 INFO mapreduce.Job: map 100% reduce 67%
2023-09-12 04:19:04,254 INFO mapreduce.Job: map 100% reduce 100%
2023-09-12 04:19:08,285 INFO mapreduce.Job: Job job_1694492157007_0001 completed successfully
2023-09-12 04:19:08,434 INFO mapreduce.Job: Counters: 55
File System Counters
FILE: Number of bytes read=85638
FILE: Number of bytes written=1602420
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=507958
HDFS: Number of bytes written=212
HDFS: Number of read operations=21
HDFS: Number of large read operations=0
HDFS: Number of write operations=9
HDFS: Number of bytes read erasure-coded=0
Job Counters
Killed reduce tasks=1
Launched map tasks=2
Launched reduce tasks=3
Data-local map tasks=2
Total time spent by all maps in occupied slots (ms)=100928323
Total time spent by all reduces in occupied slots (ms)=103359857
Total time spent by all map tasks (ms)=30799
Total time spent by all reduce tasks (ms)=31541
Total vcore-milliseconds taken by all map tasks=30799
Total vcore-milliseconds taken by all reduce tasks=31541
Total megabyte-milliseconds taken by all map tasks=100928323
Total megabyte-milliseconds taken by all reduce tasks=103359857
Map-Reduce Framework
Map input records=3388
Map output records=2391
Map output bytes=80819
Map output materialized bytes=85656
Input split bytes=226
Combine input records=0
Combine output records=0
Reduce input groups=1941
Reduce shuffle bytes=85656
Reduce input records=2391
Reduce output records=7
Spilled Records=4782
Shuffled Maps =6
Failed Shuffles=0
Merged Map outputs=6
GC time elapsed (ms)=407
CPU time spent (ms)=5230
Physical memory (bytes) snapshot=2129780736
Virtual memory (bytes) snapshot=23592620032
Total committed heap usage (bytes)=1847590912
Peak Map Physical memory (bytes)=567480320
Peak Map Virtual memory (bytes)=4713017344
Peak Reduce Physical memory (bytes)=372199424
Peak Reduce Virtual memory (bytes)=4727394304
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=507732
File Output Format Counters
Bytes Written=212
adpa2403@cluster-a3e1-m:~/lab-2-convert-wordcount-to-urlcount-callingmedic911$ hadoop fs -getmerge /user/adpa2403/output result.txt
adpa2403@cluster-a3e1-m:~/lab-2-convert-wordcount-to-urlcount-callingmedic911$ cat result.txt
/wiki/MapReduce 7
mw-data:TemplateStyles:r1133582631 121
mw-data:TemplateStyles:r886049734 12
/wiki/Doi_(identifier) 18
/wiki/ISBN_(identifier) 18
/wiki/S2CID_(identifier) 14
mw-data:TemplateStyles:r1129693374 6
adpa2403@cluster-a3e1-m:~/lab-2-convert-wordcount-to-urlcount-callingmedic911$ time hadoop jar UrlCount.jar UrlCount input output
2023-09-12 04:23:21,771 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at cluster-a02d-m.c.pro-visitor-398803.internal./10.128.0.5:8032
2023-09-12 04:23:21,994 INFO client.AHSProxy: Connecting to Application History server at cluster-a02d-m.c.pro-visitor-398803.internal./10.128.0.5:10200
2023-09-12 04:23:22,255 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2023-09-12 04:23:22,287 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/adpa2403/.staging/job_1694492157007_0002
2023-09-12 04:23:22,790 INFO input.FileInputFormat: Total input files to process : 2
2023-09-12 04:23:22,896 INFO mapreduce.JobSubmitter: number of splits:2
2023-09-12 04:23:23,334 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1694492157007_0002
2023-09-12 04:23:23,334 INFO mapreduce.JobSubmitter: Executing with tokens: []
2023-09-12 04:23:23,644 INFO conf.Configuration: resource-types.xml not found
2023-09-12 04:23:23,645 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2023-09-12 04:23:23,793 INFO impl.YarnClientImpl: Submitted application application_1694492157007_0002
2023-09-12 04:23:23,886 INFO mapreduce.Job: The url to track the job: http://cluster-a02d-m.c.pro-visitor-398803.internal.:8088/proxy/application_1694492157007_0002/
2023-09-12 04:23:23,887 INFO mapreduce.Job: Running job: job_1694492157007_0002
2023-09-12 04:23:38,059 INFO mapreduce.Job: Job job_1694492157007_0002 running in uber mode : false
2023-09-12 04:23:38,060 INFO mapreduce.Job: map 0% reduce 0%
2023-09-12 04:23:54,187 INFO mapreduce.Job: map 100% reduce 0%
2023-09-12 04:24:03,239 INFO mapreduce.Job: map 100% reduce 33%
2023-09-12 04:24:10,274 INFO mapreduce.Job: map 100% reduce 100%
2023-09-12 04:24:14,319 INFO mapreduce.Job: Job job_1694492157007_0002 completed successfully
2023-09-12 04:24:14,450 INFO mapreduce.Job: Counters: 55
File System Counters
FILE: Number of bytes read=85638
FILE: Number of bytes written=1602420
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=507958
HDFS: Number of bytes written=212
HDFS: Number of read operations=21
HDFS: Number of large read operations=0
HDFS: Number of write operations=9
HDFS: Number of bytes read erasure-coded=0
Job Counters
Killed reduce tasks=1
Launched map tasks=2
Launched reduce tasks=3
Data-local map tasks=2
Total time spent by all maps in occupied slots (ms)=93587843
Total time spent by all reduces in occupied slots (ms)=103753097
Total time spent by all map tasks (ms)=28559
Total time spent by all reduce tasks (ms)=31661
Total vcore-milliseconds taken by all map tasks=28559
Total vcore-milliseconds taken by all reduce tasks=31661
Total megabyte-milliseconds taken by all map tasks=93587843
Total megabyte-milliseconds taken by all reduce tasks=103753097
Map-Reduce Framework
Map input records=3388
Map output records=2391
Map output bytes=80819
Map output materialized bytes=85656
Input split bytes=226
Combine input records=0
Combine output records=0
Reduce input groups=1941
Reduce shuffle bytes=85656
Reduce input records=2391
Reduce output records=7
Spilled Records=4782
Shuffled Maps =6
Failed Shuffles=0
Merged Map outputs=6
GC time elapsed (ms)=519
CPU time spent (ms)=5400
Physical memory (bytes) snapshot=2069057536
Virtual memory (bytes) snapshot=23596089344
Total committed heap usage (bytes)=1780482048
Peak Map Physical memory (bytes)=526888960
Peak Map Virtual memory (bytes)=4714541056
Peak Reduce Physical memory (bytes)=346714112
Peak Reduce Virtual memory (bytes)=4727676928
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=507732
File Output Format Counters
Bytes Written=212
real 0m58.118s
user 0m12.076s
sys 0m0.598s