ACM DM - Multitool exercise
# get Multitool from GitHub (simplest is to download it as a ZIP)
# https://github.com/Cascading/cascading.multitool
# to save time, we'll skip the JAR compile/build...
# instead, download the prebuilt JAR file from:
# https://s3.amazonaws.com/ceteri-mapred/multitool.jar
# cd into your cascading.multitool download, then remove any previous output
bash-3.2$ rm -rf output
bash-3.2$ hadoop jar ./multitool.jar source=data/days.txt select=Tuesday sink=output/tuesday.txt
Warning: $HADOOP_HOME is deprecated.
12/10/14 12:43:11 INFO multitool.Main: key: source
12/10/14 12:43:11 INFO multitool.Main: key: select
12/10/14 12:43:11 INFO multitool.Main: key: sink
12/10/14 12:43:11 INFO util.HadoopUtil: resolving application jar from found main method on: multitool.Main
12/10/14 12:43:11 INFO planner.HadoopPlanner: using application jar: /Users/ceteri/Downloads/multitool.jar
12/10/14 12:43:11 INFO property.AppProps: using app.id: 179169EFDA360324B08F975B10EABE23
2012-10-14 12:43:11.671 java[49161:1903] Unable to load realm info from SCDynamicStore
12/10/14 12:43:11 INFO flow.Flow: [multitool] starting
12/10/14 12:43:11 INFO flow.Flow: [multitool] source: Hfs["TextLine[[0:1]->[ALL]]"]["data/days.txt"]"]
12/10/14 12:43:11 INFO flow.Flow: [multitool] sink: Hfs["TextDelimited[[UNKNOWN]->[0]]"]["output/tuesday.txt"]"]
12/10/14 12:43:11 INFO flow.Flow: [multitool] parallel execution is enabled: false
12/10/14 12:43:11 INFO flow.Flow: [multitool] starting jobs: 1
12/10/14 12:43:11 INFO flow.Flow: [multitool] allocating threads: 1
12/10/14 12:43:11 INFO flow.FlowStep: [multitool] starting step: (1/1) output/tuesday.txt
12/10/14 12:43:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12/10/14 12:43:11 WARN snappy.LoadSnappy: Snappy native library not loaded
12/10/14 12:43:11 INFO mapred.FileInputFormat: Total input paths to process : 1
12/10/14 12:43:12 INFO flow.FlowStep: [multitool] submitted hadoop job: job_local_0001
12/10/14 12:43:12 INFO mapred.Task: Using ResourceCalculatorPlugin : null
12/10/14 12:43:12 INFO io.MultiInputSplit: current split input path: file:/Users/ceteri/src/concur/cascading.multitool/data/days.txt
12/10/14 12:43:12 INFO mapred.MapTask: numReduceTasks: 0
12/10/14 12:43:12 INFO hadoop.FlowMapper: cascading version: Concurrent, Inc - Cascading 2.0.6-wip-362
12/10/14 12:43:12 INFO hadoop.FlowMapper: child jvm opts: -server -Xmx512m
12/10/14 12:43:12 INFO hadoop.FlowMapper: sourcing from: Hfs["TextLine[[0:1]->[ALL]]"]["data/days.txt"]"]
12/10/14 12:43:12 INFO hadoop.FlowMapper: sinking to: Hfs["TextDelimited[[UNKNOWN]->[0]]"]["output/tuesday.txt"]"]
12/10/14 12:43:12 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
12/10/14 12:43:12 INFO mapred.LocalJobRunner:
12/10/14 12:43:12 INFO mapred.Task: Task attempt_local_0001_m_000000_0 is allowed to commit now
12/10/14 12:43:12 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local_0001_m_000000_0' to file:/Users/ceteri/src/concur/cascading.multitool/output/tuesday.txt
12/10/14 12:43:15 INFO mapred.LocalJobRunner: file:/Users/ceteri/src/concur/cascading.multitool/data/days.txt:0+726
12/10/14 12:43:15 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
12/10/14 12:43:17 INFO util.Hadoop18TapUtil: deleting temp path output/tuesday.txt/_temporary
bash-3.2$ ls output/tuesday.txt/
_SUCCESS part-00000
bash-3.2$ cat output/tuesday.txt/part-00000
Monday's child is fair in face, Tuesday's child is full of grace, Wednesday's child is full of woe.
We're still investigating. I heard that Monday or Tuesday we will probably be having a press conference announcing more.
We take time to go to a restaurant two times a week. A little candlelight, dinner, soft music and dancing. She goes Tuesdays, I go Fridays.
bash-3.2$
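
As a sanity check on the result above, the `select=Tuesday` step appears to behave like a line filter on a pattern, so a plain `grep` gives a rough local equivalent without Hadoop. This is only a sketch under that assumption: the three-line sample file, the `workdir` temp directory, and the `matched` variable are all hypothetical stand-ins, not the real `data/days.txt` or anything Multitool produces.

```shell
#!/bin/sh
# Hypothetical sample input; the real data/days.txt is not reproduced here.
workdir=$(mktemp -d)
cat > "$workdir/days.txt" <<'EOF'
Monday's child is fair in face, Tuesday's child is full of grace.
Thursday's child has far to go.
She goes Tuesdays, I go Fridays.
EOF

# Keep only the lines that match the pattern, which is what
# select=Tuesday appears to do in the run above (an assumption).
mkdir -p "$workdir/output"
grep 'Tuesday' "$workdir/days.txt" > "$workdir/output/tuesday.txt"

# Count and show the surviving lines.
matched=$(wc -l < "$workdir/output/tuesday.txt" | tr -d ' ')
echo "matched lines: $matched"
cat "$workdir/output/tuesday.txt"
```

One difference to keep in mind: `grep` writes a single file here, whereas the Hadoop job writes its sink as a directory holding `part-00000` and a `_SUCCESS` marker, as the `ls` output shows.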