@Quantisan
Created August 23, 2012 20:21
Impatient part 1
$ more output/rain/part-00000
doc_id text
doc01 A rain shadow is a dry area on the lee back side of a mountainous area.
doc02 This sinking, dry air produces a rain shadow, or area in the lee of a mountain with less rain and cloudcover.
doc03 A rain shadow is an area of dry land that lies on the leeward (or downwind) side of a mountain.
doc04 This is known as the rain shadow effect and is the primary cause of leeward deserts of mountain ranges, such as California's Death Valley.
doc05 Two Women. Secrets. A Broken Land. [DVD Australia]
$ hadoop jar target/impatient.jar data/rain.txt output/rain
12/08/23 21:19:31 INFO util.HadoopUtil: resolving application jar from found main method on: impatient.core
12/08/23 21:19:31 INFO planner.HadoopPlanner: using application jar: /Users/paullam/Dropbox/Projects/Impatient/part1/target/impatient.jar
12/08/23 21:19:31 INFO property.AppProps: using app.id: F48AFC0890AD7189158F5B4EEABA1455
2012-08-23 21:19:31.662 java[12900:1903] Unable to load realm info from SCDynamicStore
12/08/23 21:19:31 INFO util.Version: Concurrent, Inc - Cascading 2.0.0
12/08/23 21:19:31 INFO flow.Flow: [] starting
12/08/23 21:19:31 INFO flow.Flow: [] source: Hfs["TextDelimited[[UNKNOWN]->[ALL]]"]["data/rain.txt"]"]
12/08/23 21:19:31 INFO flow.Flow: [] sink: Hfs["TextDelimited[[UNKNOWN]->['?doc', '?line']]"]["output/rain"]"]
12/08/23 21:19:31 INFO flow.Flow: [] parallel execution is enabled: false
12/08/23 21:19:31 INFO flow.Flow: [] starting jobs: 1
12/08/23 21:19:31 INFO flow.Flow: [] allocating threads: 1
12/08/23 21:19:31 INFO flow.FlowStep: [] starting step: (1/1) output/rain
12/08/23 21:19:31 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
12/08/23 21:19:31 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12/08/23 21:19:32 WARN snappy.LoadSnappy: Snappy native library not loaded
12/08/23 21:19:32 INFO mapred.FileInputFormat: Total input paths to process : 1
12/08/23 21:19:32 INFO flow.FlowStep: [] submitted hadoop job: job_local_0001
12/08/23 21:19:32 INFO hadoop.TupleSerialization: using default comparator: cascalog.hadoop.DefaultComparator
12/08/23 21:19:32 INFO io.MultiInputSplit: current split input path: file:/Users/paullam/Dropbox/Projects/Impatient/part1/data/rain.txt
12/08/23 21:19:32 INFO mapred.MapTask: numReduceTasks: 0
12/08/23 21:19:32 INFO hadoop.FlowMapper: sourcing from: Hfs["TextDelimited[[UNKNOWN]->[ALL]]"]["data/rain.txt"]"]
12/08/23 21:19:32 INFO hadoop.FlowMapper: sinking to: Hfs["TextDelimited[[UNKNOWN]->['?doc', '?line']]"]["output/rain"]"]
12/08/23 21:19:32 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
12/08/23 21:19:32 INFO mapred.LocalJobRunner:
12/08/23 21:19:32 INFO mapred.Task: Task attempt_local_0001_m_000000_0 is allowed to commit now
12/08/23 21:19:32 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local_0001_m_000000_0' to file:/Users/paullam/Dropbox/Projects/Impatient/part1/output/rain
12/08/23 21:19:32 INFO mapred.LocalJobRunner: file:/Users/paullam/Dropbox/Projects/Impatient/part1/data/rain.txt:0+510
12/08/23 21:19:32 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
12/08/23 21:19:33 INFO util.Hadoop18TapUtil: deleting temp path output/rain/_temporary