Install pandoc on Mac OS X 10.8
$ brew install haskell-platform
| user www-data; | |
| # As a rule of thumb: one per CPU. If you are serving a large number | |
| # of static files, which requires blocking disk reads, you may want | |
| # to increase this beyond the number of cpu_cores available on your | |
| # system. | |
| # | |
| # The maximum number of connections for Nginx is calculated by: | |
| # max_clients = worker_processes * worker_connections | |
| worker_processes 1; |
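The connection limit described in the comments above is a simple product. A minimal sketch of the arithmetic follows; `worker_processes = 1` comes from the config shown, while the `worker_connections` value of 1024 is an assumption for illustration only, since that directive does not appear in this excerpt.

```python
def max_clients(worker_processes, worker_connections):
    """Upper bound on simultaneous connections, per the formula in the config comment."""
    return worker_processes * worker_connections


# worker_processes is taken from the config above.
# worker_connections = 1024 is an assumed value, not from the source.
print(max_clients(worker_processes=1, worker_connections=1024))
```

Raising either factor raises the bound, which is why the comment suggests increasing `worker_processes` when blocking disk reads keep workers busy.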
| server { | |
| listen 80 default; ## listen for ipv4; this line is default and implied | |
| listen [::]:80 default ipv6only=on; ## listen for ipv6 | |
| # Make site accessible from http://localhost/ | |
| server_name localhost; | |
| server_name_in_redirect off; | |
| charset utf-8; |
| This gist includes components of a simple workflow application that creates a directory and moves files within | |
| HDFS to this directory. | |
| Emails are sent out to notify designated users of the success or failure of the workflow. There is a prepare section | |
| to allow the action to be re-run; the prepare step essentially undoes the move done by a potential prior run | |
| of the action. Sample data is also included. | |
| The sample application includes: | |
| -------------------------------- | |
| 1. Oozie actions: hdfs action and email action | |
| 2. Oozie workflow controls: start, end, and kill. |
| This gist includes components of an Oozie workflow: scripts/code, sample data, | |
| and commands. Oozie actions covered: java mapreduce action. Oozie controls | |
| covered: start, kill, end. The Java program uses regex to parse the logs; it | |
| also extracts the mapper's input directory path and includes it in the | |
| key emitted. | |
| Note: The reducer can be specified as a combiner as well. | |
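The idea of folding the input directory into the emitted key can be sketched as follows. This is an illustration in Python rather than the gist's Java code; the function name `build_key`, the choice of the immediate parent directory, the `-` separator, and the sample path are all assumptions, since the excerpt does not say which part of the path is used.

```python
import posixpath


def build_key(input_file_path, month, process):
    """Prefix the key with the name of the directory holding the input file.

    Hypothetical helper: the real mapper may use a different part of the path.
    """
    parent_dir = posixpath.basename(posixpath.dirname(input_file_path))
    return "{}-{}-{}".format(parent_dir, month, process)


# Made-up HDFS-style path, for illustration only.
print(build_key("/data/logs/2013/04/messages", "Apr", "sshd"))
```

Because the reducer only sums counts per key, the same summing logic can run as a combiner without changing the result.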
| Usecase | |
| ------- |
| This gist includes components of an Oozie coordinator job triggered by dataset availability: | |
| scripts/code, sample data, and commands. Oozie actions covered: hdfs action, email action, | |
| sqoop action (MySQL database). Oozie controls covered: decision. | |
| Usecase | |
| ------- | |
| Pipe report data available in HDFS to a MySQL database. | |
| Pictorial overview of job: | |
| -------------------------- |
| This gist includes components of a simple workflow application (Oozie 3.3.0) that | |
| pipes data in a Hive table to MySQL. | |
| The sample application includes: | |
| -------------------------------- | |
| 1. Oozie actions: sqoop action | |
| 2. Oozie workflow controls: start, end, and kill. | |
| 3. Workflow components: job.properties and workflow.xml | |
| 4. Sample data | |
| 5. Prep tasks in Hive |
| This gist includes components of an Oozie coordinator job initiated by a trigger file: | |
| scripts/code, sample data, and commands. Oozie actions covered: hdfs action, email action, | |
| java main action, hive action. Oozie controls covered: decision, fork-join. The workflow | |
| includes a sub-workflow that runs two hive actions concurrently. The Hive table is | |
| partitioned. Parsing uses the Hive regex SerDe and Java regex. Also, the Java mapper gets | |
| the input directory path and includes part of it in the key. | |
| Usecase | |
| ------- | |
| Parse Syslog-generated log files to generate reports. |
| This gist includes components of a time-initiated Oozie coordinator application: scripts/code, sample data, | |
| and commands. Oozie actions covered: hdfs action, email action, java main action, | |
| hive action. Oozie controls covered: decision, fork-join. The workflow includes a | |
| sub-workflow that runs two hive actions concurrently. The Hive table is partitioned. | |
| Parsing uses the Hive regex SerDe and Java regex. Also, the Java mapper gets the input | |
| directory path and includes part of it in the key. | |
| Usecase: Parse Syslog-generated log files to generate reports. | |
| Pictorial overview of job: |
| This gist includes Oozie workflow components (streaming map reduce action) to execute | |
| Python mapper and reducer scripts that parse Syslog-generated log files using regex. | |
| Usecase: Count the number of occurrences of processes that got logged, by month and process. | |
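The use case above can be sketched in a few lines of Python. This is not the gist's mapper or reducer: the regex assumes the classic syslog layout (`Mon DD HH:MM:SS host process[pid]: message`), the `month-process` key format is an assumption, the log lines are made up for illustration, and the map and reduce steps run in one process here instead of as separate streaming scripts.

```python
import re
from collections import Counter

# Assumed layout: "Mon DD HH:MM:SS host process[pid]: message" (the pid is optional).
SYSLOG_RE = re.compile(
    r"^(?P<month>[A-Z][a-z]{2})\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\S+\s+"
    r"(?P<process>[^\s\[:]+)(?:\[\d+\])?:"
)


def map_line(line):
    """Map step: return a "month-process" key, or None if the line does not match."""
    match = SYSLOG_RE.match(line)
    if match is None:
        return None
    return "{}-{}".format(match.group("month"), match.group("process"))


def count_events(lines):
    """Reduce step: count how many lines were seen for each key."""
    keys = (map_line(line) for line in lines)
    return Counter(key for key in keys if key is not None)


# Made-up sample lines, for illustration only.
SAMPLE = [
    "May  3 11:52:54 host01 init: tty (/dev/tty6) main process (1208) killed",
    "May  3 11:53:31 host01 kernel: registered taskstats version 1",
    "May  4 08:00:01 host01 sshd[2210]: Accepted publickey for demo",
    "May  4 08:05:17 host01 sshd[2231]: Connection closed",
    "not a syslog line",
]

for key, count in sorted(count_events(SAMPLE).items()):
    print("{}\t{}".format(key, count))
```

In a streaming job the mapper would print each key with a count of 1 to standard output and the reducer would sum the counts per key read from standard input; the in-process version keeps the sketch self-contained.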
| Pictorial overview of workflow: | |
| -------------------------------- | |
| http://hadooped.blogspot.com/2013/07/apache-oozie-part-5-oozie-workflow-with.html | |
| Includes: | |
| --------- |