Try out the Sqoop task outside of the workflow
----------------------------------------------
Use the dataset from my gist:
https://gist.github.com/airawat/5970026


*********************************
Sqoop command
*********************************
 
Prerequisites:
1.  The dataset to be exported must exist on HDFS
2.  The MySQL table that is the destination for the export must exist
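
One way to check both prerequisites before running the export is sketched below. This is only a sketch: the column types in the DDL are assumptions inferred from the query output further down, not the actual schema, and the function names check_export_dir and table_ddl are hypothetical helpers. Host, user, database and directory are taken from the export command.

```shell
#!/bin/sh
# Sketch only. The column types below are ASSUMPTIONS inferred from the
# query output, not the actual schema of the destination table.

EXPORT_DIR="oozieProject/datasetGeneratorApp/outputDir"

# Prerequisite 1: succeeds only if the export directory exists on HDFS.
check_export_dir() {
    hdfs dfs -test -d "$EXPORT_DIR"
}

# Prerequisite 2: prints DDL for the destination table (hypothetical types).
table_ddl() {
    cat <<'EOF'
CREATE TABLE IF NOT EXISTS Logged_Process_Count_By_Year (
    year_and_process VARCHAR(100),
    occurrence       INT
);
EOF
}

# Typical use on the Sqoop client node (left commented out so that
# sourcing this sketch has no side effects):
#   check_export_dir && echo "export dir found"
#   table_ddl | mysql -h cdh-dev01 -u devUser -p airawat
```

The DDL is only printed, so it can be reviewed and adjusted before it is piped to the mysql client.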
 
Command:
-- Run on the node that acts as the Sqoop client
 
$ sqoop export \
--connect jdbc:mysql://cdh-dev01/airawat \
--username devUser \
--password myPwd \
--table Logged_Process_Count_By_Year \
--direct \
--export-dir "oozieProject/datasetGeneratorApp/outputDir" \
--fields-terminated-by "\t"
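
Since the export splits each record on tabs into the two table columns, it can help to confirm the field count of the data first. A minimal sketch follows, assuming the records are plain text; check_fields is a hypothetical helper, and the part-* file name pattern in the usage comment is an assumption about how the output files are named.

```shell
#!/bin/sh
# check_fields N: reads tab-separated records on stdin and exits non-zero
# at the first record whose field count is not N.
check_fields() {
    awk -F '\t' -v n="$1" '
        NF != n { printf "line %d has %d fields, expected %d\n", NR, NF, n
                  bad = 1
                  exit }
        END     { exit bad + 0 }
    '
}

# Possible use on the Sqoop client node (the part-* pattern is an assumption):
#   hdfs dfs -cat "oozieProject/datasetGeneratorApp/outputDir/part-*" | check_fields 2
```

A non-zero exit status here suggests the delimiter or the data does not match what the export expects.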
 
 
*********************************
Results in mysql
*********************************
 
mysql> select * from Logged_Process_Count_By_Year order by occurrence desc;
+----------------------------+------------+
| year_and_process           | occurrence |
+----------------------------+------------+
| 2013-ntpd_initres          |       4133 |
| 2013-kernel                |        810 |
| 2013-init                  |        166 |
| 2013-pulseaudio            |         18 |
| 2013-spice-vdagent         |         15 |
| 2013-gnome-session         |         11 |
| 2013-sudo                  |          8 |
| 2013-polkit-agent-helper-1 |          8 |
| 2013-console-kit-daemon    |          7 |
| 2013-NetworkManager        |          7 |
| 2013-udevd                 |          6 |
| 2013-sshd                  |          6 |
| 2013-nm-dispatcher.action  |          4 |
| 2013-login                 |          2 |
+----------------------------+------------+
14 rows in set (0.00 sec)