Skip to content

Instantly share code, notes, and snippets.

@lordjc
Created March 21, 2014 17:41
Show Gist options
  • Save lordjc/9691566 to your computer and use it in GitHub Desktop.
Save lordjc/9691566 to your computer and use it in GitHub Desktop.
sqoop a table from sqlserver -> avro on hdfs -> create schema -> create hive table
<workflow-app name="sqoopSqlServerAndCreateHive" xmlns="uri:oozie:workflow:0.4">
<global>
<configuration>
<property>
<name>table</name>
<value></value>
</property>
<property>
<name></name>
<value></value>
</property>
</configuration>
</global>
<start to="sqoopSqlServer"/>
<action name="sqoopSqlServer">
<sqoop xmlns="uri:oozie:sqoop-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<command>import --connect jdbc:sqlserver://54.201.152.106 --username ${username} --password ${password} --table ${table} --target-dir /etl/sqlserver/landing/${table}/${outpath} --as-avrodatafile -m 1</command>
</sqoop>
<ok to="extractSchemaAvro"/>
<error to="kill"/>
</action>
<action name="extractSchemaAvro">
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<exec>extract_schema.sh</exec>
<argument>${table}</argument>
<argument>${outpath}</argument>
<file>/user/ec2-user/avro-tools-1.7.5.jar#avro-tools-1.7.5.jar</file>
<file>/user/ec2-user/extract_schema.sh#extract_schema.sh</file>
</shell>
<ok to="createHiveTable"/>
<error to="kill"/>
</action>
<action name="createHiveTable">
<hive xmlns="uri:oozie:hive-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<job-xml>/user/ec2-user/hive-site.xml</job-xml>
<configuration>
<property>
<name>job-xml</name>
<value>/user/ec2-user/hive-site.xml</value>
</property>
<property>
<name>oozie.action.sharelib.for.hive</name>
<value>hive2</value>
</property>
<property>
<name>oozie.launcher.action.main.class</name>
<value>org.apache.oozie.action.hadoop.Hive2Main</value>
</property>
</configuration>
<script>/user/ec2-user/createSqlServerTable.hql</script>
<param>table=${table}</param>
<param>outpath=${outpath}</param>
<file>/user/ec2-user/createSqlServerTable.hql#createSqlServerTable.hql</file>
<file>/user/ec2-user/hive-site.xml#hive-site.xml</file>
</hive>
<ok to="end"/>
<error to="kill"/>
</action>
<kill name="kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment