@tariqmislam
Created March 22, 2012 15:27
Querying SQL Server Using Sqoop From Ubuntu VM
I ran into an issue importing from SQL Server with Sqoop: the import and import-all-tables options do not seem to support tables under a custom schema owner prefix (the default, 'dbo', is not a problem). A free-form --query import that names the owner explicitly gets around this.
This uses the MS SQL Server - Hadoop Connector (sqoop-sqlserver-1.0.tar.gz), found at http://download.microsoft.com. As the connector's instructions/user guide explain, you also need the Microsoft JDBC Driver (sqljdbc_3.0), which must be placed in your $SQOOP_HOME/lib directory. The driver can be downloaded from http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=21599
All of this assumes you are running Cloudera's distribution on Ubuntu 11.10 in VMware Player on Windows 7 64-bit (this is my environment, anyway).
Query:
bin/sqoop import --connect 'jdbc:sqlserver://<ip-address>;instanceName=<instance-name>;username=<user-name>;password=<password>;database=<database-name>' --query 'SELECT * FROM [Owner].[prefix].[table-name] WHERE $CONDITIONS' --split-by <column-to-split-by> --target-dir <hdfs-target-directory>
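The same command can be assembled from shell variables, which keeps the long connection string readable and makes the quoting of $CONDITIONS explicit. This is only a sketch: every value below (host, instance, credentials, database, schema, table, split column, target directory) is a hypothetical placeholder, and the three-part name assumes SQL Server's database.schema.table convention. The script prints the arguments by default and only runs Sqoop when RUN_IMPORT=1 is set (RUN_IMPORT is a name invented for this example, not a Sqoop setting).

```shell
#!/bin/sh
# Hypothetical placeholder values; replace every one with your own.
DB_HOST="192.0.2.10"
DB_INSTANCE="SQLEXPRESS"
DB_USER="hadoop_reader"
DB_PASS="changeme"
DB_NAME="SalesDb"
SCHEMA="reporting"            # custom (non-dbo) schema owner
TABLE="Orders"
SPLIT_COL="OrderId"
TARGET_DIR="/user/cloudera/orders"

# JDBC connection string in the same form as the command above.
CONNECT="jdbc:sqlserver://${DB_HOST};instanceName=${DB_INSTANCE};username=${DB_USER};password=${DB_PASS};database=${DB_NAME}"

# Inside double quotes the dollar sign is escaped so that the literal
# token $CONDITIONS reaches Sqoop instead of being expanded by the shell.
QUERY="SELECT * FROM [${DB_NAME}].[${SCHEMA}].[${TABLE}] WHERE \$CONDITIONS"

# Build the argument list once so it can be printed or executed unchanged.
set -- bin/sqoop import \
  --connect "$CONNECT" \
  --query "$QUERY" \
  --split-by "$SPLIT_COL" \
  --target-dir "$TARGET_DIR"

if [ "${RUN_IMPORT:-0}" = "1" ]; then
  "$@"                        # run the import (expects to be in $SQOOP_HOME)
else
  printf '%s\n' "$@"          # dry run: print one argument per line
fi
```

Running the script without RUN_IMPORT set lets you check that the query still ends in a literal $CONDITIONS before anything touches the database.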