Created October 30, 2017 15:36
Note: Sqoop can take a long time to retrieve the minimum and maximum values of the column named in the --split-by parameter, which it needs in order to break the data into multiple independent tasks. More importantly, rows whose --split-by column is NULL fall outside every generated split range and are silently dropped: in the session below, six rows go into MySQL but only four arrive in Hive.
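Conceptually, the MIN/MAX pair from that bounding query is carved into one range per mapper. A rough, simplified sketch of the idea in Python (this is not Sqoop's exact IntegerSplitter arithmetic, which handles more edge cases):

```python
def split_ranges(lo, hi, num_splits):
    """Divide the closed interval [lo, hi] into roughly equal,
    non-overlapping ranges, one per map task (simplified sketch of
    Sqoop's integer splitter, not its exact boundary arithmetic)."""
    step = max((hi - lo) // num_splits, 1)
    points = list(range(lo, hi, step)) + [hi]
    ranges = []
    for i in range(len(points) - 1):
        low, high = points[i], points[i + 1]
        # All but the last range are half-open so splits never overlap;
        # the final range is closed so the maximum value is included.
        inclusive = (i == len(points) - 2)
        ranges.append((low, high, inclusive))
    return ranges

# MIN(col1)=1, MAX(col1)=4, as in the BoundingValsQuery in the log below
for low, high, inclusive in split_ranges(1, 4, 4):
    op = "<=" if inclusive else "<"
    print(f"WHERE col1 >= {low} AND col1 {op} {high}")
```

A row only gets imported if its split column satisfies one of these WHERE clauses — which is exactly where NULL values fall through.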
mysql -u retail_dba -h nn01.itversity.com -pitversity
use retail_export;
mysql> create table int_splitby (col1 integer, col2 char(10));
Query OK, 0 rows affected (0.34 sec)
mysql> select * from int_splitby;
Empty set (0.00 sec)
mysql> INSERT INTO int_splitby values(1,"One");
Query OK, 1 row affected (0.05 sec)
mysql> select * from int_splitby;
+------+------+
| col1 | col2 |
+------+------+
|    1 | One  |
+------+------+
1 row in set (0.01 sec)
mysql> INSERT INTO int_splitby values(1,"One");
Query OK, 1 row affected (0.06 sec)
mysql> INSERT INTO int_splitby values(NULL,"One");
Query OK, 1 row affected (0.04 sec)
mysql> INSERT INTO int_splitby values(NULL,NULL);
Query OK, 1 row affected (0.10 sec)
mysql> INSERT INTO int_splitby values(3,"Three");
Query OK, 1 row affected (0.17 sec)
mysql> INSERT INTO int_splitby values(4,"Four");
Query OK, 1 row affected (0.05 sec)
mysql> select * from int_splitby;
+------+-------+
| col1 | col2  |
+------+-------+
|    1 | One   |
|    1 | One   |
| NULL | One   |
| NULL | NULL  |
|    3 | Three |
|    4 | Four  |
+------+-------+
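Both the bounding query and each mapper's range predicate ignore NULLs, which is why the two NULL rows above can never reach any mapper. The effect can be reproduced on the same six rows with sqlite3 standing in for MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE int_splitby (col1 INTEGER, col2 CHAR(10))")
cur.executemany(
    "INSERT INTO int_splitby VALUES (?, ?)",
    [(1, "One"), (1, "One"), (None, "One"), (None, None),
     (3, "Three"), (4, "Four")],
)

# Sqoop's BoundingValsQuery: MIN/MAX skip NULLs entirely
lo, hi = cur.execute(
    "SELECT MIN(col1), MAX(col1) FROM int_splitby"
).fetchone()
print(lo, hi)  # 1 4

# Each mapper's WHERE clause is a range test; in SQL, NULL >= x
# evaluates to NULL (not true), so NULL rows match no split.
total = 0
for low, high, last in [(1, 2, False), (2, 3, False), (3, 4, True)]:
    op = "<=" if last else "<"
    total += cur.execute(
        f"SELECT COUNT(*) FROM int_splitby "
        f"WHERE col1 >= {low} AND col1 {op} {high}"
    ).fetchone()[0]
print(total)  # 4 — the two NULL rows are dropped
```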
[nerellavinod@gw01 ~]$ sqoop import --connect "jdbc:mysql://nn01.itversity.com:3306/retail_export" --username retail_dba -password itversity --query 'select * from int_splitby WHERE $CONDITIONS' --hive-import --hive-table INT_SPLIT_1 --split-by col1 --target-dir /user/nerellavinod/sqoopimport/
Warning: /usr/hdp/2.5.0.0-1245/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
17/10/30 11:30:16 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.5.0.0-1245
17/10/30 11:30:16 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/10/30 11:30:16 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
17/10/30 11:30:16 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
17/10/30 11:30:17 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
17/10/30 11:30:17 INFO tool.CodeGenTool: Beginning code generation
17/10/30 11:30:18 INFO manager.SqlManager: Executing SQL statement: select * from int_splitby WHERE (1 = 0)
17/10/30 11:30:18 INFO manager.SqlManager: Executing SQL statement: select * from int_splitby WHERE (1 = 0)
17/10/30 11:30:18 INFO manager.SqlManager: Executing SQL statement: select * from int_splitby WHERE (1 = 0)
17/10/30 11:30:18 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.5.0.0-1245/hadoop-mapreduce
Note: /tmp/sqoop-nerellavinod/compile/875e42d1f9de6ced335702fe025a4b18/QueryResult.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/10/30 11:30:20 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-nerellavinod/compile/875e42d1f9de6ced335702fe025a4b18/QueryResult.jar
17/10/30 11:30:20 INFO mapreduce.ImportJobBase: Beginning query import.
17/10/30 11:30:23 INFO impl.TimelineClientImpl: Timeline service address: http://rm01.itversity.com:8188/ws/v1/timeline/
17/10/30 11:30:23 INFO client.RMProxy: Connecting to ResourceManager at rm01.itversity.com/172.16.1.106:8050
17/10/30 11:30:23 INFO client.AHSProxy: Connecting to Application History server at rm01.itversity.com/172.16.1.106:10200
17/10/30 11:30:32 INFO db.DBInputFormat: Using read commited transaction isolation
17/10/30 11:30:32 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(col1), MAX(col1) FROM (select * from int_splitby WHERE (1 = 1) ) AS t1
17/10/30 11:30:32 INFO db.IntegerSplitter: Split size: 0; Num splits: 4 from: 1 to: 4
17/10/30 11:30:32 INFO mapreduce.JobSubmitter: number of splits:4
17/10/30 11:30:32 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1507687444776_5127
17/10/30 11:30:33 INFO impl.YarnClientImpl: Submitted application application_1507687444776_5127
17/10/30 11:30:33 INFO mapreduce.Job: The url to track the job: http://rm01.itversity.com:8088/proxy/application_1507687444776_5127/
17/10/30 11:30:33 INFO mapreduce.Job: Running job: job_1507687444776_5127
17/10/30 11:30:39 INFO mapreduce.Job: Job job_1507687444776_5127 running in uber mode : false
17/10/30 11:30:39 INFO mapreduce.Job: map 0% reduce 0%
17/10/30 11:30:44 INFO mapreduce.Job: map 100% reduce 0%
17/10/30 11:30:45 INFO mapreduce.Job: Job job_1507687444776_5127 completed successfully
17/10/30 11:30:45 INFO mapreduce.Job: Counters: 30
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=640796
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=393
		HDFS: Number of bytes written=27
		HDFS: Number of read operations=16
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=8
	Job Counters
		Launched map tasks=4
		Other local map tasks=4
		Total time spent by all maps in occupied slots (ms)=24298
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=12149
		Total vcore-milliseconds taken by all map tasks=12149
		Total megabyte-milliseconds taken by all map tasks=24881152
	Map-Reduce Framework
		Map input records=4
		Map output records=4
		Input split bytes=393
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=178
		CPU time spent (ms)=3700
		Physical memory (bytes) snapshot=994680832
		Virtual memory (bytes) snapshot=14829936640
		Total committed heap usage (bytes)=915931136
	File Input Format Counters
		Bytes Read=0
	File Output Format Counters
		Bytes Written=27
17/10/30 11:30:45 INFO mapreduce.ImportJobBase: Transferred 27 bytes in 22.8895 seconds (1.1796 bytes/sec)
17/10/30 11:30:45 INFO mapreduce.ImportJobBase: Retrieved 4 records.
17/10/30 11:30:45 INFO mapreduce.ImportJobBase: Publishing Hive/Hcat import job data to Listeners
17/10/30 11:30:45 INFO manager.SqlManager: Executing SQL statement: select * from int_splitby WHERE (1 = 0)
17/10/30 11:30:45 INFO manager.SqlManager: Executing SQL statement: select * from int_splitby WHERE (1 = 0)
17/10/30 11:30:45 INFO hive.HiveImport: Loading uploaded data into Hive
Logging initialized using configuration in jar:file:/usr/hdp/2.5.0.0-1245/hive/lib/hive-common-1.2.1000.2.5.0.0-1245.jar!/hive-log4j.properties
OK
Time taken: 1.84 seconds
Loading data to table default.int_split_1
Table default.int_split_1 stats: [numFiles=4, numRows=0, totalSize=27, rawDataSize=0]
OK
Time taken: 0.403 seconds
[nerellavinod@gw01 ~]$ hive
Logging initialized using configuration in file:/etc/hive/2.5.0.0-1245/0/hive-log4j.properties
hive (default)> select * from int_split_1;
OK
1	One
1	One
3	Three
4	Four
Time taken: 2.232 seconds, Fetched: 4 row(s)
hive (default)>
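The Hive output above confirms the drop: only the four rows with a non-NULL col1 survived the import. If the NULL rows matter, the usual workaround is to pick a NOT NULL column for --split-by, or to COALESCE the split column inside the query so that every row falls into some range (whether a given Sqoop version accepts an expression there varies, so verify first). The effect, sketched with sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE int_splitby (col1 INTEGER, col2 CHAR(10))")
cur.executemany(
    "INSERT INTO int_splitby VALUES (?, ?)",
    [(1, "One"), (1, "One"), (None, "One"), (None, None),
     (3, "Three"), (4, "Four")],
)

# Mapping NULL to a sentinel (0) before splitting keeps every row
# inside some range; the bounds now come from the COALESCEd expression.
lo, hi = cur.execute(
    "SELECT MIN(COALESCE(col1, 0)), MAX(COALESCE(col1, 0)) "
    "FROM int_splitby"
).fetchone()
print(lo, hi)  # 0 4

total = cur.execute(
    "SELECT COUNT(*) FROM int_splitby "
    f"WHERE COALESCE(col1, 0) >= {lo} AND COALESCE(col1, 0) <= {hi}"
).fetchone()[0]
print(total)  # 6 — no rows fall outside the split bounds any more
```

Pick a sentinel that cannot collide with real values, or the NULL rows will be indistinguishable from genuine ones in the split column.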