@nileema
Created September 23, 2013 21:56
list bucketing, skewed tables
== Skewed table ==
hive:di> create table test_nileema_skewed (c1 int, c2 int, c3 string) skewed by (c1) on (5) ;
OK
Time taken: 5.572 seconds
hive:di> desc formatted test_nileema_skewed;
OK
# col_name data_type comment
c1 int None
c2 int None
c3 string None
# Detailed Table Information
Database: di
Owner: nileema
CreateTime: Mon Sep 23 14:47:20 PDT 2013
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: hdfs://dfs41.data.facebook.com:9000/namespace/di/warehouse/test_nileema_skewed
Table Type: MANAGED_TABLE
Table Parameters:
transient_lastDdlTime 1379972840
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe
InputFormat: org.apache.hadoop.hive.ql.io.RCFileInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.RCFileOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Statistics stale: Yes
Skewed Columns: [c1]
Skewed Values: [[5]]
Storage Desc Params:
serialization.format 1
Time taken: 5.611 seconds, Fetched: 31 row(s)
== List bucketed table ==
hive:di> create table test_nileema_skewed_1 (c1 int, c2 int, c3 string) skewed by (c1, c2) on ((5,4), (3,2), (7,6)) stored as directories ;
OK
Time taken: 5.153 seconds
hive:di> desc formatted test_nileema_skewed_1;
OK
# col_name data_type comment
c1 int None
c2 int None
c3 string None
# Detailed Table Information
Database: di
Owner: nileema
CreateTime: Mon Sep 23 14:48:33 PDT 2013
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: hdfs://dfs41.data.facebook.com:9000/namespace/di/warehouse/test_nileema_skewed_1
Table Type: MANAGED_TABLE
Table Parameters:
transient_lastDdlTime 1379972913
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe
InputFormat: org.apache.hadoop.hive.ql.io.RCFileInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.RCFileOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Stored As SubDirectories: Yes
Statistics stale: Yes
Skewed Columns: [c1, c2]
Skewed Values: [[5, 4], [3, 2], [7, 6]]
Storage Desc Params:
serialization.format 1
Time taken: 5.913 seconds, Fetched: 32 row(s)
== Non-skewed table ==
hive:di> create table test_nileema_normal (c1 int, c2 int, c3 string) ;
OK
Time taken: 5.242 seconds
hive:di> desc formatted test_nileema_normal;
OK
# col_name data_type comment
c1 int None
c2 int None
c3 string None
# Detailed Table Information
Database: di
Owner: nileema
CreateTime: Mon Sep 23 14:54:10 PDT 2013
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: hdfs://dfs41.data.facebook.com:9000/namespace/di/warehouse/test_nileema_normal
Table Type: MANAGED_TABLE
Table Parameters:
transient_lastDdlTime 1379973250
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe
InputFormat: org.apache.hadoop.hive.ql.io.RCFileInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.RCFileOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Statistics stale: Yes
Storage Desc Params:
serialization.format 1
Time taken: 5.514 seconds, Fetched: 29 row(s)
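
The three `desc formatted` outputs above differ only in the `Skewed Columns`, `Skewed Values`, and `Stored As SubDirectories` lines. As a rough sketch for comparing them programmatically, the following Python helper pulls those three fields out of pasted output. The function name `skew_info` and the returned dictionary keys are hypothetical, chosen for illustration, and the parsing assumes numeric skew values as in the tables here (string skew values would need different handling).

```python
import ast


def skew_info(desc_output: str) -> dict:
    """Extract skew-related fields from pasted `desc formatted` text.

    Hypothetical helper: assumes the "Key: value" layout shown in the
    transcript and numeric skew values such as [[5, 4], [3, 2]].
    """
    info = {
        "skewed_columns": [],
        "skewed_values": [],
        "stored_as_directories": False,
    }
    for line in desc_output.splitlines():
        # Split on the first colon only, so values containing colons
        # (timestamps, HDFS URLs) stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Skewed Columns":
            # Column names are unquoted, e.g. [c1, c2], so split by hand.
            names = value.strip("[]").split(",")
            info["skewed_columns"] = [n.strip() for n in names if n.strip()]
        elif key == "Skewed Values":
            # Numeric lists such as [[5, 4], [3, 2]] are valid Python literals.
            info["skewed_values"] = ast.literal_eval(value)
        elif key == "Stored As SubDirectories":
            info["stored_as_directories"] = value.lower() == "yes"
    return info


if __name__ == "__main__":
    sample = (
        "Stored As SubDirectories: Yes\n"
        "Skewed Columns: [c1, c2]\n"
        "Skewed Values: [[5, 4], [3, 2], [7, 6]]\n"
    )
    print(skew_info(sample))
```

Run against the non-skewed table's output, the helper returns the defaults (empty lists and `False`), since none of the three lines are present there.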