Install wget, then download the sample data as the hive user:

su - hive
cd /tmp
wget https://raw.githubusercontent.com/HortonworksUniversity/Security_Labs/master/labdata/sample_07.csv
wget https://raw.githubusercontent.com/HortonworksUniversity/Security_Labs/master/labdata/sample_08.csv

Open the Hive CLI:

-- create table
CREATE TABLE sample_07 (
code string,
description string,  
total_emp int,  
salary int )
ROW FORMAT DELIMITED
 FIELDS TERMINATED BY '\t' 
 STORED AS TextFile;

--load data into table
load data local inpath '/tmp/sample_07.csv' into table sample_07;


-- create another table

CREATE TABLE sample_08 (
code string ,
description string ,  
total_emp int ,  
salary int )
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t' 
STORED AS TextFile;

--load data into a table
load data local inpath '/tmp/sample_08.csv' into table sample_08;

-- use the virtual column to see where the query is pulling data from

select INPUT__FILE__NAME, code from sample_08;

-- create one more data file 
cat > sample_09.csv
54-7199	Material moving workers, all other	41140	34700

--upload it to table dir
hadoop fs -put sample_09.csv hdfs://rk1.hdp:8020/apps/hive/warehouse/sample_08/

--now run the query
select INPUT__FILE__NAME, code from sample_08 where code='54-7199';
You will see output like:
OK
hdfs://rk1.hdp:8020/apps/hive/warehouse/sample_08/sample_09.csv	54-7199
Time taken: 0.466 seconds, Fetched: 1 row(s)

-- figure out how this query was planned

explain select INPUT__FILE__NAME, code from sample_08 where code='54-7199';

Stage-0
   Fetch Operator
      limit:-1
      Select Operator [SEL_2]
         outputColumnNames:["_col0","_col1"]
         Filter Operator [FIL_4]
            predicate:(code = '54-7199') (type: boolean)
            TableScan [TS_0]
               alias:sample_08

Other ways of creating a table:

  • CTAS
create table sample_07_orc stored as orc  as select * from sample_07;
  • Create Identical table using LIKE
create table sample_07_id like sample_07;

-- to see how much time a query spends in each phase of its run

set hive.tez.exec.print.summary=true;

### External Table

create external table techfunding(
permalink string,
company string,
numEmps string,
category string,
city string,
state string,
fundedDate string,
raisedAmt string,
raisedCurrency string,
round string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/tmp/techfunding';

load data local inpath '/tmp/techfundingUSA.csv' into table techfunding;

select * from techfunding limit 5;

### Hive Partitioning

Partitions are horizontal slices of the data that let large datasets be separated into smaller, more manageable chunks. In Hive, partitioning is supported for both managed and external tables.

cat > /tmp/emp.txt
101,Kyle,Admin,50000,A
102,Xander,Admin,50000,B
103,Jerome,Sales,60000,A
104,Upton,Admin,50000,C
105,Ferris,Admin,50000,C
106,Stewart,Tech,12000,A
107,Chase,Tech,12000,B
108,Malik,Engineer,45000,B
109,Samson,Admin,50000,A
110,Quinlan,Manager,40000,A
111,Joseph,Manager,40000,B
112,Axel,Sales,60000,B
113,Robert,Manager,40000,A
114,Cairo,Engineer,45000,A
115,Gavin,Ceo,100000,D
116,Vaughan,Manager,40000,B
117,Drew,Engineer,45000,D
118,Quinlan,Admin,50000,B
119,Gabriel,Engineer,45000,A
120,Palmer,Ceo,100000,A

We are going to partition this dataset by department (A, B, C, D).

Create a non-partitioned table to stage the data (staging table):

create external table emp_stage (
empid int,
name string,
designation  string,
Salary int,
department string) 
row format delimited 
fields terminated by "," 
location '/tmp/emp_stage_data';

load data local inpath '/tmp/emp.txt' into table emp_stage;

Create Partitioned hive table

create  table emp_part (
empid int,
name string,
designation  string,
salary int) 
PARTITIONED BY (department String) 
row format delimited fields terminated by ","; 


INSERT INTO TABLE emp_part PARTITION(department='A') 
SELECT empid, name,designation,salary FROM emp_stage WHERE department='A'; 

INSERT INTO TABLE emp_part PARTITION (department='B') 
SELECT empid, name,designation,salary FROM emp_stage WHERE department='B'; 

INSERT INTO TABLE emp_part PARTITION (department='C') 
SELECT empid, name,designation,salary FROM emp_stage WHERE department='C';

INSERT INTO TABLE emp_part PARTITION (department='D') 
SELECT empid, name,designation,salary FROM emp_stage WHERE department='D';

With the approach above, 50 partitions would require 50 separate insert statements. That is tedious; this approach is known as static partitioning.

Dynamic Partitioning – a single insert loads the whole partitioned table. To achieve this we need to set four things:

  • set hive.exec.dynamic.partition=true

  • set hive.exec.dynamic.partition.mode=nonstrict
    We are using dynamic partitioning without any static partition (a table can be partitioned on multiple columns in Hive), so non-strict mode has to be enabled. In strict mode, dynamic partitions can only be used together with at least one static partition.

  • set hive.exec.max.dynamic.partitions.pernode=4
    The default value is 2000 (HDP-2.5); adjust it to the expected number of partitions.

  • set hive.exec.max.created.files=100000
    The default is 100000, but larger tables can exceed it, so it may need to be raised. All four settings are combined in the sketch below.
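
For this walkthrough the four settings can be applied together before the insert (values taken from the list above; tune them to the expected number of partitions):

    set hive.exec.dynamic.partition=true;
    set hive.exec.dynamic.partition.mode=nonstrict;
    set hive.exec.max.dynamic.partitions.pernode=4;
    set hive.exec.max.created.files=100000;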

    create  table emp_part_dy (
    empid int,
    name string,
    designation  string,
    salary int) 
    PARTITIONED BY (department String) 
    row format delimited fields terminated by ","; 

    INSERT OVERWRITE TABLE emp_part_dy PARTITION(department) SELECT empid, name,designation,salary,department FROM emp_stage; 

You will see output like this:

    Loading data to table default.emp_part_dy partition (department=null)
    	 Time taken for load dynamic partitions : 901
    	Loading partition {department=D}
    	Loading partition {department=B}
    	Loading partition {department=A}
    	Loading partition {department=C}
    	 Time taken for adding to write entity : 2
    Partition default.emp_part_dy{department=A} stats: [numFiles=1, numRows=9, totalSize=215, rawDataSize=206]
    Partition default.emp_part_dy{department=B} stats: [numFiles=1, numRows=7, totalSize=165, rawDataSize=158]
    Partition default.emp_part_dy{department=C} stats: [numFiles=1, numRows=2, totalSize=45, rawDataSize=43]
    Partition default.emp_part_dy{department=D} stats: [numFiles=1, numRows=2, totalSize=45, rawDataSize=43]

-- verify where the data gets pulled from

hive>  select INPUT__FILE__NAME,empid from default.emp_part_dy where department='D';
OK
hdfs://rk1.hdp:8020/apps/hive/warehouse/emp_part_dy/department=D/000000_0	115
hdfs://rk1.hdp:8020/apps/hive/warehouse/emp_part_dy/department=D/000000_0	117
Time taken: 0.327 seconds, Fetched: 2 row(s)

See the difference between the query plans:

explain select * from default.emp_part_dy where department='D';
OK
Plan not optimized by CBO.

Stage-0
   Fetch Operator
      limit:-1
      Select Operator [SEL_2]
         outputColumnNames:["_col0","_col1","_col2","_col3","_col4"]
         TableScan [TS_0]
            alias:emp_part_dy

Time taken: 0.2 seconds, Fetched: 10 row(s)

Explain on the non-partitioned table:

explain  select empid from default.emp_stage where department='D';
OK
Plan not optimized by CBO.

Stage-0
   Fetch Operator
      limit:-1
      Select Operator [SEL_2]
         outputColumnNames:["_col0"]
         Filter Operator [FIL_4]
            predicate:(department = 'D') (type: boolean)
            TableScan [TS_0]
               alias:emp_stage

Time taken: 0.083 seconds, Fetched: 12 row(s)
  • Partition pruning happens at split generation, so non-matching partition directories are never read.
  • Predicate pushdown on regular columns is applied while reading the files (see the sketch below).
  • Too many partitions can degrade performance.
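
A quick way to see both effects together (a sketch; the exact operators shown depend on the Hive version and file format):

```
-- department is the partition column: only the department=D directory is scanned (pruning at split generation)
-- salary is a regular column: its predicate appears as a Filter on the scan and is pushed down at read time
explain select empid from default.emp_part_dy where department='D' and salary > 50000;
```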

### Partitioning on managed vs external tables

For a managed table, the data is filtered and separated into a different HDFS folder for each partition under the table's warehouse directory.
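
You can confirm the layout in HDFS; each partition gets its own sub-directory (the warehouse path below matches the one used earlier on this cluster):

```
hadoop fs -ls /apps/hive/warehouse/emp_part_dy/
# expect one directory per partition: department=A, department=B, department=C, department=D
```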

For external tables:

Create the table with a partition column:
```
create external table emp_par_ex (
empid int,
name string,
designation string,
salary int) 
PARTITIONED BY (department string) 
row format delimited fields terminated by "," ;
```
Load data into the emp_par_ex table by registering partition locations with ALTER TABLE:
```
ALTER TABLE emp_par_ex ADD PARTITION (Department='A') location '/tmp/emp_par_ex/A';
ALTER TABLE emp_par_ex ADD PARTITION (Department='B') location '/tmp/emp_par_ex/B';
ALTER TABLE emp_par_ex ADD PARTITION (Department='C') location '/tmp/emp_par_ex/C';
```
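The ALTER statements above only register the partition directories; the data files still have to be placed there. A minimal sketch (the per-department files are hypothetical; split /tmp/emp.txt by department however you prefer):

```
hadoop fs -mkdir -p /tmp/emp_par_ex/A /tmp/emp_par_ex/B /tmp/emp_par_ex/C
# emp_A.csv, emp_B.csv, emp_C.csv are assumed to contain only that department's rows
hadoop fs -put emp_A.csv /tmp/emp_par_ex/A/
hadoop fs -put emp_B.csv /tmp/emp_par_ex/B/
hadoop fs -put emp_C.csv /tmp/emp_par_ex/C/
```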
#### common problems with dynamic partitioning
* lots of reducers, but only a few are doing real work
* simple queries run very slowly, for example:
  ```
  SELECT COUNT(*) FROM test_table WHERE col = 'value';
  ```

Typical causes:
* too many partitions
* too many files per partition
* files under each partition that are small compared with the HDFS block size

Resolution: create a new table with LIKE and insert the data from the original table into it (a sketch follows below); rewriting the data consolidates the small files.
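
A minimal sketch of that resolution (the table name events and its partition column dt are hypothetical; adapt them to the problem table):

```
-- empty copy with the same schema, partitioning, and storage format
create table events_compacted like events;

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- a single insert rewrites each partition into fewer, larger files
insert overwrite table events_compacted partition (dt)
select * from events;
```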

### Bucketing


How does Hive distribute the rows across the buckets?

In general, the bucket number is determined by the expression hash_function(bucketing_column) mod num_buckets (there is a 0x7FFFFFFF mask in there too, but it is not that important). The hash_function depends on the type of the bucketing column. For an int it is trivial: hash_int(i) == i. For example, if user_id were an int and there were 10 buckets, we would expect all user_ids that end in 0 to be in bucket 1, all user_ids that end in 1 to be in bucket 2, and so on. For other datatypes it is a little trickier. In particular, the hash of a BIGINT is not the same as the BIGINT itself, and the hash of a string or a complex datatype is a number derived from the value but not anything humanly recognizable. For example, if user_id were a STRING, the user_ids in bucket 1 would probably not end in 0. In general, distributing rows based on the hash gives an even distribution across the buckets.
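Since ssn below is an int, hash_int(ssn) == ssn, so with 8 buckets the bucket is simply ssn mod 8; for example ssn 105 goes to 105 mod 8 = 1, i.e. the second bucket file (000000_0 being the first). A quick sanity check once the census_clus table below has been loaded (a sketch; pmod is Hive's non-negative modulo):

```
select ssn, pmod(ssn, 8) as expected_bucket, INPUT__FILE__NAME
from census_clus
limit 10;
```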
```
cat > /tmp/census.csv
100,Oscar,Londerzeel,ac.tellus.Suspendisse@loremut.org
101,Jameson,Loverval,Cras.dolor@magnaUt.net
102,Damon,Morinville,Sed.neque.Sed@urna.co.uk
103,Chancellor,Blairgowrie,ante.dictum.mi@nonummyultricies.net
104,Kenyon,Shimla,posuere.at.velit@pedenec.net
105,Caldwell,Gianico,libero.Proin@semPellentesque.com
106,Levi,Troyes,augue@elit.net
107,Yasir,Langford,est.mauris.rhoncus@turpisIn.edu
108,Upton,Aartrijke,mi.tempor.lorem@nuncid.net
109,Nicholas,Placilla,cursus@primisin.edu
110,Emerson,Minervino di Lecce,molestie@acturpis.net
111,Reese,Juseret,lacinia@risusNunc.co.uk
112,Giacomo,Lampernisse,In.nec.orci@vitaesodales.edu
113,Mohammad,Sandviken,egestas.ligula.Nullam@rutrum.co.uk
114,Fulton,Lawton,tristique.neque@utaliquamiaculis.edu
115,Barry,Kitchener,Fusce@ametrisusDonec.org
116,Valentine,Nandyal,nonummy.ut.molestie@nasceturridiculusmus.ca
117,Lawrence,Hachy,eget.venenatis.a@Aliquamnisl.com
118,Micah,Wolfsberg,ultricies.ornare@ultriciesornareelit.org
119,Chandler,Lago Verde,primis.in.faucibus@seddui.org
120,Myles,Abolens,et@mollis.ca
121,Len,Souvret,vitae.posuere.at@rhoncus.com
122,Blaze,Brunn am Gebirge,pede.Cras@imperdietullamcorper.net
123,Rafael,Hofstade,eget.laoreet@tellus.ca
124,Ahmed,St. Clears,Phasellus@aceleifendvitae.co.uk
125,Deacon,Ramskapelle,Ut.tincidunt@egetipsumDonec.net
126,Steel,Newbury,amet.risus@Nullamscelerisque.ca
127,Isaac,Brixton,non.enim.Mauris@gravidaPraesenteu.ca
128,Andrew,Smithers,massa.Mauris@tortor.ca
129,Cullen,Lerum,cubilia.Curae.Donec@acfacilisis.ca
130,Lucian,Oudenburg,Ut.semper.pretium@anteiaculis.edu
131,Kevin,Flin Flon,non@elementumlorem.ca
132,Abbot,Hulshout,Nunc.lectus@Maurisnon.net
133,Hammett,Torrevecchia Teatina,Nullam@quis.edu
134,Troy,Alençon,amet.ornare.lectus@dictumplacerat.org
135,Chase,Poole,ac.sem@consequat.co.uk
136,Bruce,Forgaria nel Friuli,pulvinar.arcu.et@disparturient.com
137,Ali,Kozhikode,semper@felisadipiscing.net
138,Avram,Springdale,risus@vestibulumneceuismod.com
139,Kadeem,Kalyan,orci@acsem.org
140,Abel,Molfetta,aliquet@egestasrhoncus.net
141,Beau,Oppido Mamertina,Class.aptent.taciti@nec.net
142,Hakeem,Beerse,sit@lorem.edu
143,Berk,Tirúa,ad.litora.torquent@enim.com
144,Jakeem,Beaumaris,arcu.Sed.et@lectus.ca
145,Rudyard,Częstochowa,Aliquam.erat@dignissimmagna.edu
146,Oscar,Koekelberg,at.velit@taciti.ca
147,Clarke,Basildon,mollis@eget.co.uk
148,Linus,Gateshead,et@facilisisSuspendisse.net
149,Christian,Baracaldo,non.sollicitudin.a@semperrutrumFusce.org
150,Ferris,Birmingham,penatibus@facilisisvitaeorci.edu
151,Ray,Langley,eu.tellus.Phasellus@Donec.com
152,Jason,Naperville,Vivamus.sit@diam.net
153,Cain,Dieppe,Vivamus.molestie.dapibus@Naminterdumenim.ca
154,Vance,Iqaluit,condimentum.Donec.at@Namac.edu
155,Daniel,Sambreville,dictum.eu@Innecorci.co.uk
156,John,Vänersborg,sodales@Quisquefringillaeuismod.net
157,Colton,Rostock,quis.tristique.ac@acurna.edu
158,Kamal,Ternat,eleifend@mitemporlorem.edu
159,Kadeem,Sparwood,non.sollicitudin@dapibusid.edu
160,Keefe,Rocca Santo Stefano,risus.In.mi@eget.edu
161,Tate,Mango,arcu@vulputatelacus.org
162,Zachery,Pointe-Claire,nibh.sit@auctor.org
163,Kelly,Versailles,justo@a.net
164,Oscar,Lille,rutrum@congueelit.com
165,Castor,Cametá,et@sapiengravidanon.ca
166,Rashad,Logroño,non@Nullaaliquet.co.uk
167,Merritt,Marchihue,Cum.sociis@enimnonnisi.com
168,Isaac,Steyr,non.lobortis@a.ca
169,Matthew,Tewkesbury,Sed.eu.nibh@ut.co.uk
170,Tiger,Kawerau,tortor.dictum@ultrices.edu
171,Harrison,Oderzo,Duis@pede.org
172,Zahir,Pali,in.tempus@sociis.edu
173,Alden,Castelnovo del Friuli,facilisis.non@feugiatnon.ca
174,Chadwick,Sialkot,malesuada.id@luctus.ca
175,Kevin,Gibsons,orci.luctus.et@urnaNunc.com
176,Leo,Offida,eleifend.nunc.risus@auctor.net
177,Cameron,Oudegem,Praesent@ategestas.org
178,Carlos,São Luís,Class@iaculisodio.co.uk
179,Kieran,Manchester,luctus.lobortis@semmolestie.co.uk
180,Acton,Calgary,sit@sempertellus.com
181,Aladdin,Aylmer,lacinia.at.iaculis@anteMaecenasmi.ca
182,Barry,Tampa,mauris@semvitae.org
183,Gregory,Coaldale,dictum.eu.placerat@Phasellus.ca
184,Dean,Couthuin,porttitor@risusDonec.org
185,Castor,Pathankot,non@faucibusleoin.org
186,Yuli,Jemeppe-sur-Sambre,at.velit@velarcueu.org
187,Rooney,Cerreto di Spoleto,sollicitudin.orci.sem@pede.org
188,Uriel,Marchienne-au-Pont,Cras.eu.tellus@sociisnatoquepenatibus.net
189,Theodore,Villa Cortese,Aliquam.erat@magnaPhasellusdolor.org
190,Callum,Clearwater Municipal District,sociosqu.ad@milorem.edu
191,Gabriel,Fort Good Hope,aliquet@sit.org
192,Berk,Filacciano,dui.Fusce@velmauris.edu
193,Noah,Lebu,magna.tellus@pede.net
194,Cruz,Stirling,elit@facilisis.co.uk
195,Hakeem,Pelarco,Aenean.euismod.mauris@pedesagittis.net
196,Devin,Castlegar,Cras.eu.tellus@Curabitur.edu
197,Cedric,Montleban,Morbi.metus.Vivamus@congueturpis.co.uk
198,Adrian,Swan Hills,lacus.Cras@nibhsit.org
199,Lars,Buguma,gravida.non.sollicitudin@Maecenasmi.com
```
--create table backed by csv text file
```
create table census(
ssn int,
name string,
city string,
email string) 
row format delimited 
fields terminated by ',';

load data local inpath '/tmp/census.csv' into table census;
```
--create bucketed table
```
create table census_clus(
ssn int,
name string,
city string,
email string) 
clustered by (ssn) into 8 buckets;
```
---insert data in bucketed table from census table
```
set hive.enforce.bucketing=true;

insert overwrite  table census_clus select *  from census;
```
-- now check on hdfs table directory
```
 hadoop fs -ls /apps/hive/warehouse/census_clus

 # you will see 8 part files on HDFS:
 Found 8 items
-rwxrwxrwx   3 hive hdfs        542 2016-09-11 04:41 /apps/hive/warehouse/census_clus/000000_0
-rwxrwxrwx   3 hive hdfs        576 2016-09-11 04:41 /apps/hive/warehouse/census_clus/000001_0
-rwxrwxrwx   3 hive hdfs        571 2016-09-11 04:41 /apps/hive/warehouse/census_clus/000002_0
-rwxrwxrwx   3 hive hdfs        543 2016-09-11 04:41 /apps/hive/warehouse/census_clus/000003_0
-rwxrwxrwx   3 hive hdfs        649 2016-09-11 04:41 /apps/hive/warehouse/census_clus/000004_0
-rwxrwxrwx   3 hive hdfs        691 2016-09-11 04:41 /apps/hive/warehouse/census_clus/000005_0
-rwxrwxrwx   3 hive hdfs        630 2016-09-11 04:41 /apps/hive/warehouse/census_clus/000006_0
-rwxrwxrwx   3 hive hdfs        641 2016-09-11 04:41 /apps/hive/warehouse/census_clus/000007_0
```
### Joins

#### shuffle or common join or reducer join
```
-- disable automatic map-join conversion so the plan shows the reduce-side (shuffle) join
set hive.auto.convert.join=false;
explain select s.* from sample_07 s,sample_08 ss where s.code=ss.code;
```

#### Map Join or Broadcast join
```
set hive.auto.convert.join=true;
set hive.auto.convert.join.noconditionaltask=true;
set hive.auto.convert.join.noconditionaltask.size=286331153;

explain select s.* from sample_07 s,sample_08 ss where s.code=ss.code;
notice the difference:
Map 2 <- Map 1 (BROADCAST_EDGE)
Map Join Operator [MAPJOIN_15]
```
#### Bucket Map Join
```
--join is done in the mappers only
--the mapper processing bucket 1 from table A fetches only bucket 1 from table B
--mappers are spawned based on the big table
--only the matching buckets of the small tables are replicated onto each mapper
--both tables must be bucketed on the join column, and the number of buckets in one table must be a multiple of the number of buckets in the other

--tables: 
census_clus with 8 buckets
census_clus_4 with 4 buckets

CREATE TABLE `census_clus_4`(
  `ssn` int, 
  `name` string, 
  `city` string, 
  `email` string)
CLUSTERED BY ( 
  ssn) 
INTO 4 BUCKETS;

set hive.enforce.bucketing=true;
insert into table census_clus_4 select * from census;

set hive.optimize.bucketmapjoin=true;
explain select c4.* from census_clus_4 c4,census_clus c where c4.ssn=c.ssn;
```
#### Sort Merge Bucket Map Join
When all tables are:
* large,
* bucketed on the join columns,
* sorted on the join columns,
* and have the same number of buckets.

```
CREATE TABLE `census_clus_smb1`(
  `ssn` int, 
  `name` string, 
  `city` string, 
  `email` string)
CLUSTERED BY (ssn)
SORTED BY (ssn) 
INTO 4 BUCKETS;

CREATE TABLE `census_clus_smb2`(
  `ssn` int, 
  `name` string, 
  `city` string, 
  `email` string)
CLUSTERED BY (ssn)
SORTED BY (ssn) 
INTO 4 BUCKETS;

-- enforce bucketing and sorting while populating the tables
set hive.enforce.bucketing=true;
set hive.enforce.sorting=true;

insert into table census_clus_smb1 select * from census;
insert into table census_clus_smb2 select * from census;

set hive.auto.convert.sortmerge.join=true;
set hive.optimize.bucketmapjoin = true;
set hive.optimize.bucketmapjoin.sortedmerge = true;

explain select c1.* from census_clus_smb1 c1,census_clus_smb2 c2 where c1.ssn=c2.ssn;
```
on MR
```
set hive.enforce.sortmergebucketmapjoin=false;
explain select c1.* from census_clus_smb1 c1,census_clus_smb2 c2 where c1.ssn=c2.ssn;
notice Sorted Merge Bucket Map Join Operator
```
#### Skew Join
* Use when one table has heavily skewed values in the join column.
* Enable skew join (a sketch follows below):
  * set hive.optimize.skewjoin=true
  * Once a key exceeds the specified number of rows in the join operator, it is treated as a skew key:
  * set hive.skewjoin.key=100000
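
A minimal sketch of enabling it, reusing the join from the earlier sections (whether the skew path is actually taken depends on the data):

```
set hive.optimize.skewjoin=true;
set hive.skewjoin.key=100000;

explain select s.* from sample_07 s, sample_08 ss where s.code = ss.code;
```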
 
### Locks
```
create table test(code int);

show locks techfunding;
+----------+-----------+--------+------------+-------------+-------------+------------+-----------------+-----------------+--------------+-------+-----------+-------------+--+
|  lockid  | database  | table  | partition  | lock_state  | blocked_by  | lock_type  | transaction_id  | last_heartbeat  | acquired_at  | user  | hostname  | agent_info  |
+----------+-----------+--------+------------+-------------+-------------+------------+-----------------+-----------------+--------------+-------+-----------+-------------+--+
| Lock ID  | Database  | Table  | Partition  | State       | Blocked By  | Type       | Transaction ID  | Last Hearbeat   | Acquired At  | User  | Hostname  | Agent Info  |
+----------+-----------+--------+------------+-------------+-------------+------------+-----------------+-----------------+--------------+-------+-----------+-------------+--+


select count(*) from techfunding;

show locks techfunding;
+----------+-----------+--------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-----------+-----------------------------------------------------------+--+
|  lockid  | database  |    table     | partition  | lock_state  |  blocked_by   |  lock_type   | transaction_id  | last_heartbeat  |  acquired_at   | user  | hostname  |                        agent_info                         |
+----------+-----------+--------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-----------+-----------------------------------------------------------+--+
| Lock ID  | Database  | Table        | Partition  | State       | Blocked By    | Type         | Transaction ID  | Last Hearbeat   | Acquired At    | User  | Hostname  | Agent Info                                                |
| 145.1    | default   | techfunding  | NULL       | ACQUIRED    |               | SHARED_READ  | NULL            | 1473575910000   | 1473575910000  | hive  | rk2.hdp   | hive_20160911063828_8fa70fce-587f-479d-8a77-245f525cf624  |
+----------+-----------+--------------+------------+-------------+---------------+--------------+-----------------+-----------------+----------------+-------+-----------+-----------------------------------------------------------+--+
2 rows selected (0.037 seconds)
```
A “SHARED” lock is also called a “READ” lock: other queries can still read from the table, but any write has to wait for it to finish.

```
INSERT OVERWRITE TABLE test SELECT COUNT(1) FROM sample_07;

SHOW LOCKS test;
+----------+-----------+--------+------------+-------------+---------------+------------+-----------------+-----------------+----------------+-------+-----------+-----------------------------------------------------------+--+
|  lockid  | database  | table  | partition  | lock_state  |  blocked_by   | lock_type  | transaction_id  | last_heartbeat  |  acquired_at   | user  | hostname  |                        agent_info                         |
+----------+-----------+--------+------------+-------------+---------------+------------+-----------------+-----------------+----------------+-------+-----------+-----------------------------------------------------------+--+
| Lock ID  | Database  | Table  | Partition  | State       | Blocked By    | Type       | Transaction ID  | Last Hearbeat   | Acquired At    | User  | Hostname  | Agent Info                                                |
| 147.1    | default   | test   | NULL       | ACQUIRED    |               | EXCLUSIVE  | NULL            | 1473576075000   | 1473576075000  | hive  | rk2.hdp   | hive_20160911064115_e462deea-aaaa-4aeb-9131-ce9194a289d3  |
+----------+-----------+--------+------------+-------------+---------------+------------+-----------------+-----------------+----------------+-------+-----------+-----------------------------------------------------------+--+
2 rows selected (0.051 seconds)
```

An “EXCLUSIVE” lock is also called a “WRITE” lock: while it is held, no one else can read from or write to the table, and all other queries have to wait for the current query to finish before they can start.

```
2016-09-11 07:21:28,042 INFO  [HiveServer2-Background-Pool: Thread-6040]: lockmgr.DbLockManager (DbLockManager.java:lock(103)) - Requesting: queryId=hive_20160911072127_9bad6e63-c422-49a3-87ee-0db1168bea8c LockRequest(component:[LockComponent(type:EXCLUSIVE, level:TABLE, dbname:default, tablename:test, operationType:NO_TXN), LockComponent(type:SHARED_READ, level:TABLE, dbname:default, tablename:techfunding, operationType:SELECT)], txnid:0, user:anonymous, hostname:rk2.hdp, agentInfo:hive_20160911072127_9bad6e63-c422-49a3-87ee-0db1168bea8c)
```