@abhilater
Last active March 7, 2019 14:53
Can Hive ACID tables in Hive version 1.2 be read into Apache Pig using HCatLoader (or other means), or into Spark using SQLContext (or other means)?
For Spark, it seems ACID tables can only be read if the table is fully compacted, i.e. no delta folders exist in any partition. Details are in the following JIRA:
https://issues.apache.org/jira/browse/SPARK-15348
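To make the "fully compacted" condition concrete, here is a minimal Python sketch that checks directory listings for delta folders. It assumes the usual Hive ACID layout, where compacted data sits in `base_<txn>` directories and uncompacted changes sit in `delta_<min>_<max>` directories (newer Hive versions also use `delete_delta_...` and a statement-id suffix). The function names and the sample listing are hypothetical, and the listing itself would have to come from something like `hdfs dfs -ls` on each partition path.

```python
import re

# Assumed Hive ACID directory naming: delta_<min>_<max>, optionally with a
# "delete_" prefix and a trailing statement id in newer Hive versions.
DELTA_RE = re.compile(r"^(delete_)?delta_\d+_\d+(_\d+)?$")


def has_delta_dirs(dir_names):
    """Return True if any directory name looks like an ACID delta folder."""
    return any(DELTA_RE.match(name) for name in dir_names)


def uncompacted_partitions(listing):
    """Given {partition_path: [child directory names]}, return the sorted
    partition paths that still contain delta folders."""
    return sorted(path for path, names in listing.items() if has_delta_dirs(names))


if __name__ == "__main__":
    # Hypothetical listing for illustration only.
    sample = {
        "dt=2019-03-01": ["base_0000012"],
        "dt=2019-03-02": ["base_0000012", "delta_0000013_0000013"],
    }
    print(uncompacted_partitions(sample))
```

Any partition this reports would, per the JIRA above, need a major compaction (for example via Hive's `ALTER TABLE ... COMPACT 'major'`) before Spark could read it.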
However, I wanted to know whether reading Hive ACID tables is supported at all in Apache Pig.
When I tried reading both an unpartitioned and a partitioned ACID table in Pig version 0.16, I got 0 records read:
Successfully read 0 records from: "dwh.acid_table:
HDP version 2.6.5
Spark version 2.3
Pig version 0.16
Hive version 1.2
tspannhw commented Mar 7, 2019
