Patch for the Hive JDBC driver under spark

h1. How to create a patched version of the Hive JDBC driver that works with Spark

h2. Goal

Make it possible to successfully use a statement like

spark.read.jdbc(jdbcUrl, query, props).show()

when the JDBC URL looks like jdbc:hive2://the-host:10000/the-namespace. Unfortunately, this code does not work with the stock driver: it fails with various exceptions thrown from the driver layer.

h2. How to make the patch

h3. Clone Hive from the git repository

git clone https://github.com/apache/hive.git

h3. Check out the relevant revision

git checkout d81c41c4f54160376c2a1b5186d5ceb7ef29a770

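h3. Apply patch

First verify that the patch applies cleanly (--check only tests the patch and changes nothing), then apply it:

git apply --check fix_hive_jdbc.patch
git apply fix_hive_jdbc.patch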

h3. Compile

Unfortunately, the Maven build fails for this revision because some third-party dependencies are missing from the central Maven repository. The easiest way I found is to compile only the patched classes against prebuilt jars.

Create directory ./jdbc/jars and put the following files there:

  • commons-logging-1.1.3.jar
  • hive-jdbc-1.2.1.jar
  • hive-service-1.2.1.jar
  • libthrift-0.9.3.jar

The files can be found in the central Maven repository.
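As a rough sketch of fetching the four jars listed above, the helper below builds their download URLs and pulls them with curl. The repo1.maven.org base URL, the Maven group IDs, and the function names maven_url and fetch_jars are assumptions added for illustration; they are not part of the original gist.

```shell
#!/bin/sh
# Assumption: Maven Central layout <group path>/<artifact>/<version>/<artifact>-<version>.jar
MAVEN_BASE="https://repo1.maven.org/maven2"

# maven_url GROUP ARTIFACT VERSION -> prints the jar URL (hypothetical helper)
maven_url() {
  group_path=$(printf '%s' "$1" | tr '.' '/')
  printf '%s/%s/%s/%s/%s-%s.jar\n' "$MAVEN_BASE" "$group_path" "$2" "$3" "$2" "$3"
}

# fetch_jars DIR -> downloads the four jars into DIR (needs curl and network access)
fetch_jars() {
  mkdir -p "$1" || return 1
  # Group IDs below are assumed coordinates for the jar names in the list
  while read -r group artifact version; do
    curl -fsSL -o "$1/$artifact-$version.jar" \
      "$(maven_url "$group" "$artifact" "$version")" || return 1
  done <<EOF
commons-logging commons-logging 1.1.3
org.apache.hive hive-jdbc 1.2.1
org.apache.hive hive-service 1.2.1
org.apache.thrift libthrift 0.9.3
EOF
}
```

With these functions loaded, running fetch_jars ./jdbc/jars from the repository root would populate the directory. Nothing is downloaded until fetch_jars is called.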

Put build_patch.sh under ./jdbc/src/java/, grant it execute permission with chmod +x jdbc/src/java/build_patch.sh, and run it from ./jdbc/src/java/. The script should create hive-jdbc-1.2.1_patch.jar in directory ./jdbc/jars.
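Before running build_patch.sh, a quick check that the four jars are in place can save a confusing javac failure. This is only a sketch: the function name missing_jars and the REQUIRED_JARS variable are made up for illustration, and the jar names are taken from the list above.

```shell
#!/bin/sh
# Jar names copied from the list of files that go into ./jdbc/jars
REQUIRED_JARS="commons-logging-1.1.3.jar hive-jdbc-1.2.1.jar hive-service-1.2.1.jar libthrift-0.9.3.jar"

# missing_jars DIR -> prints every required jar that is absent from DIR
# and returns non-zero if at least one is missing (hypothetical helper)
missing_jars() {
  status=0
  for jar in $REQUIRED_JARS; do
    if [ ! -f "$1/$jar" ]; then
      echo "$jar"
      status=1
    fi
  done
  return $status
}
```

From ./jdbc/src/java/, a call such as missing_jars ../../jars && ./build_patch.sh would run the build only when nothing is missing.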

h3. Deploy the patch to the Spark cluster

The patched jar should be copied to all nodes of the Spark cluster, into ./spark-SPARK_VERSION-bin-hadoopHADOOP_VERSION/jars, where it replaces hive-jdbc-1.2.1.spark2.jar.
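One possible way to script the copy is sketched below. It assumes passwordless ssh and scp access to every node; the function names spark_jars_dir and deploy_patch are hypothetical, and renaming the stock jar with a .bak suffix (rather than deleting it) is a choice made here so the change is easy to undo.

```shell
#!/bin/sh
# spark_jars_dir BASE SPARK_VERSION HADOOP_VERSION -> prints the jars directory
spark_jars_dir() {
  printf '%s/spark-%s-bin-hadoop%s/jars\n' "$1" "$2" "$3"
}

# deploy_patch JAR DEST_DIR HOST... -> installs JAR on each host (hypothetical helper)
deploy_patch() {
  jar=$1
  dest=$2
  shift 2
  for host in "$@"; do
    # Move the stock driver aside so it no longer ends in .jar
    ssh "$host" "mv '$dest/hive-jdbc-1.2.1.spark2.jar' '$dest/hive-jdbc-1.2.1.spark2.jar.bak'" || return 1
    # Copy the patched driver into the Spark jars directory
    scp "$jar" "$host:$dest/" || return 1
  done
}
```

A call would then look like deploy_patch jdbc/jars/hive-jdbc-1.2.1_patch.jar "$(spark_jars_dir /opt 2.4.4 2.7)" node1 node2, where /opt, the version numbers, and the host names are placeholders for your own cluster.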

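#!/bin/sh
# Copy this file to jdbc/src/java/ and run it from that directory.
set -e  # stop at the first failing command

# Directory with the prebuilt jars (./jdbc/jars relative to the repository root)
JARS=../../jars
CP="$JARS/hive-service-1.2.1.jar:$JARS/hive-jdbc-1.2.1.jar:$JARS/commons-logging-1.1.3.jar:$JARS/libthrift-0.9.3.jar"

# Compile only the three patched classes against the stock jars
javac -cp "$CP" org/apache/hive/jdbc/HiveStatement.java
javac -cp "$CP" org/apache/hive/jdbc/HivePreparedStatement.java
javac -cp "$CP" org/apache/hive/jdbc/HiveResultSetMetaData.java

# Copy the stock driver and update the copy with the freshly compiled classes
cp "$JARS/hive-jdbc-1.2.1.jar" "$JARS/hive-jdbc-1.2.1_patch.jar"
jar uvf "$JARS/hive-jdbc-1.2.1_patch.jar" org/apache/hive/jdbc/*.class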
diff --git a/jdbc/src/java/org/apache/hive/jdbc/HivePreparedStatement.java b/jdbc/src/java/org/apache/hive/jdbc/HivePreparedStatement.java
index 26251557ab..15be8a9ad7 100644
--- a/jdbc/src/java/org/apache/hive/jdbc/HivePreparedStatement.java
+++ b/jdbc/src/java/org/apache/hive/jdbc/HivePreparedStatement.java
@@ -767,4 +767,10 @@ public void setUnicodeStream(int parameterIndex, InputStream x, int length)
// TODO Auto-generated method stub
throw new SQLException("Method not supported");
}
+
+ @Override
+ public void setFetchSize(int rows) throws SQLException {
+ super.setFetchSize(sql.toUpperCase().startsWith("SELECT 1") ? 0 : rows == 0 ? 50 : rows);
+ }
+
}
diff --git a/jdbc/src/java/org/apache/hive/jdbc/HiveResultSetMetaData.java b/jdbc/src/java/org/apache/hive/jdbc/HiveResultSetMetaData.java
index aa6f58a7dd..6cd119cdee 100644
--- a/jdbc/src/java/org/apache/hive/jdbc/HiveResultSetMetaData.java
+++ b/jdbc/src/java/org/apache/hive/jdbc/HiveResultSetMetaData.java
@@ -63,11 +63,11 @@ public int getColumnDisplaySize(int column) throws SQLException {
}
public String getColumnLabel(int column) throws SQLException {
- return columnNames.get(toZeroIndex(column));
+ return columnNames.get(toZeroIndex(column)).replaceFirst(".+\\.", "");
}
public String getColumnName(int column) throws SQLException {
- return columnNames.get(toZeroIndex(column));
+ return columnNames.get(toZeroIndex(column)).replaceFirst(".+\\.", "");
}
public int getColumnType(int column) throws SQLException {
@@ -97,7 +97,7 @@ public String getSchemaName(int column) throws SQLException {
}
public String getTableName(int column) throws SQLException {
- throw new SQLException("Method not supported");
+ return columnNames.get(toZeroIndex(column)).split("\\.")[0];
}
public boolean isAutoIncrement(int column) throws SQLException {
diff --git a/jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java b/jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java
index 170fc53391..c52def754e 100644
--- a/jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java
+++ b/jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java
@@ -722,10 +722,7 @@ public void setPoolable(boolean poolable) throws SQLException {
@Override
public void setQueryTimeout(int seconds) throws SQLException {
- // 0 is supported which means "no limit"
- if (seconds != 0) {
- throw new SQLException("Query timeout seconds must be 0");
- }
+ // ignore the call
}
/*