Skip to content

Instantly share code, notes, and snippets.

@eric-maynard
Last active November 20, 2018 20:31
Show Gist options
  • Save eric-maynard/3f56ea51455d0d733d07dd91813e0eab to your computer and use it in GitHub Desktop.
Save eric-maynard/3f56ea51455d0d733d07dd91813e0eab to your computer and use it in GitHub Desktop.
A script to analyze a large volume of Impala views with an Impala table
#!/bin/bash
##################################################
# Constants #
##################################################
TABLE_NAME="underlying_tables"
DB_COL="db_name"
VIEW_COL="view_name"
TABLE_COL="table_name"
dbs=('db_one' 'db_two' 'db_three' 'db_four')
alias imp=''
##################################################
# Script #
##################################################
echo "Creating table ${TABLE_NAME}..."
imp -Bq "DROP TABLE IF EXISTS ${TABLE_NAME}" &> /dev/null
imp -Bq "create table if not exists ${TABLE_NAME} (${DB_COL} string, ${VIEW_COL} string, ${TABLE_COL} string)" &> /dev/null
echo "Iterating through views..."
for db in "${dbs[@]}"; do
imp -Bq "use ${db}; show tables" 2> /dev/null | while read table_or_view; do
underlying_tables=$(imp -Bq "explain select * from ${table_or_view}" 2> /dev/null | perl -ne 'print if s#.*SCAN HDFS \[.*?\.(.*?)(\s|\]).*#\1#g')
for t in ${underlying_tables}; do
if [ ${t} != ${table_or_view} ] ; then
imp -Bq "insert into table ${TABLE_NAME} values ('${db}', '${table_or_view}','${t}')" &> /dev/null
fi
done
done
done
echo "Done!"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment