@abajwa-hw
Last active May 18, 2018 17:49
Sample script to onboard users and run Hive queries as multiple users
#!/usr/bin/env bash
# Onboard N users and create their HDFS home directories (requires bash for arrays)
numusers=5
userprefix="testuser"
group="testusers"
users=()
tables=("hortoniabank.ww_customers" "hortoniabank.us_customers" "finance.tax_2009" "finance.tax_2010" "finance.tax_2015" "cost_savings.claim_savings" "claim.provider_summary" "consent_master.consent_data")
export hive_port=10500
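groupadd "${group}"
for i in $(seq 1 "${numusers}")
do
  user="${userprefix}${i}"
  echo "Creating user ${user}"
  useradd -g "${group}" "${user}"  # must be done on all compute nodes!
  echo "Creating HDFS home dir for ${user}"
  sudo -u hdfs hdfs dfs -mkdir "/user/${user}"
  # Hand the directory over to the user; otherwise it stays owned by hdfs
  sudo -u hdfs hdfs dfs -chown "${user}:${group}" "/user/${user}"
  users+=("${user}")
done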
echo "${users[*]}"
# Simulate a random user querying a random table at random intervals
# Seed the random generator with the shell PID concatenated with the epoch time
RANDOM=$$$(date +%s)
while true
do
  # Get random user...
  selecteduser=${users[$RANDOM % ${#users[@]}]}
  # Get random table...
  selectedtable=${tables[$RANDOM % ${#tables[@]}]}
  beeline_url="jdbc:hive2://localhost:${hive_port}/default"
  # Run query as random user
  echo "${selecteduser}: select count(*) from ${selectedtable}"
  beeline -n "${selecteduser}" -u "${beeline_url}" -e "select count(*) from ${selectedtable}"
  # Sleep for a random time between 1 and 10 seconds
  randomsleep=$(( (RANDOM % 10) + 1 ))
  sleep "${randomsleep}"
done
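
The random choices made in the loop above can be tried out without a Hive cluster. The following is a minimal sketch, not part of the original script: `pick_random` and `random_sleep_seconds` are hypothetical helper names, the user and table lists are shortened samples, and an `echo` stands in for the `beeline` call. The helpers write to global variables rather than printing, so `RANDOM` is always advanced in the current shell and not in a subshell.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not in the original script) isolating the random choices.

# Store one randomly chosen argument in the global variable "picked".
# Assumes at least one argument is passed.
pick_random() {
  local items=("$@")
  picked=${items[RANDOM % ${#items[@]}]}
}

# Store a random integer between 1 and 10 (inclusive) in "randomsleep".
random_sleep_seconds() {
  randomsleep=$(( (RANDOM % 10) + 1 ))
}

# Sample data for illustration only
users=("testuser1" "testuser2" "testuser3")
tables=("finance.tax_2009" "finance.tax_2010")

# Same seeding approach as the script: PID concatenated with epoch seconds
RANDOM=$$$(date +%s)

pick_random "${users[@]}";  selecteduser=${picked}
pick_random "${tables[@]}"; selectedtable=${picked}
random_sleep_seconds

# Stand-in for the beeline call
echo "${selecteduser}: select count(*) from ${selectedtable} (next query in ${randomsleep}s)"
```

Each run prints one line naming a user and a table drawn from the sample lists, which makes it easy to check the distribution before pointing the real loop at HiveServer2.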