#!/bin/bash
# dump X many tables at a time to hdfs
# warning: this will probably put very heavy load on the source database
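# connection settings and concurrency cap -- adjust these for your environment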
max_threads=8
source=localhost
source_db=source
user=user
pass=pass
date=$(date +%Y%m%d%H%M)
# get a list of tables in the db we're syncing
tables=$( mysql -h ${source} -u${user} -p${pass} ${source_db} -BN -e "show tables;" )
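# dump the schema only (no data) for the whole database in the background: mysqldump -> lzop -> HDFS
# note: depending on your Hadoop version you may need to create /tmp/${source_db}/${date} first
# (e.g. with hadoop dfs -mkdir)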
mysqldump -h ${source} -u${user} -p${pass} --max_allowed_packet=1G --single-transaction --no-data ${source_db} | lzop | hadoop dfs -put - /tmp/${source_db}/${date}/${source_db}.schema.lzo &
for t in ${tables[@]}; do
  # throttle: don't start another dump while max_threads background jobs are still running
  while [ "$(jobs | wc -l)" -ge "${max_threads}" ]; do
    sleep 1
  done
echo "Dumping ${t}"
  mysqldump -h ${source} -u${user} -p${pass} --max_allowed_packet=1G --single-transaction -entqR ${source_db} ${t} | lzop | hadoop dfs -put - /tmp/${source_db}/${date}/${t}.lzo &
done
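# block until every remaining mysqldump process has exited, flashing a status message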
j=$(ps -ef | grep mysq[l]dump | wc -l)
while [ ${j} -gt 0 ]; do
printf "\e[5m${j} jobs running...\e[m"
sleep 10
j=$(ps -ef | grep mysq[l]dump | wc -l)
printf "\r"
done
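
# A rough sketch of how one of these dumps could be restored later (illustrative only;
# the target host, credentials, and database names below are placeholders, not part of this script):
#   hadoop dfs -cat /tmp/${source_db}/${date}/some_table.lzo | lzop -dc | \
#     mysql -h target_host -utarget_user -ptarget_pass target_db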