Skip to content

Instantly share code, notes, and snippets.

@chadoh
Last active April 2, 2017 20:06
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 2 You must be signed in to fork a gist
  • Save chadoh/4627921 to your computer and use it in GitHub Desktop.
Save chadoh/4627921 to your computer and use it in GitHub Desktop.
#!/bin/bash
# This script will switch all textual columns in your db to a new encoding
# without switching the actual bytes in those columns. If you had a dumb
# programming language like Ruby 1.8 that dumped a bunch of utf8-encoded
# strings into your latin1 database columns, then switching to a better
# programming language (like Ruby 1.9) may get you some weird results. MySQL
# will try to be helpful, and convert the stored bytes from latin1 to utf8. But
# since they're already in utf8, you get gobblidigook.
#
# Find more detail and a more complete description of the strategy used here at
# http://dba.stackexchange.com/a/33514/18310
#
# To use, just `./convert_charset db_name new_encoding` For a real db, you need
# to modify the USER, PW, and the `mysql` lines with a `-p$PW`
DB=$1
NEW_ENCODING=$2
USER='root'
PW=''
TABLES=`echo "SELECT distinct(TABLE_NAME) FROM information_schema.COLUMNS\
WHERE TABLE_SCHEMA='$DB'\
AND (DATA_TYPE='varchar' OR DATA_TYPE LIKE '%text')\
GROUP BY TABLE_NAME" | mysql -u $USER -NB`
for TABLE in $TABLES; do
# We'll construct these throughout the script, then execute at the end
BINARY_STATEMENT="ALTER TABLE $DB.$TABLE\
DEFAULT CHARSET=binary"
NEW_ENCODING_STATEMENT="ALTER TABLE $DB.$TABLE\
DEFAULT CHARSET=$NEW_ENCODING"
COLUMNS=`echo "SELECT COLUMN_NAME FROM information_schema.COLUMNS\
WHERE TABLE_SCHEMA='$DB'\
AND TABLE_NAME='$TABLE'\
AND (DATA_TYPE='varchar' OR DATA_TYPE LIKE '%text')" | mysql -u $USER -NB`
for COLUMN in $COLUMNS; do
TYPE=`echo "SELECT DATA_TYPE FROM information_schema.COLUMNS\
WHERE TABLE_SCHEMA='$DB'\
AND TABLE_NAME='$TABLE'\
AND COLUMN_NAME='$COLUMN'" | mysql -u $USER -NB`
if [ $TYPE = 'varchar' ]; then
LENGTH=`echo "SELECT CHARACTER_MAXIMUM_LENGTH FROM information_schema.COLUMNS\
WHERE TABLE_SCHEMA='$DB'\
AND TABLE_NAME='$TABLE'\
AND COLUMN_NAME='$COLUMN'" | mysql -u $USER -NB`
TYPE="$TYPE($LENGTH)"
fi
BINARY_STATEMENT="$BINARY_STATEMENT, MODIFY COLUMN \`$COLUMN\` $TYPE CHARACTER SET binary"
NEW_ENCODING_STATEMENT="$NEW_ENCODING_STATEMENT, MODIFY COLUMN \`$COLUMN\` $TYPE CHARACTER SET $NEW_ENCODING"
done
echo "switching $TABLE to binary..."
echo $BINARY_STATEMENT | mysql -u $USER
echo "switching $TABLE to $NEW_ENCODING..."
echo $NEW_ENCODING_STATEMENT | mysql -u $USER
done
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment