Skip to content

Instantly share code, notes, and snippets.

@vishalkakadiya
Last active October 31, 2022 16:48
Show Gist options
  • Save vishalkakadiya/598405dfe437373f9ed5d2f635bf3d32 to your computer and use it in GitHub Desktop.
Save vishalkakadiya/598405dfe437373f9ed5d2f635bf3d32 to your computer and use it in GitHub Desktop.
WordPress(or any mysql database) - Convert special characters(eg. Latin-1) to UTF-8 in mysql database
The actual step-by-step WordPress / MySQL fix…
1. Back up all your stuff first (likely using phpMyAdmin / CPANEL)
Before doing any of the following it strongly encouraged to back up all your data and files. Just to be safe. At the very least, your full database and the WordPress config file: wp-config.php
2. Note the settings that your WordPress is currently using (from wp-config.php)
Specifically, your MySQL database configuration, including DB_NAME, DB_USER, DB_PASSWORD, and also DB_CHARSET and DB_COLLATE
// ** MySQL settings - You can get this info from your web host ** //
/** The name of the database for WordPress */
define('DB_NAME', 'database_name_here');
/** MySQL database username */
define('DB_USER', 'username_here');
/** MySQL database password */
define('DB_PASSWORD', 'password_here');
3. Create a new MySQL database and create a new user with full privileges to that new database (both using phpMyAdmin)
After you export and “clean” you current data, you’re going to import it into this new database. Then you’ll be using this new database going forward.
4. Export your current WordPress database
The following exports all your data into a single text file (exp-my-database-latin1.sql).
mysqldump --opt --default-character-set=latin1 --skip-extended-insert --user my_db_user_here --password my_db_name_here -r exp-my-database-latin1.sql --log-error=log-mysqldump-my-database.txt
5. Change the character set in the output file (to the correct one)
replace "CHARSET=latin1" "CHARSET=utf8" "SET NAMES latin1" "SET NAMES utf8" < exp-my-database-latin1.sql > exp-my-database-utf8.sql
6. Run a compare script to sanity check your changes
diff exp-my-database-latin1.sql exp-my-database-utf8.sql
7. Import the cleaned data into the NEW database
mysql --default-character-set=utf8 --user=my_db_user_here --password='my_password_goes_here' my_db_name_here < exp-my-database-utf8.sql
8. Update your WordPress config to point at the shiny, clean, new database
Just update your wp-config.php file with the database name, login, and password of the clean, shiny new database.
Also, update your encoding settings in wp-config.php to reflect the new UTF character set:
/** Database Charset to use in creating database tables. */
define('DB_CHARSET', 'utf8');
/** The Database Collate type. Don't change this if in doubt. */
define('DB_COLLATE', '');
9. Test your changes… browse your website thoroughly and look for those funny characters
If it didn’t “work”, you’ll have funny characters all over your posts; they’ll be quite obvious as you browse 10 to 20 posts looking for them.
That’s it, you should now be able to pull up your website and enjoy the yummy UTF8 goodness, no more Latin1.
IF for whatever reason you need to back out this change, that’s very easy to do… just point your WordPress back to the original database (that you didn’t mess with).
To do that, just edit your wp-config.php file and restore the information you noted in step 1: DB_NAME, DB_USER, DB_PASSWORD, and also DB_CHARSET and DB_COLLATE. Then bring up your website and all should be good again.
# I am using wp-cli here to fix special characters.
# Export your database with character set right now latin1 charaters,
wp db export --opt --default-character-set=latin1 --skip-extended-insert
# Replace `DATABASE_NAME` with your exported file name, it will be copying original file.
cp DATABASE_NAME.sql DATABASE_NAME-UTF8.sql
# Replace `CHARSET=latin1` with `CHARSET=utf8`.
sed -i 's/CHARSET=latin1/CHARSET=utf8/g' DATABASE_NAME-UTF8.sql
# Replace `SET NAMES latin1` with `SET NAMES utf8`.
sed -i 's/SET NAMES latin1/SET NAMES utf8/g' DATABASE_NAME-UTF8.sql
# Import UTF-8 file back to the database.
wp db import DATABASE_NAME-UTF8.sql --default-character-set=utf8
After above all steps make sure your database is UTF-8 format, you can config that in wp-config.php file by defining constant `DB_CHARSET`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment