Skip to content

Instantly share code, notes, and snippets.

@hollodotme
Last active April 8, 2024 23:13
Show Gist options
  • Star 17 You must be signed in to star a gist
  • Fork 9 You must be signed in to fork a gist
  • Save hollodotme/fe24b961680e08473072 to your computer and use it in GitHub Desktop.
Save hollodotme/fe24b961680e08473072 to your computer and use it in GitHub Desktop.
Converting mysql string data form latin1 to utf8 for utf8 data stored in utf8 tables via latin1 connection
<?php
/**
* Requires php >= 5.5
*
* Use this script to convert utf-8 data in utf-8 mysql tables stored via latin1 connection
* This is a PHP port from: https://gist.github.com/njvack/6113127
*
* @link : http://www.ridesidecar.com/2013/07/30/of-databases-and-character-encodings/
*
* BACKUP YOUR DATABASE BEFORE YOU RUN THIS SCRIPT!
*
* Once the script ran over your databases, change your database connection charset to utf8:
*
* $dsn = 'mysql:host=localhost;port=3306;charset=utf8';
*
* DON'T RUN THIS SCRIPT MORE THAN ONCE!
*
* @author hollodotme
*/
header('Content-Type: text/plain; charset=utf-8');
$dsn = 'mysql:host=localhost;port=3306;charset=latin1';
$user = 'user';
$password = 'password';
$options = [
\PDO::ATTR_CURSOR => \PDO::CURSOR_FWDONLY,
\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY => true,
\PDO::MYSQL_ATTR_INIT_COMMAND => "SET CHARACTER SET latin1",
];
$dbManager = new \PDO( $dsn, $user, $password, $options );
$databasesToConvert = [ 'database1', 'database2', /** database3, ... */ ];
$typesToConvert = [ 'char', 'varchar', 'tinytext', 'mediumtext', 'text', 'longtext' ];
foreach ( $databasesToConvert as $database )
{
echo $database, ":\n";
echo str_repeat( '=', strlen( $database ) + 1 ), "\n";
$dbManager->exec( "USE `{$database}`" );
$tablesStatement = $dbManager->query( "SHOW TABLES" );
while ( ($table = $tablesStatement->fetchColumn()) )
{
echo "Table: {$table}:\n";
echo str_repeat( '-', strlen( $table ) + 8 ), "\n";
$columnsToConvert = [ ];
$columsStatement = $dbManager->query( "DESCRIBE `{$table}`" );
while ( ($tableInfo = $columsStatement->fetch( \PDO::FETCH_ASSOC )) )
{
$column = $tableInfo['Field'];
echo ' * ' . $column . ': ' . $tableInfo['Type'];
$type = preg_replace( "#\(\d+\)#", '', $tableInfo['Type'] );
if ( in_array( $type, $typesToConvert ) )
{
echo " => must be converted\n";
$columnsToConvert[] = $column;
}
else
{
echo " => not relevant\n";
}
}
if ( !empty($columnsToConvert) )
{
$converts = array_map(
function ( $column )
{
return "`{$column}` = CONVERT(CAST(CONVERT(`{$column}` USING latin1) AS binary) USING utf8)";
},
$columnsToConvert
);
$query = "UPDATE `{$table}` SET " . join( ', ', $converts );
echo "\n", $query, "\n";
$dbManager->exec( $query );
}
echo "\n--\n";
}
echo "\n";
}
@jonom
Copy link

jonom commented Sep 19, 2016

Thanks for this!

@KayakinKoder
Copy link

Fantastic. One recommendation I would make is to use utf8mb4 instead of utf8. MySQL's utf8 character set does not actually fully implement utf8, their fix in 5.5.3+ is utf8mb4: https://mathiasbynens.be/notes/mysql-utf8mb4

@szmigieldesign
Copy link

You, sir, saved me a ton of time with this script. Thank you! It's brilliantly simple and it works.

@jhfredy
Copy link

jhfredy commented Apr 30, 2019

thank you very much for this post helped me a lot brother

@pobegov
Copy link

pobegov commented Jul 4, 2019

Fantastic. One recommendation I would make is to use utf8mb4 instead of utf8. MySQL's utf8 character set does not actually fully implement utf8, their fix in 5.5.3+ is utf8mb4: https://mathiasbynens.be/notes/mysql-utf8mb4

done here with some additions (based on this original script)
https://gist.github.com/pobegov/f8b293fb6eb658a13feb1c318e6c07ed

@aleerezaee
Copy link

Great Bro! You've helped me a lot! Saved a lot of time!

@knightseven
Copy link

Thank you so much !!! That script is awesome. Solved my problem with my old mysql databases.

All the best to you !!!

@beriksson3
Copy link

Amazing, thank you!!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment