@joostvanveen
Created December 31, 2012 10:52
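Check whether a string is valid UTF-8 and convert it to UTF-8 if it is not. The conversion uses utf8_encode(), so any non-UTF-8 input is assumed to be ISO-8859-1. Accidentally posted this as anonymous before :(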
<?php
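// Return $string unchanged if it is already valid UTF-8; otherwise convert it.
// utf8_encode() assumes ISO-8859-1 input and is deprecated as of PHP 8.2.
function getUtf8String($string) {
    if ( !isUtf8($string) ) {
        return utf8_encode($string);
    }
    return $string;
}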
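// Return true if $string is well-formed UTF-8.
function isUtf8($string) {
    if ( function_exists("mb_check_encoding") ) {
        return mb_check_encoding($string, 'UTF-8');
    }
    // Fallback without mbstring: match the bytes against the well-formed
    // UTF-8 sequences. The cast makes the result a bool, like
    // mb_check_encoding(); preg_match() returns 1, 0, or false on error.
    return (bool) preg_match('%^(?:
          [\x09\x0A\x0D\x20-\x7E]            # ASCII (printable, tab, LF, CR)
        | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
        | \xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
        | \xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
        | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
        | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
        | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
    )*$%xs', $string);
}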
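Since utf8_encode() is deprecated as of PHP 8.2, here is a minimal sketch of the same check-then-convert idea using only mbstring. It is not part of the gist: the function name toUtf8OrConvert and the $assumedSource parameter are hypothetical, the mbstring extension is assumed to be loaded, and, like utf8_encode(), it assumes that input which is not valid UTF-8 is ISO-8859-1 unless told otherwise.

```php
<?php
// Hypothetical helper, not from the gist. Requires the mbstring extension.
// $assumedSource is a guess about the input encoding; it defaults to
// ISO-8859-1 because that is what utf8_encode() assumes.
function toUtf8OrConvert(string $string, string $assumedSource = 'ISO-8859-1'): string
{
    // Already well-formed UTF-8: return it untouched.
    if (mb_check_encoding($string, 'UTF-8')) {
        return $string;
    }
    // Otherwise reinterpret the bytes as $assumedSource and re-encode as UTF-8.
    return mb_convert_encoding($string, 'UTF-8', $assumedSource);
}

$latin1 = "caf\xE9";      // "café" in ISO-8859-1 (é is the single byte E9)
$utf8   = "caf\xC3\xA9";  // "café" in UTF-8 (é is the two bytes C3 A9)

var_dump(toUtf8OrConvert($latin1) === $utf8); // bool(true): converted
var_dump(toUtf8OrConvert($utf8) === $utf8);   // bool(true): left as-is
```

If the real source encoding is something else (Windows-1252 is a common case), passing it as the second argument avoids mis-decoding; the check itself cannot tell which legacy encoding the bytes came from.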