@ceejayoz
Created July 2, 2009 20:15
Replaces MS Word "smart" punctuation in HTML content with plain ASCII equivalents, then encodes any remaining non-ASCII characters as HTML entities.
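// The byte values below are Windows-1252 code points, so these replacements
// assume $text is still single-byte Windows-1252. On UTF-8 input the same
// bytes occur inside multi-byte characters and would be corrupted.
$text = str_replace(chr(130), ',', $text); // baseline single quote
$text = str_replace(chr(132), '"', $text); // baseline double quote
$text = str_replace(chr(133), '...', $text); // ellipsis
$text = str_replace(chr(145), "'", $text); // left single quote
$text = str_replace(chr(146), "'", $text); // right single quote
$text = str_replace(chr(147), '"', $text); // left double quote
$text = str_replace(chr(148), '"', $text); // right double quote
// Encode remaining non-ASCII characters as HTML entities. This call treats
// $text as UTF-8; the 'HTML-ENTITIES' target is deprecated as of PHP 8.2.
$text = mb_convert_encoding($text, 'HTML-ENTITIES', 'UTF-8');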
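If $text is already UTF-8, as the final mb_convert_encoding() call suggests, the same seven characters could be matched by their Unicode code points instead of by single bytes. The following is only a sketch under that assumption: the function name replace_word_punctuation is hypothetical and not part of the original snippet, and it needs PHP 7.0 or later for the "\u{...}" escapes. It covers the replacement step only, not the entity encoding.

```php
<?php
// Hypothetical helper, not part of the original gist.
// Assumes $text is valid UTF-8.
function replace_word_punctuation(string $text): string
{
    // Unicode equivalents of the Windows-1252 bytes used in the original.
    $map = [
        "\u{201A}" => ',',   // baseline single quote (byte 130)
        "\u{201E}" => '"',   // baseline double quote (byte 132)
        "\u{2026}" => '...', // ellipsis (byte 133)
        "\u{2018}" => "'",   // left single quote (byte 145)
        "\u{2019}" => "'",   // right single quote (byte 146)
        "\u{201C}" => '"',   // left double quote (byte 147)
        "\u{201D}" => '"',   // right double quote (byte 148)
    ];

    // strtr() with an array replaces each complete multi-byte sequence
    // in a single pass and leaves all other characters untouched.
    return strtr($text, $map);
}

echo replace_word_punctuation("\u{201C}Hi\u{201D}\u{2026}"), "\n"; // prints "Hi"...
```

Because only complete multi-byte sequences are matched, other non-ASCII characters such as accented letters pass through unchanged.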