Skip to content

Instantly share code, notes, and snippets.

@joostvanveen
Created December 31, 2012 10:59
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 2 You must be signed in to fork a gist
  • Save joostvanveen/4419014 to your computer and use it in GitHub Desktop.
Save joostvanveen/4419014 to your computer and use it in GitHub Desktop.
Convert a Windows CP1252 string to Latin1. Converts the typical Windows (Word) crappy display characters like curly quotation marks, triple dots, etc. Windows-1252 or CP-1252 is a character encoding of the Latin alphabet, used by default in the legacy components of Microsoft Windows in English and some other Western languages. It is one version …
<?php
function transcribe_cp1252_to_latin1($cp1252) {
return strtr(
$cp1252,
array(
"\x80" => "e", "\x81" => " ", "\x82" => "'", "\x83" => 'f',
"\x84" => '"', "\x85" => "…", "\x86" => "+", "\x87" => "#",
"\x88" => "^", "\x89" => "0/00", "\x8A" => "S", "\x8B" => "<",
"\x8C" => "OE", "\x8D" => " ", "\x8E" => "Z", "\x8F" => " ",
"\x90" => " ", "\x91" => "`", "\x92" => "'", "\x93" => '"',
"\x94" => '"', "\x95" => "*", "\x96" => "-", "\x97" => "--",
"\x98" => "~", "\x99" => "(™)", "\x9A" => "s", "\x9B" => ">",
"\x9C" => "oe", "\x9D" => " ", "\x9E" => "z", "\x9F" => "Y",
"…" => "..."
)
);
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment