@lifo101
Last active October 8, 2015 16:49
htmlentities() replacement that converts all non-ASCII characters in a string into numeric character references (suitable for XML documents)
<?php
/**
 * Works like htmlentities(), but encodes non-ASCII characters as numeric
 * character references instead of named entities (for use in XML).
 *
 * @param string $str      Input string.
 * @param string $encoding Encoding of $str.
 * @param bool   $asList   If true, return an array of ASCII characters and
 *                         integer code points instead of the encoded string.
 * @return string|array
 */
function xmlentities($str, $encoding = 'UTF-8', $asList = false)
{
    $list = array();
    $ent = '';
    // FYI: 'UCS-4BE' encodes every character as a 4-byte, 32-bit big-endian int.
    // Convert the whole string once so that $encoding is honored.
    $ucs4 = mb_convert_encoding($str, 'UCS-4BE', $encoding);
    $len = mb_strlen($ucs4, 'UCS-4BE');
    for ($i = 0; $i < $len; $i++) {
        $s = mb_substr($ucs4, $i, 1, 'UCS-4BE');
        list(, $val) = unpack('N', $s);
        if ($val < 128) {
            // ASCII: keep the character, escaping the XML-special ones.
            $list[] = chr($val);
            $ent .= htmlspecialchars(chr($val), ENT_QUOTES, $encoding);
        } else {
            // Non-ASCII: emit a decimal numeric character reference.
            $list[] = $val;
            $ent .= '&#' . $val . ';';
        }
    }
    return $asList ? $list : $ent;
}
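
The conversion step at the heart of the function can be tried in isolation. The following is a minimal sketch, not part of the gist, that walks one character through the same calls the function uses (mb_convert_encoding() to UCS-4BE, then unpack('N')). It assumes the mbstring extension is loaded and PHP 7.0 or later for the "\u{...}" string escape; the variable names are illustrative only.

```php
<?php
// Illustrative sketch: one character through the UCS-4BE conversion step.
// Assumes the mbstring extension and PHP 7.0+ (for the "\u{...}" escape).
$chr = "\u{e9}";                                        // "é" (U+00E9) as UTF-8
$ucs4 = mb_convert_encoding($chr, 'UCS-4BE', 'UTF-8');  // four bytes: 00 00 00 E9
list(, $codePoint) = unpack('N', $ucs4);                // read as big-endian uint32
$reference = '&#' . $codePoint . ';';                   // decimal character reference

echo bin2hex($ucs4), ' ', $codePoint, ' ', $reference, "\n"; // 000000e9 233 &#233;
```

Because UCS-4BE is fixed-width, the unpacked integer is the Unicode code point itself, which is why it can be written directly into the `&#...;` reference without any further decoding.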