@cmbirk
Created September 13, 2013 00:54
PHP script to strip entities from XML
<?php
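// Entities that are replaced by hand, paired by position with the
// plain characters in $convertedChar below.
$unconvertedEntity = array(
    '&percnt;',
    '&numsp;',
    '&lowbar;'
);
$convertedChar = array(
    '%',
    ' ',
    '_'
);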
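// Usage: php <this script> <file.xml>; the result is written under stripped/.
if ($argc < 2 || !is_readable($argv[1])) {
    fwrite(STDERR, "Usage: php {$argv[0]} <file.xml>\n");
    exit(1);
}
$filename = $argv[1];
$xml = file_get_contents($filename);
// Apply the hand-picked replacements first so they take precedence.
$xml = str_replace($unconvertedEntity, $convertedChar, $xml);
// Decode each entity, then re-escape the result for XML. This keeps &amp;, &lt;,
// &gt; and quote entities escaped, and turns bare or unknown "&" into &amp;.
$xml = preg_replace_callback('/&(?:#?[A-Za-z0-9]+;)?/', function ($m) {
    $char = html_entity_decode($m[0], ENT_QUOTES | ENT_HTML5, 'UTF-8');
    return htmlspecialchars($char, ENT_QUOTES | ENT_XML1, 'UTF-8');
}, $xml);
// Create the output directory if it does not exist yet.
$out = 'stripped/' . $filename;
if (!is_dir(dirname($out))) {
    mkdir(dirname($out), 0777, true);
}
file_put_contents($out, $xml);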
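
The entity handling can also be tried on a plain string, without reading or writing any files. The following is a minimal sketch rather than the gist author's code: `stripEntities()` is a hypothetical helper name, and the sketch assumes that XML special characters (`&`, `<`, `>`, quotes) should remain escaped so the output stays well-formed.

```php
<?php
// Hypothetical helper (not part of the original gist): converts HTML
// entities in an XML string to literal characters while keeping the
// XML special characters escaped.
function stripEntities($xml)
{
    // Hand-picked replacements from the gist, applied first.
    $xml = str_replace(
        array('&percnt;', '&numsp;', '&lowbar;'),
        array('%', ' ', '_'),
        $xml
    );

    // Match either a complete entity reference or a bare ampersand.
    return preg_replace_callback('/&(?:#?[A-Za-z0-9]+;)?/', function ($m) {
        // Decode using the HTML5 entity table...
        $char = html_entity_decode($m[0], ENT_QUOTES | ENT_HTML5, 'UTF-8');
        // ...then re-escape anything that is special in XML. Unknown
        // entities come back unchanged from the decoder, so their
        // leading "&" is escaped here as well.
        return htmlspecialchars($char, ENT_QUOTES | ENT_XML1, 'UTF-8');
    }, $xml);
}

echo stripEntities('<p>50&percnt; &eacute; &lt;b&gt; AT&T</p>'), "\n";
// prints: <p>50% é &lt;b&gt; AT&amp;T</p>
```

Decoding one entity at a time and re-escaping it immediately avoids a separate pass that escapes every `&` after the fact, which would also double-escape entities that are meant to stay.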