  • Save cliffordp/c849d40cf9a91e1550d3711f4c577f31 to your computer and use it in GitHub Desktop.
Use DOMDocument to balance tags more robustly than force_balance_tags().
<?php
/**
 * Use DOMDocument to balance tags more robustly than force_balance_tags().
 *
 * "force_balance_tags() is not a really safe function. It doesn’t use an HTML parser
 * but a bunch of potentially expensive regular expressions. You should use it only if
 * you control the length of the excerpt too. Otherwise you could run into memory issues
 * or some obscure bugs." <http://wordpress.stackexchange.com/a/89169/8521>
 *
 * For more reasons not to use regular expressions on markup, see http://stackoverflow.com/a/1732454/93579
 *
 * @link http://wordpress.stackexchange.com/questions/89121/why-doesnt-default-wordpress-page-view-use-force-balance-tags
 * @see force_balance_tags()
 *
 * @param string $markup HTML fragment whose tags may be unbalanced.
 * @return string The fragment with its tags balanced by the HTML parser.
 */
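function force_balanced_tags2( $markup ) {
	$dom = new DOMDocument();

	// Imperfect markup is the expected input, so collect libxml parse warnings instead of emitting them.
	$previous = libxml_use_internal_errors( true );

	// Note the meta charset is used to prevent UTF-8 data from being interpreted as Latin1, thus corrupting it.
	$html  = '<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body>';
	$html .= $markup;
	$html .= '</body></html>';
	$dom->loadHTML( $html );

	// Discard the collected warnings and restore the caller's libxml error-handling setting.
	libxml_clear_errors();
	libxml_use_internal_errors( $previous );

	// Serialize only the body element, then strip its wrapper tags to leave the balanced fragment.
	$body   = $dom->getElementsByTagName( 'body' )->item( 0 );
	$markup = str_replace( array( '<body>', '</body>' ), '', $dom->saveHTML( $body ) );
	return $markup;
}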
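
The function above removes the body wrapper with str_replace(). A possible alternative, sketched below, serializes each child node of the body instead, so no string replacement is needed. The name balance_tags_via_child_nodes() is hypothetical and not part of the original gist, and the warning suppression via libxml_use_internal_errors() is an added assumption about how callers want parse warnings handled.

```php
<?php
/**
 * Hypothetical variant (not from the original gist): balance tags by
 * serializing the child nodes of <body> rather than stripping the
 * wrapper tags with str_replace().
 *
 * @param string $markup HTML fragment whose tags may be unbalanced.
 * @return string The fragment as re-serialized by the HTML parser.
 */
function balance_tags_via_child_nodes( $markup ) {
	$dom = new DOMDocument();

	// Assumption: callers do not want libxml parse warnings emitted for imperfect markup.
	$previous = libxml_use_internal_errors( true );

	// Same wrapper as the original function; the meta charset keeps UTF-8 from being read as Latin1.
	$dom->loadHTML(
		'<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body>'
		. $markup
		. '</body></html>'
	);

	libxml_clear_errors();
	libxml_use_internal_errors( $previous );

	$body = $dom->getElementsByTagName( 'body' )->item( 0 );
	if ( null === $body ) {
		return '';
	}

	// Concatenate the serialized form of every direct child of <body>.
	$balanced = '';
	foreach ( $body->childNodes as $child ) {
		$balanced .= $dom->saveHTML( $child );
	}

	return $balanced;
}

// Usage: the unclosed <b> and <p> elements are closed by the parser.
echo balance_tags_via_child_nodes( '<p>Hello <b>world' );
```

Iterating over childNodes avoids touching any literal "<body>" text that might appear inside the content, at the cost of a few extra lines.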