@loilo

loilo/manipulate_html.php

Last active Jul 18, 2020
Modify HTML Using PHP

Instead of relying on unsafe regular expressions and string manipulation, we can utilize PHP's built-in DOM extension for modifying HTML.
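To illustrate why string-level edits are fragile, the following sketch applies the same change twice: once with a naive regular expression and once through the DOM extension. The sample markup, the pattern, and the variable names are made up for this illustration and are not part of the snippet itself.

```php
<?php
// Assumed sample input: the alt attribute legitimately contains a ">" character.
$html = '<img alt="a > b" src="foo.jpg">';

// Naive regex approach: "[^>]*" stops at the ">" inside the attribute value,
// so the new attribute is spliced into the middle of the alt text.
$viaRegex = preg_replace('/<img([^>]*)>/', '<img$1 loading="lazy">', $html);

// DOM approach: the parser understands attribute quoting, so the tag stays intact.
$dom = new DOMDocument();
$previous = libxml_use_internal_errors(true);
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
libxml_clear_errors();
libxml_use_internal_errors($previous);

foreach ($dom->getElementsByTagName('img') as $img) {
    $img->setAttribute('loading', 'lazy');
}

$viaDom = trim($dom->saveHTML());

echo $viaRegex, PHP_EOL; // the alt attribute is broken apart
echo $viaDom, PHP_EOL;   // one well-formed <img> carrying all three attributes
```

The exact serialization of the DOM result (for example, whether `>` is written as an entity) depends on libxml, but the parsed attribute values are preserved either way.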

The manipulate_html() function from this snippet takes an HTML string, passes each of its DOM nodes to a callback where it can be inspected or modified, and returns the resulting HTML.

The following example modifies all images in an HTML snippet to use lazy loading:

manipulate_html('<img src="foo.jpg">', function (DOMNode $node) {
    // setAttribute() is defined on DOMElement, not on every DOMNode
    if ($node instanceof DOMElement && $node->nodeName === 'img') {
        $node->setAttribute('loading', 'lazy');
    }
});

// Returns '<img src="foo.jpg" loading="lazy">'

This is just a single element, but you can test this snippet with any website:

manipulate_html(
    file_get_contents('https://www.php.net/'),
    function (DOMNode $node) {
        // ...
    }
);
manipulate_html.php

<?php

/**
 * Recursively visit every descendant of $domNode and pass it to $callback.
 */
function walk_dom(DOMNode $domNode, callable $callback): void
{
    foreach ($domNode->childNodes as $node) {
        $callback($node);

        if ($node->hasChildNodes()) {
            walk_dom($node, $callback);
        }
    }
}

/**
 * Parse $html, pass every DOM node to $callback, and return the modified HTML.
 */
function manipulate_html(string $html, callable $callback): string
{
    $dom = new DOMDocument();

    // Collect libxml errors internally instead of emitting warnings for malformed HTML
    $previousXmlErrorBehavior = libxml_use_internal_errors(true);

    // Use XML processing instruction to properly interpret document as UTF-8
    @$dom->loadHTML(
        '<?xml encoding="utf-8" ?>' . $html,
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );

    // Remove the processing instruction again. Iterate over a snapshot because
    // childNodes is a live list and removing nodes while looping can skip siblings.
    foreach (iterator_to_array($dom->childNodes) as $item) {
        if ($item->nodeType === XML_PI_NODE) {
            $dom->removeChild($item);
        }
    }

    $dom->encoding = 'UTF-8';

    walk_dom($dom, $callback);

    // Turn DOM back into HTML and remove leading/trailing whitespace
    $result = trim($dom->saveHTML());

    // Discard the collected errors and restore previous XML error behavior
    libxml_clear_errors();
    libxml_use_internal_errors($previousXmlErrorBehavior);

    return $result;
}
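
The XML processing instruction that manipulate_html() prepends before parsing can be tried in isolation. The sketch below parses the same UTF-8 string with and without that prefix and reads the text back. The helper first_paragraph_text() and the sample string are hypothetical and exist only for this illustration. What the parse without the prefix yields depends on the libxml version, so only the prefixed case is checked.

```php
<?php
// Hypothetical helper for this illustration: parse $html and return the text
// content of its first <p> element.
function first_paragraph_text(string $html): string
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $dom->getElementsByTagName('p')->item(0)->textContent;
}

// "café", written with an escape sequence so the source file encoding does not matter
$expected = "caf\u{00e9}";
$html = '<p>' . $expected . '</p>';

// No encoding information: libxml has to guess how to decode the bytes
$withoutHint = first_paragraph_text($html);

// Same prefix that manipulate_html() uses to declare the input as UTF-8
$withHint = first_paragraph_text('<?xml encoding="utf-8" ?>' . $html);

var_dump($withHint === $expected);
var_dump($withoutHint === $expected); // may be false when the guess is not UTF-8
```

Because the prefix is parsed as a node of its own, manipulate_html() removes it again before serializing the document.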