@jonlabelle
Last active May 31, 2024 13:44
This Regular Expression removes all attributes and values from an HTML tag, preserving the tag itself and textual content (if found).

Strip HTML Attributes

<([a-z][a-z0-9]*)[^>]*?(/?)>
token          explanation
<              match < at the beginning of a tag
(              start capture group $1 (tag name)
[a-z]          match a through z
[a-z0-9]*      match a through z or 0 through 9, zero or more times
)              end capture group
[^>]*?         match anything other than >, zero or more times, non-greedy (won't eat the /)
(/?)           capture group $2: / if it is there
>              match >

Add whatever delimiters and quoting your language requires, and use the replacement text <$1$2>. This strips everything after the tag name up to the end of the tag, whether that is /> or just >.
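
To see what the two capture groups hold before any replacement happens, the matches can be listed per tag. This is a minimal sketch using `preg_match_all` with `PREG_SET_ORDER`; the sample markup and variable names are illustrative assumptions, and `#` is used as the pattern delimiter so the `/` inside the pattern needs no escaping.

```php
<?php
// Illustrative sample markup (an assumption, not taken from the gist).
$html = '<p class="a"><br class="b" /></p>';

// '#' is the delimiter, so the literal '/' in (/?) needs no escaping.
$pattern = '#<([a-z][a-z0-9]*)[^>]*?(/?)>#i';

preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$groups = [];
foreach ($matches as $m) {
    // $m[0] is the whole tag, $m[1] the tag name, $m[2] is '/' or ''.
    $groups[] = [$m[1], $m[2]];
}

print_r($groups);
```

Only the opening `<p ...>` and the self-closing `<br ... />` are matched. The closing `</p>` is skipped because the character after `<` must be a letter, which is why closing tags pass through the replacement untouched.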

Example

Before

HTML containing style attributes.

<p style="padding:0px;">
	<strong style="padding:0;margin:0;">hello</strong>
</p>

After

HTML attributes removed.

<p>
	<strong>hello</strong>
</p>

PHP Example

$with_attr    = '<p style="padding:0px;"><strong style="padding:0;margin:0;">hello</strong></p>';
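// Use # as the delimiter: with / as the delimiter, the unescaped / inside (/?) would end the pattern early.
$without_attr = preg_replace('#<([a-z][a-z0-9]*)[^>]*?(/?)>#i', '<$1$2>', $with_attr);

echo $without_attr;
// <p><strong>hello</strong></p>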
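
For reuse, the replacement can be wrapped in a small function. The sketch below is one possible shape, not part of the original gist: the name `strip_html_attributes` and the sample input are hypothetical, and the function falls back to the unmodified input if `preg_replace` returns `null` on a regex error.

```php
<?php
/**
 * Hypothetical helper: removes attributes from opening and self-closing tags.
 */
function strip_html_attributes(string $html): string
{
    // '#' as delimiter avoids clashing with the '/' inside (/?).
    $result = preg_replace('#<([a-z][a-z0-9]*)[^>]*?(/?)>#i', '<$1$2>', $html);

    // preg_replace returns null on failure; keep the input in that case.
    return $result ?? $html;
}

// Illustrative input with an ordinary tag and a self-closing tag.
$plain = strip_html_attributes('<a href="#top" class="nav">top</a><img src="x.png" alt="" />');
echo $plain, PHP_EOL;
```

Whitespace before the closing `/` is dropped along with the attributes, so `<img ... />` comes back as `<img/>`. Because the pattern stops at the first `>`, an attribute value that itself contains `>` would cut the match short, so this is best suited to simple, well-formed markup.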

Source: Stack Overflow post.

miguelgisbert commented Jul 22, 2022

I'll answer myself. Here's how to remove HTML tags and put them back again: https://gist.github.com/miguelgisbert/7ef9ee15aa0cc1ba32ea5ed192e486c3

    $str1 = "<p style='color:red;'>red</p><strong style='color:green;'>green</strong>";
    $pattern = '/<[^>]+>/';

    preg_match_all($pattern, $str1, $matches);
    $replacements = $matches[0];
    $str2 = preg_replace($pattern, '<>', $str1);

    // Translate $str2 with DeepL, or do whatever else you need, without the HTML tags

    $str3 = preg_replace_callback('/<>/', function($matches) use (&$replacements) {
        return array_shift($replacements);
    }, $str2);

    echo "str1 ".$str1."<br>";
    echo "str2 ".$str2."<br>";
    echo "str3 ".$str3."<br>";
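
The same strip-and-restore idea can be split into two functions so the step in the middle is easy to swap out. This is a sketch under assumptions: the names `extract_tags` and `restore_tags` are hypothetical, and `str_replace` merely stands in for a real translation call.

```php
<?php
// Hypothetical helper: swaps every tag for the '<>' placeholder.
function extract_tags(string $html): array
{
    $pattern = '/<[^>]+>/';
    preg_match_all($pattern, $html, $matches);

    // Returns [text with placeholders, original tags in document order].
    return [preg_replace($pattern, '<>', $html), $matches[0]];
}

// Hypothetical helper: puts the saved tags back, one per placeholder.
function restore_tags(string $text, array $tags): string
{
    return preg_replace_callback('/<>/', function () use (&$tags) {
        // If the saved tags run out, leave the placeholder as it is.
        return array_shift($tags) ?? '<>';
    }, $text);
}

[$stripped, $tags] = extract_tags("<p style='color:red;'>red</p>");
$translated = str_replace('red', 'rojo', $stripped); // stand-in for a translation step
$restored = restore_tags($translated, $tags);

echo $restored, PHP_EOL;
```

The round trip only works if the step in the middle keeps the `<>` placeholders in the same number and order, since tags are restored purely by position.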
