Skip to content

Instantly share code, notes, and snippets.

@taufik-nurrohman
Created January 17, 2015 00:45
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save taufik-nurrohman/b3b683f28d899e19f830 to your computer and use it in GitHub Desktop.
Save taufik-nurrohman/b3b683f28d899e19f830 to your computer and use it in GitHub Desktop.
Regex to Extract HTML Attributes
<?php
/*! Based on <https://github.com/mecha-cms/cms/blob/master/system/kernel/converter.php> */
function extract_html_attributes($input) {
if( ! preg_match('#^(<)([a-z0-9\-._:]+)((\s)+(.*?))?((>)([\s\S]*?)((<)\/\2(>))|(\s)*\/?(>))$#im', $input, $matches)) return false;
$matches[5] = preg_replace('#(^|(\s)+)([a-z0-9\-]+)(=)(")(")#i', '$1$2$3$4$5<attr:value>$6', $matches[5]);
$results = array(
'element' => $matches[2],
'attributes' => null,
'content' => isset($matches[8]) && $matches[9] == '</' . $matches[2] . '>' ? $matches[8] : null
);
if(preg_match_all('#([a-z0-9\-]+)((=)(")(.*?)("))?(?:(\s)|$)#i', $matches[5], $attrs)) {
$results['attributes'] = array();
foreach($attrs[1] as $i => $attr) {
$results['attributes'][$attr] = isset($attrs[5][$i]) && ! empty($attrs[5][$i]) ? ($attrs[5][$i] != '<attr:value>' ? $attrs[5][$i] : "") : $attr;
}
}
return $results;
}
@taufik-nurrohman
Copy link
Author

Test Code

<?php

$test = array(
    '<div class="foo" id="bar" data-test="1000">',
    '<div>',
    '<div class="foo" id="bar" data-test="1000">test content</div>',
    '<div>test content</div>',
    '<div>test content</span>',
    '<div>test content',
    '<div></div>',
    '<div class="foo" id="bar" data-test="1000"/>',
    '<div class="foo" id="bar" data-test="1000" />',
    '< div  class="foo"     id="bar"   data-test="1000"       />',
    '<div class id data-test>',
    '<id="foo" data-test="1000">',
    '<id data-test>',
    '<select name="foo" id="bar" empty-value-test="" selected disabled><option value="1">Option 1</option></select>'
);

foreach($test as $t) {
    var_dump($t, extract_html_attributes($t));
    echo '<hr>';
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment