@nevstokes
Created December 17, 2011 22:33
Regex to extract site-relative stylesheets from HTML page source
$regex = '/
(?= # Lookahead:
<link # Open link tag
(?:\s+[^>]*)? # Any leading data (non-capturing)
\s+rel= # rel attribute preceded by one or more spaces
([\'"]) # Opening quote for attribute ($matches[1] for backreference)
stylesheet # Match attribute value
\1 # Backreference to match quotes
.*? # Consume any subsequent data non-greedily
\/? # Optionally self-close
> # Close link tag
) #
<link # Open link tag
(?:\s+[^>]*)? # Any leading data (non-capturing)
\s+href= # href attribute preceded by one or more spaces
([\'"]) # Opening quote for attribute ($matches[2] for backreference)
(?! # Negative lookahead:
(?:https?:)?\/\/ # Not interested in non-local files (non-capturing)
) #
(.+?) # The href attribute value ($matches[3])
\2 # Backreference to the href opening quote ($matches[2])
.*? # Consume any subsequent data non-greedily
\/? # Optionally self-close
> # Close link tag
/mixs';
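
A minimal usage sketch, assuming the pattern is applied with `preg_match_all` and the paths are read from capture group 3. The sample markup and the `$html` and `$stylesheets` names are hypothetical. The block repeats a condensed copy of the pattern (comments dropped, href value closed by a backreference to its own opening quote, group 2) only so that it runs on its own.

```php
<?php
// Condensed copy of the pattern; the x flag lets it span several lines.
$regex = '/
    (?=<link(?:\s+[^>]*)?\s+rel=([\'"])stylesheet\1.*?\/?>)
    <link(?:\s+[^>]*)?\s+href=([\'"])(?!(?:https?:)?\/\/)(.+?)\2.*?\/?>
/mixs';

// Hypothetical sample markup, not taken from any real page.
$html = <<<'HTML'
<link rel="stylesheet" href="/css/main.css">
<link href='css/print.css' rel="stylesheet" media="print" />
<link rel="stylesheet" href="https://cdn.example.com/lib.css">
<link rel="stylesheet" href="//cdn.example.com/other.css">
<link rel="icon" href="/favicon.ico">
HTML;

// Group 3 holds the href value of each site-relative stylesheet link.
preg_match_all($regex, $html, $matches);
$stylesheets = $matches[3];

print_r($stylesheets);
```

With this sample, only the first two links should be collected: the absolute and protocol-relative URLs are rejected by the negative lookahead, and the icon link fails the `rel` lookahead. The second link mixes single and double quotes, which is the case where the href closing quote has to be matched against group 2 rather than group 1.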