@Shimmermare
Last active November 29, 2019 19:09
Regex for matching HTML tag
!!! DON'T USE REGEX TO PARSE HTML UNLESS YOU ABSOLUTELY HAVE TO !!!
Match full tag: <\s*(\w+)(?:\s*\w+=(?:"[^"]*"|'[^']*'))*?\s*>(.*?)<\s*\/\s*\1\s*>
Group 1: tag name
Group 2: tag content
! Python's built-in re keeps only the last match of a repeated capture group - if you add capture groups inside the repeated attribute part and need every repetition, use https://pypi.org/project/regex/ !
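A minimal usage sketch of the full-tag pattern, assuming Python's standard-library re module (which accepts the pattern as written, since no capture group sits inside the repeated attribute part). The names FULL_TAG and sample are made up for this illustration and are not part of the gist.

```python
import re

# Full-tag pattern copied from above. FULL_TAG is a hypothetical name.
FULL_TAG = re.compile(
    r"""<\s*(\w+)(?:\s*\w+=(?:"[^"]*"|'[^']*'))*?\s*>(.*?)<\s*\/\s*\1\s*>"""
)

# Hypothetical input: one tag with a double-quoted and a single-quoted attribute.
sample = '<a href="https://example.com" class=\'link\'>Example</a>'

match = FULL_TAG.search(sample)
tag_name = match.group(1)     # Group 1: tag name
tag_content = match.group(2)  # Group 2: tag content
print(tag_name, tag_content)
```

Note that `.` does not match newlines by default, so content spanning several lines would need the re.DOTALL flag, and nested tags of the same name are cut at the first closing tag.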
Match self-closing tag: <\s*(\w+)(?:\s*\w+=(?:"[^"]*"|'[^']*'))*?\s*\/\s*>
Group 1: tag name
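A short sketch of the self-closing pattern in use, again assuming Python's standard-library re module. The single-quoted attribute branch is written as '[^']*' here, the same form as in the full-tag pattern. SELF_CLOSING and sample are hypothetical names chosen for the example.

```python
import re

# Self-closing pattern; the single-quote branch mirrors the full-tag pattern.
# SELF_CLOSING is a hypothetical name.
SELF_CLOSING = re.compile(
    r"""<\s*(\w+)(?:\s*\w+=(?:"[^"]*"|'[^']*'))*?\s*\/\s*>"""
)

# Hypothetical input mixing self-closing tags with a regular tag.
sample = "<br/> <img src='cat.png' alt=\"A cat\" /> <p>text</p>"

# Collect Group 1 (tag name) for every self-closing tag found.
self_closing_names = [m.group(1) for m in SELF_CLOSING.finditer(sample)]
print(self_closing_names)
```

The regular <p> tag is skipped because the pattern requires a slash right before the closing angle bracket.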