@kch kch/strip_html.rb
Created Apr 21, 2009

my super duper html stripper regexp
class String
  # Removes HTML/XML-style tags from the receiver in place.
  # gsub! returns nil when nothing matched, so self is returned explicitly
  # to make the result always usable.
  def strip_html!
    gsub!(/
      <
      \/?                     # optional end tag
      ([\w:-]+)               # tag name (capturing)
      (?:                     # optional attribute set (allowing even for end tags)
        (?:                   # group for attribute repetition
          \s+                 # mandatory space before first attribute
          [\w:-]+             # attribute name
          (?:                 # optional attribute value
            \s*=\s*           # optionally space-wrapped equal sign
            (?:               # attribute value group (for |)
              '[^']*' |       # either a single quoted attribute '#happy color coding
              "[^"]*" |       # or a double quoted attribute "#happy color coding
              [^\s>]+         # or a non-space non tag end value
            )                 # end attribute value group
          )?                  # attr value is optional
        )*                    # can have zero or more attributes
      )?                      # may not have attributes at all
      \s*                     # optional trailing spaces
      \/?                     # optional self-closing empty tag
      >                       # end tag
    /ix, '')
    self
  end

  # Non-destructive variant: strips tags from a copy and leaves the receiver alone.
  def strip_html
    dup.strip_html!
  end
end
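
A quick way to see what the pattern does and does not remove is to run it against a few inputs. The sketch below is not part of the gist. So that it runs on its own, it restates the pattern in condensed one-line form under the hypothetical names `TAG_SKETCH` and `strip_tags_sketch`; treating that condensed form as equivalent to the commented pattern is an assumption.

```ruby
# Hypothetical condensed restatement of the gist's pattern (an assumption,
# not the author's code): tag name, zero or more attributes, optional slash.
TAG_SKETCH = %r{</?[\w:-]+(?:\s+[\w:-]+(?:\s*=\s*(?:'[^']*'|"[^"]*"|[^\s>]+))?)*\s*/?>}i

# Hypothetical helper: returns a copy of +str+ with every tag match removed.
def strip_tags_sketch(str)
  str.gsub(TAG_SKETCH, '')
end

puts strip_tags_sketch(%q{<p class='x'>Hello <b>world</b></p>})  # Hello world
puts strip_tags_sketch(%q{<a title="x > y">link</a>})            # link
puts strip_tags_sketch('line<br/>break')                          # linebreak
puts strip_tags_sketch('a < b and c > d')                         # left unchanged
puts strip_tags_sketch('<!-- note -->text')                       # left unchanged
```

Quoted attribute values may contain `>` without ending the match early, and a bare `<` or `>` in running text is left alone because a tag name must follow the `<` directly. Comments and doctype declarations are not matched, and the text inside `<script>` or `<style>` elements survives once its tags are gone, so this is probably better suited to light cleanup than to sanitizing untrusted HTML.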