@rngtng
Created January 26, 2011 11:14
Fix irregular HTML by replacing each unclosed '<' with '&lt;'
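module SanitizeHelper
  # Replaces every '<' that is never closed by a '>' with '&lt;'.
  def fix_irregular_html(html)
    # A '<' counts as unclosed when another '<' or the end of a line
    # comes before any '>'.
    regexp = /<([^<>]*)(<|$)/
    # We need to run this multiple times: each match consumes the next '<',
    # so overlapping matches are only found on a later pass.
    while (fixed_html = html.gsub(regexp, "&lt;\\1\\2")) && fixed_html != html
      html = fixed_html
    end
    fixed_html
  end
end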
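
The loop above exists only because each match consumes the following '<'. A possible alternative, sketched here as an assumption and not part of the original gist, is to move that check into a zero-width lookahead so that a single gsub pass is enough. The module name SanitizeHelperSinglePass and the constant UNCLOSED_LT are hypothetical names chosen for illustration.

```ruby
# Hypothetical single-pass variant; not part of the original gist.
module SanitizeHelperSinglePass
  # A '<' is unclosed when another '<' or the end of a line comes before
  # any '>'. The lookahead checks this without consuming any characters,
  # so matches cannot overlap and no loop is needed.
  UNCLOSED_LT = /<(?=[^<>]*(?:<|$))/

  def self.fix_irregular_html(html)
    html.gsub(UNCLOSED_LT, "&lt;")
  end
end

puts SanitizeHelperSinglePass.fix_irregular_html("Foo <3<4>") # prints Foo &lt;3<4>
```

The sketch keeps the same `$` anchor as the original regexp, so it should treat the end of each line the same way; whether that is the desired behavior for tags that span several lines is left open here.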
# Test
describe SanitizeHelper do
  it "should fix irregular html" do
    # Maps each input string to the output expected from fix_irregular_html.
    replaces = {
      "Foo" => "Foo",
      "Foo <3" => "Foo &lt;3",
      "Foo 3>" => "Foo 3>",
      "Foo <3>" => "Foo <3>",
      "Foo <3<4" => "Foo &lt;3&lt;4",
      "Foo 3><4" => "Foo 3>&lt;4",
      "Foo <3><4" => "Foo <3>&lt;4",
      "Foo <3<4>" => "Foo &lt;3<4>",
      "Foo <3<<4>" => "Foo &lt;3&lt;<4>",
      "Foo 3><4>" => "Foo 3><4>",
      "Foo <3><4>" => "Foo <3><4>",
      "Foo <3<4<5" => "Foo &lt;3&lt;4&lt;5",
      "Foo 3><4<5" => "Foo 3>&lt;4&lt;5",
      "Foo <3><4<5" => "Foo <3>&lt;4&lt;5",
      "Foo <3<4><5" => "Foo &lt;3<4>&lt;5",
      "Foo 3><4><5" => "Foo 3><4>&lt;5",
      "Foo <3><4><5" => "Foo <3><4>&lt;5",
      "Foo <3<4<5<6" => "Foo &lt;3&lt;4&lt;5&lt;6",
      "Foo 3><4<5<6" => "Foo 3>&lt;4&lt;5&lt;6",
      "Foo <3><4<5<6" => "Foo <3>&lt;4&lt;5&lt;6",
      "Foo <3<4><5<6" => "Foo &lt;3<4>&lt;5&lt;6",
      "Foo 3><4><5<6" => "Foo 3><4>&lt;5&lt;6",
      "Foo <3><4><5<6" => "Foo <3><4>&lt;5&lt;6",
      "Foo <3<4<5<6<7" => "Foo &lt;3&lt;4&lt;5&lt;6&lt;7",
      "Foo <3<4<5<6<7<8" => "Foo &lt;3&lt;4&lt;5&lt;6&lt;7&lt;8",
      "Foo <3<4<5<6<7<8<9" => "Foo &lt;3&lt;4&lt;5&lt;6&lt;7&lt;8&lt;9",
    }
    replaces.each do |input, expected|
      fix_irregular_html(input).should == expected
    end
  end
end
rngtng commented Jan 26, 2011

blegat commented Oct 18, 2013

Thanks a lot!
