Comparison of Loofah against other Ruby HTML sanitization libraries


Overview of the Benchmark

The following benchmark output was generated by the code at http://github.com/flavorjones/loofah/tree/master/benchmark

These results show the performance of Loofah scrubbing methods against comparable methods from other common open-source libraries:

  • ActionView sanitize() and strip_tags()
  • Sanitize clean()
  • HTML5lib sanitize()
  • HtmlFilter filter()

HTML of various sizes is tested:

  • a large document (~98 KB)
  • a sizable fragment (~3 KB)
  • a small snippet (58 bytes)
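For readers unfamiliar with the output columns (total, single, rel), here is a minimal sketch of a head-to-head timing harness built on Ruby's standard Benchmark module. It is not the code from the benchmark directory: `head_to_head` and the two stand-in workloads are hypothetical, and the untimed warm-up pass is only assumed to play the role of the "rehearsal" phase seen in the output.

```ruby
require "benchmark"

# Hypothetical helper: times each labelled callable `iterations` times and
# reports total seconds, seconds per call, and time relative to the first entry.
def head_to_head(input, iterations, contenders)
  # Warm-up pass (assumed analogue of the "rehearsal"): run everything once
  # untimed so that one-off costs do not land on the first contender.
  contenders.each_value { |fn| fn.call(input) }

  rows = contenders.map do |label, fn|
    total = Benchmark.realtime { iterations.times { fn.call(input) } }
    { label: label, total: total, single: total / iterations }
  end

  # "rel" is each contender's total time divided by the baseline's total time.
  baseline = rows.first[:total]
  rows.each { |row| row[:rel] = row[:total] / baseline }
  rows
end

# Stand-in workloads, not real sanitizers.
contenders = {
  "upcase"  => ->(s) { s.upcase },
  "reverse" => ->(s) { s.reverse }
}

rows = head_to_head("<p>hello</p>" * 100, 1_000, contenders)
rows.each do |row|
  rel = row.equal?(rows.first) ? "-" : format("%.2fx", row[:rel])
  puts format("%-10s %8.3f (%.6f) %s", row[:label], row[:total], row[:single], rel)
end
```

Read this way, a rel value above 1.00x means the second library took longer than Loofah, and a value below 1.00x means it was faster.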

Head to Head against ActionView sanitize()

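Loofah wins by about 26% on large documents and is roughly even on fragments (about 2% ahead), but loses on small snippets, where ActionView needs only about a quarter of the time (0.27x).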

Loofah is comparatively slow on small snippets because Nokogiri uses libxml2, which incurs a constant "startup overhead" before parsing HTML of any size. ActionPack's regular expressions have no such startup overhead.
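To put a rough number on that overhead, the sketch below fits a straight line through two of the per-call timings from the HeadToHeadRailsSanitize results. The linear model (time = overhead + per_byte * size) is an assumption made for illustration, not something the benchmark measures directly, and `fit_linear` is a hypothetical helper.

```ruby
# Seconds per call ("single" column) from the HeadToHeadRailsSanitize results,
# keyed by input size in bytes.
SAMPLES = {
  "Loofah::Helpers.sanitize" => { 58 => 0.000427, 98_282 => 0.170191 },
  "ActionView sanitize"      => { 58 => 0.000117, 98_282 => 0.215252 }
}

# Assumed model (not from the source): time = overhead + per_byte * size.
# Two points are enough to solve for both unknowns.
def fit_linear(points)
  (small_size, small_time), (large_size, large_time) = points.sort
  per_byte = (large_time - small_time) / (large_size - small_size)
  { overhead: small_time - per_byte * small_size, per_byte: per_byte }
end

fits = SAMPLES.transform_values { |points| fit_linear(points) }
fits.each do |label, fit|
  puts format("%-26s overhead %+.6f s, %.3e s/byte", label, fit[:overhead], fit[:per_byte])
end
```

Under this model the fit puts Loofah's fixed cost at roughly 0.3 ms per call and ActionView's at essentially zero, while Loofah's per-byte cost is the lower of the two. That is consistent with Loofah losing on 58-byte snippets and winning on the 98 KB document.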

The win for ActionView on small snippets comes at a cost, though. From the ActionView comments:

Please note that sanitizing user-provided text [with ActionView]
does not guarantee that the resulting markup is valid (conforming
to a document type) or even well-formed.  The output may still
contain e.g. unescaped '<', '>', '&' characters and confuse
browsers.

Loofah will always generate well-formed and valid HTML with proper encoding and escaping. Something to keep in mind when choosing a sanitizing library. Just sayin'.
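The kind of problem the ActionView comment warns about can be shown with nothing but the standard library. In the sketch below, `naive_strip_tags` is a hypothetical regex-based stripper written for illustration; it is not ActionView's or HtmlFilter's actual implementation. `CGI.escapeHTML` stands in for the escaping that a parser-based serializer applies to text nodes.

```ruby
require "cgi"

# Hypothetical regex-based stripper, a stand-in for illustration only.
# It removes anything that looks like a tag and touches nothing else.
def naive_strip_tags(html)
  html.gsub(/<[^>]*>/, "")
end

input = "<b>Tom & Jerry</b> say 1 < 2"

# The tags are gone, but the bare '&' and '<' are passed through unescaped.
stripped = naive_strip_tags(input)
puts stripped

# Escaping the remaining text is what makes the result safe to embed in HTML.
escaped = CGI.escapeHTML(stripped)
puts escaped
```

The first line printed still contains a raw `&` and `<`; the second has them converted to entities.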

Head to Head against ActionView strip_tags()

Loofah wins by 82% on large documents and 92% on fragments, but loses again on small snippets.

See previous section for explanation and commentary.

Head to Head against Sanitize clean()

Loofah wins on HTML of all sizes, by between 2% (small snippets) and 179% (large documents).

Head to Head against HTML5lib sanitize()

Loofah wins on HTML of all sizes: HTML5lib takes between 3 and 14.5 times as long.

Yes. Not a typo. REXML is that slow.

Head to Head against HtmlFilter filter()

Loofah wins by a factor of two on large documents and fragments, but loses by about the same factor on small snippets.

HtmlFilter also uses regular expressions and hence cannot guarantee that the output markup is well-formed or valid.

Nokogiri version: {"warnings"=>[], "libxml"=>{"loaded"=>"2.7.5", "binding"=>"extension", "compiled"=>"2.7.5"}, "nokogiri"=>"1.4.0"}
Loofah version: "0.4.1"
---------- rehearsal ----------
(... omitted for brevity ...)
 
---------- realsies ----------
HeadToHeadRailsSanitize
Large document, 98282 bytes (x100)
                              total   single      rel
Loofah::Helpers.sanitize     17.019   (0.170191)  -
ActionView sanitize          21.525   (0.215252)  1.26x

Small fragment, 3178 bytes (x1000)
                              total   single      rel
Loofah::Helpers.sanitize      5.559   (0.005559)  -
ActionView sanitize           5.653   (0.005653)  1.02x

Text snippet, 58 bytes (x10000)
                              total   single      rel
Loofah::Helpers.sanitize      4.272   (0.000427)  -
ActionView sanitize           1.170   (0.000117)  0.27x
 
HeadToHeadRailsStripTags
Large document, 98282 bytes (x100)
                              total   single      rel
Loofah::Helpers.strip_tags    8.019   (0.080195)  -
ActionView strip_tags        14.615   (0.146151)  1.82x

Small fragment, 3178 bytes (x1000)
                              total   single      rel
Loofah::Helpers.strip_tags    2.197   (0.002197)  -
ActionView strip_tags         4.220   (0.004220)  1.92x

Text snippet, 58 bytes (x10000)
                              total   single      rel
Loofah::Helpers.strip_tags    2.070   (0.000207)  -
ActionView strip_tags         0.931   (0.000093)  0.45x
 
HeadToHeadSanitizerSanitize
Large document, 98282 bytes (x100)
                              total   single      rel
Loofah :strip                 9.919   (0.099188)  -
Sanitize.clean               27.625   (0.276255)  2.79x

Small fragment, 3178 bytes (x1000)
                              total   single      rel
Loofah :strip                 5.317   (0.005317)  -
Sanitize.clean                5.811   (0.005811)  1.09x

Text snippet, 58 bytes (x10000)
                              total   single      rel
Loofah :strip                 4.156   (0.000416)  -
Sanitize.clean                4.235   (0.000423)  1.02x
 
HeadToHeadHtml5LibSanitize
Large document, 98282 bytes (x100)
                              total   single      rel
Loofah :escape                8.643   (0.086426)  -
HTML5lib.sanitize           125.315   (1.253149)  14.50x

Small fragment, 3178 bytes (x1000)
                              total   single      rel
Loofah :escape                4.715   (0.004715)  -
HTML5lib.sanitize            36.438   (0.036438)  7.73x

Text snippet, 58 bytes (x10000)
                              total   single      rel
Loofah :escape                3.881   (0.000388)  -
HTML5lib.sanitize            11.641   (0.001164)  3.00x
 
HeadToHeadHTMLFilter
Large document, 98282 bytes (x100)
                              total   single      rel
Loofah::Helpers.sanitize     15.579   (0.155785)  -
HTMLFilter.filter            32.654   (0.326540)  2.10x

Small fragment, 3178 bytes (x1000)
                              total   single      rel
Loofah::Helpers.sanitize      5.097   (0.005097)  -
HTMLFilter.filter            12.034   (0.012034)  2.36x

Text snippet, 58 bytes (x10000)
                              total   single      rel
Loofah::Helpers.sanitize      3.822   (0.000382)  -
HTMLFilter.filter             1.876   (0.000188)  0.49x

rgrove commented

Here's a more up-to-date comparison between Loofah, Sanitize, and HTMLFilter: https://github.com/rgrove/sanitize/blob/master/COMPARISON.md#performance-comparison
