@Pistos
Created November 7, 2008 22:57
Using better-benchmark.
#!/usr/bin/env ruby
require 'rubygems'
gem 'hpricot', '>=0.6.170'
require 'open-uri'
require 'hpricot'
require 'nokogiri'
require 'better-benchmark'
uri = URI.parse( "http://railstips.org/assets/2008/8/9/timeline.xml" )
content = uri.read
hdoc = Hpricot.XML(content)
ndoc = Nokogiri.Hpricot(content)
#ndoc = Nokogiri.XML(content)
hdoc2 = Hpricot.scan(content)
puts "\nnokogiri vs. hpricot: Parsing XML"
result = Benchmark.compare_realtime(
  :iterations => 10,
  :inner_iterations => 150,
  :verbose => true
) {
  Nokogiri.Hpricot content
}.with {
  Hpricot.XML content
}
Benchmark.report_on result
puts "\nnokogiri vs. hpricot scan: Parsing XML"
result = Benchmark.compare_realtime(
  :iterations => 10,
  :inner_iterations => 600,
  :verbose => true
) {
  Nokogiri.Hpricot content
}.with {
  Hpricot.scan content
}
Benchmark.report_on result
puts "\nnokogiri vs. hpricot: Searching with XPath"
result = Benchmark.compare_realtime(
  :iterations => 10,
  :inner_iterations => 200,
  :verbose => true
) {
  info = ndoc.xpath('//status/text').first.inner_text
  url = ndoc.xpath('//user/name').first.inner_text
}.with {
  info = hdoc.search('//status/text').first.inner_text
  url = hdoc.search('//user/name').first.inner_text
}
Benchmark.report_on result
puts "\nnokogiri vs. hpricot (scanned): Searching with XPath"
result = Benchmark.compare_realtime(
  :iterations => 10,
  :inner_iterations => 200,
  :verbose => true
) {
  info = ndoc.xpath('//status/text').first.inner_text
  url = ndoc.xpath('//user/name').first.inner_text
}.with {
  info = hdoc2.search('//status/text').first.inner_text
  url = hdoc2.search('//user/name').first.inner_text
}
Benchmark.report_on result
puts "\nnokogiri vs. hpricot: Searching with CSS"
result = Benchmark.compare_realtime(
  :iterations => 10,
  :inner_iterations => 200,
  :verbose => true
) {
  info = ndoc.search('status text').first.inner_text
  url = ndoc.search('user name').first.inner_text
}.with {
  info = hdoc.search('status text').first.inner_text
  url = hdoc.search('user name').first.inner_text
}
Benchmark.report_on result
puts "\nnokogiri vs. hpricot (scanned): Searching with CSS"
result = Benchmark.compare_realtime(
  :iterations => 10,
  :inner_iterations => 200,
  :verbose => true
) {
  info = ndoc.search('status text').first.inner_text
  url = ndoc.search('user name').first.inner_text
}.with {
  info = hdoc2.search('status text').first.inner_text
  url = hdoc2.search('user name').first.inner_text
}
Benchmark.report_on result
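For readers without better-benchmark installed, here is a rough stdlib-only sketch of the timing loop the script relies on. It assumes that :iterations is the number of timed sets and :inner_iterations is how many times each block runs within one set; that reading is inferred from the option names and the ten progress dots in the output, not confirmed from the library source. The helper name compare_realtime_sketch and the two toy blocks are made up for illustration, and the statistical test that better-benchmark reports (p.value, W) is left out.

```ruby
require 'benchmark'

# Hypothetical helper, NOT better-benchmark's API: times two callables over
# `iterations` sets, running each one `inner_iterations` times per set.
def compare_realtime_sketch(iterations, inner_iterations, first, second)
  times1 = []
  times2 = []
  iterations.times do
    times1 << Benchmark.realtime { inner_iterations.times { first.call } }
    times2 << Benchmark.realtime { inner_iterations.times { second.call } }
  end
  mean = lambda { |xs| xs.inject(0.0) { |sum, x| sum + x } / xs.size }
  {
    :times1 => times1,
    :times2 => times2,
    :set1_mean => mean.call(times1),
    :set2_mean => mean.call(times2)
  }
end

# Toy workloads standing in for the parser calls in the script.
result = compare_realtime_sketch(
  3, 1000,
  lambda { "a,b,c".split(",") },
  lambda { "a,b,c".scan(/[^,]+/) }
)
printf("Set 1 mean: %.6f s\n", result[:set1_mean])
printf("Set 2 mean: %.6f s\n", result[:set2_mean])
```

Swapping the toy lambdas for `lambda { Nokogiri.Hpricot content }` and `lambda { Hpricot.XML content }` would give raw means only; judging significance still needs the test that better-benchmark runs.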
I got a segfault when I tried to run this at first, and am now getting lots of
"called on terminated object" at random times (!). Wondering what to do at this
point...
hpricot 0.6.170
nokogiri 1.0.2
ruby 1.8.7 (2008-06-20 patchlevel 22) [i686-linux]
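When several versions of a gem are installed side by side, it can help to confirm which one actually got activated before trusting a benchmark (or chasing a crash). A minimal sketch, assuming RubyGems is in use; loaded_gem_version is a hypothetical helper name, and this only reports versions, it does not diagnose the segfault.

```ruby
require 'rubygems'

# Hypothetical helper: returns the activated version of a gem as a string,
# or nil if that gem has not been activated in this process.
def loaded_gem_version(name)
  spec = Gem.loaded_specs[name]
  spec && spec.version.to_s
end

# Run this after the require lines of the benchmark script.
%w[hpricot nokogiri].each do |name|
  puts "#{name}: #{loaded_gem_version(name) || 'not loaded'}"
end
puts "ruby: #{RUBY_VERSION}"
```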
--------------------
UPDATE:
Okay, after upgrading to nokogiri 1.0.3 and then removing the old hpricot installs
(0.6 and 0.6.164), I was able to at least avoid the segfaults and other errors.
But gee, the results under Ruby 1.8.7 aren't very flattering for hpricot:
nokogiri vs. hpricot: Parsing XML
..........
Set 1 mean: 1.349 s
Set 1 std dev: 0.086
Set 2 mean: 6.680 s
Set 2 std dev: 0.136
p.value: 1.0825088224469e-05
W: 0.0
The difference (+395.1%) IS statistically significant.
nokogiri vs. hpricot scan: Parsing XML
..........
Set 1 mean: 5.426 s
Set 1 std dev: 0.020
Set 2 mean: 3.345 s
Set 2 std dev: 0.050
p.value: 1.0825088224469e-05
W: 100.0
The difference (-38.3%) IS statistically significant.
nokogiri vs. hpricot: Searching with XPath
..........
Set 1 mean: 0.079 s
Set 1 std dev: 0.014
Set 2 mean: 8.383 s
Set 2 std dev: 1.185
p.value: 1.0825088224469e-05
W: 0.0
The difference (+10473.4%) IS statistically significant.
nokogiri vs. hpricot (scanned): Searching with XPath
..........
Set 1 mean: 0.066 s
Set 1 std dev: 0.021
Set 2 mean: 2.908 s
Set 2 std dev: 0.169
p.value: 1.0825088224469e-05
W: 0.0
The difference (+4302.5%) IS statistically significant.
nokogiri vs. hpricot: Searching with CSS
..........
Set 1 mean: 0.386 s
Set 1 std dev: 0.053
Set 2 mean: 9.140 s
Set 2 std dev: 0.153
p.value: 1.0825088224469e-05
W: 0.0
The difference (+2265.3%) IS statistically significant.
nokogiri vs. hpricot (scanned): Searching with CSS
..........
Set 1 mean: 0.373 s
Set 1 std dev: 0.074
Set 2 mean: 3.060 s
Set 2 std dev: 0.296
p.value: 1.0825088224469e-05
W: 0.0
The difference (+720.4%) IS statistically significant.
Use Nokogiri's XPath selectors for the fastest searches. Its CSS-based search is slower than its XPath search, but still faster than Hpricot.
Also note that this benchmark only covers XML (not HTML).
This benchmark takes the original and uses better-benchmark instead.
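The percentage figures in the report can be cross-checked from the printed means. The sketch below assumes the difference is computed as (set 2 - set 1) / set 1; that formula is an assumption, but it agrees with the reported values to within the rounding of the three-decimal means (the tool presumably works from unrounded timings). percent_difference is a hypothetical helper name.

```ruby
# Means copied from the report output (seconds, already rounded to 3 decimals).
# Set 1 is nokogiri, set 2 is hpricot.
MEANS = [
  ['Parsing XML',                 1.349, 6.680],
  ['Parsing XML (Hpricot.scan)',  5.426, 3.345],
  ['XPath search',                0.079, 8.383],
  ['XPath search (scanned)',      0.066, 2.908],
  ['CSS search',                  0.386, 9.140],
  ['CSS search (scanned)',        0.373, 3.060]
]

# Hypothetical helper: relative difference of set 2 against set 1, in percent.
def percent_difference(set1, set2)
  (set2 - set1) / set1 * 100.0
end

MEANS.each do |label, set1, set2|
  printf("%-28s %+10.1f%%  (set 2 / set 1 = %.1fx)\n",
         label, percent_difference(set1, set2), set2 / set1)
end
```

A positive value means hpricot (set 2) took longer; the one negative row is the parse comparison against Hpricot.scan, where hpricot was the faster of the two.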