Nokogiri HTML Parsers - BenchPress Results

Nokogiri Parser Comparisons

Author: Ezekiel Templin
Date: February 06, 2011
Summary: Comparing Nokogiri's parsers

System Information

Operating System:    Mac OS X 10.6.6 (10J567)
CPU:                 Intel Core i7 2.66 GHz
Processor Count:     2
Memory:              4 GB
ruby 1.9.2p136 (2010-12-25 revision 30365) [x86_64-darwin10.6.0]

"Nokogiri XPATH" is up to 16% faster over 1,000 repetitions

Nokogiri XPATH                0.13944482803344727 secs    Fastest
Nokogiri Search (w/ XPATH)    0.14812183380126953 secs    5% Slower
Nokogiri CSS                  0.16048192977905273 secs    13% Slower
Nokogiri Search (w/ CSS)      0.16649389266967773 secs    16% Slower
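
A note on reading the "% Slower" column: the figures above do not appear to mean "takes N% longer". They are consistent with the share of time the fastest entry saves relative to each row, truncated to a whole percent. That formula is inferred from the numbers in the table, not taken from BenchPress's source, so treat it as an assumption. A minimal sketch that reproduces the column under that assumption:

```ruby
# Timings copied from the results table above (seconds per 1,000 repetitions).
timings = {
  'Nokogiri XPATH'             => 0.13944482803344727,
  'Nokogiri Search (w/ XPATH)' => 0.14812183380126953,
  'Nokogiri CSS'               => 0.16048192977905273,
  'Nokogiri Search (w/ CSS)'   => 0.16649389266967773
}

fastest = timings.values.min

# Assumed formula (inferred from the table, not from BenchPress itself):
# the share of time the fastest entry saves relative to this row,
# truncated to a whole percent.
percentages = timings.transform_values do |seconds|
  ((1 - fastest / seconds) * 100).floor
end

percentages.each { |label, pct| puts format('%-28s %d%%', label, pct) }
```

Under this reading, "Nokogiri Search (w/ CSS)" takes roughly 19% longer than the fastest entry (0.1665 / 0.1394 ≈ 1.19), even though the table lists it as "16% Slower".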

"Nokogiri XPATH - Specific" is up to 74% faster over 1,000 repetitions

Nokogiri XPATH - Specific                     0.04272198677062988  secs    Fastest
Nokogiri Search (w/ XPATH) - Specific         0.051782846450805664 secs    17% Slower
Nokogiri XPATH - Semi-Specific                0.1356217861175537   secs    68% Slower
Nokogiri XPATH - Nonspecific                  0.14063310623168945  secs    69% Slower
Nokogiri CSS - Nonspecific                    0.1483469009399414   secs    71% Slower
Nokogiri Search (w/ CSS) - Nonspecific        0.15012288093566895  secs    71% Slower
Nokogiri Search (w/ CSS) - Semi-Specific      0.15072989463806152  secs    71% Slower
Nokogiri CSS - Semi-Specific                  0.15105509757995605  secs    71% Slower
Nokogiri Search (w/ XPATH) - Semi-Specific    0.15113496780395508  secs    71% Slower
Nokogiri Search (w/ XPATH) - Nonspecific      0.1534569263458252   secs    72% Slower
Nokogiri Search (w/ CSS) - Specific           0.16422104835510254  secs    73% Slower
Nokogiri CSS - Specific                       0.1705338954925537   secs    74% Slower
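
require 'nokogiri'
require 'bench_press'

extend BenchPress

name 'Nokogiri Parser Comparisons'
author 'Ezekiel Templin'
date '2011-02-06'
summary "Comparing Nokogiri's parsers"

# Parse the document once, outside the measured blocks, so that only the
# lookups themselves are timed. File.read also avoids leaving a file
# handle open.
@doc = Nokogiri::HTML.parse(File.read('test.html'))

# CSS selector via the dedicated #css method.
measure "Nokogiri CSS" do
  @doc.css('title')
end

# XPath expression via the dedicated #xpath method.
measure "Nokogiri XPATH" do
  @doc.xpath('//title')
end

# #search accepts either syntax and dispatches to XPath or CSS itself.
measure "Nokogiri Search (w/ XPATH)" do
  @doc.search('//title')
end

measure "Nokogiri Search (w/ CSS)" do
  @doc.search('title')
end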

require 'nokogiri'
require 'bench_press'

extend BenchPress

name 'Nokogiri Parser Comparisons'
author 'Ezekiel Templin'
date '2011-02-06'
# No regex case is measured below, so the summary matches the published results.
summary "Comparing Nokogiri's parsers"

# Three levels of selector specificity, all targeting the <title> element.
@xpath_non  = "//title"
@xpath_semi = "//head/title"
@xpath_spec = "/html/head/title"
@css_non    = "title"
@css_semi   = "head title"
@css_spec   = "html head title"

# Parse once, outside the measured blocks; File.read avoids an unclosed handle.
@doc = Nokogiri::HTML.parse(File.read('test.html'))

# XPath
measure "Nokogiri XPATH - Nonspecific" do
  @doc.xpath(@xpath_non)
end

measure "Nokogiri Search (w/ XPATH) - Nonspecific" do
  @doc.search(@xpath_non)
end

measure "Nokogiri XPATH - Semi-Specific" do
  @doc.xpath(@xpath_semi)
end

measure "Nokogiri Search (w/ XPATH) - Semi-Specific" do
  @doc.search(@xpath_semi)
end

measure "Nokogiri XPATH - Specific" do
  @doc.xpath(@xpath_spec)
end

measure "Nokogiri Search (w/ XPATH) - Specific" do
  @doc.search(@xpath_spec)
end

# CSS
measure "Nokogiri CSS - Nonspecific" do
  @doc.css(@css_non)
end

measure "Nokogiri Search (w/ CSS) - Nonspecific" do
  @doc.search(@css_non)
end

measure "Nokogiri CSS - Semi-Specific" do
  @doc.css(@css_semi)
end

measure "Nokogiri Search (w/ CSS) - Semi-Specific" do
  @doc.search(@css_semi)
end

measure "Nokogiri CSS - Specific" do
  @doc.css(@css_spec)
end

measure "Nokogiri Search (w/ CSS) - Specific" do
  @doc.search(@css_spec)
end
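
For readers who cannot install BenchPress, the same "time each block over 1,000 repetitions and rank the results" idea can be sketched with Ruby's standard `Benchmark` module. This is only an illustration, not what BenchPress does internally: the `compare` helper is hypothetical, and the workloads are stand-ins operating on a small inline string so the sketch runs without Nokogiri or `test.html`.

```ruby
require 'benchmark'

# Hypothetical helper (not part of BenchPress): call each candidate `reps`
# times and return [label, seconds] pairs sorted fastest first.
def compare(candidates, reps = 1_000)
  timings = candidates.map do |label, work|
    seconds = Benchmark.realtime { reps.times { work.call } }
    [label, seconds]
  end
  timings.sort_by { |_label, seconds| seconds }
end

# Stand-in workloads (assumptions for illustration only).
html = '<html><head><title>Example</title></head><body></body></html>'
candidates = {
  'String#include?' => -> { html.include?('<title>') },
  'Regexp match'    => -> { html =~ /<title>/ }
}

results = compare(candidates)
results.each do |label, seconds|
  puts format('%-20s %.6f secs', label, seconds)
end
```

To rerun the comparisons above with this helper, the stand-in lambdas would be replaced by calls such as `-> { doc.xpath('//title') }` against a document parsed once beforehand.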

@ezkl (author) commented Feb 7, 2011

I would have published this on RubyBenchmark, but the publish command doesn't work under Ruby 1.9.2. The HTML being parsed is ~153 KB of pretty sloppy markup from a reasonably well-known auction site.