@timcharper
Created February 3, 2009 21:51
html = <<-EOF
<html>
<body>
<table>
<tr>
<td>
One
<table><tr><td>Nested Cell</td></tr></table>
</td>
</tr>
<tr>
<td>
Two
<table><tr><td>Nested Cell</td></tr></table>
</td>
</tr>
</table>
</body>
</html>
EOF
require "nokogiri"
require "hpricot"
# ---------
# -HPRICOT-
# ---------
root_table = (Hpricot(html) / "body > table")
# this works as expected
puts (root_table / "tr").length # => 4
# this also works as expected
puts (root_table / "> tr").length # => 2
# ----------
# -NOKOGIRI-
# ----------
noko_root_table = (Nokogiri::HTML.parse(html) / "body > table")
# this also works as expected
puts (noko_root_table / "tr").length # => 4
# these do NOT work as expected: both return 2, but the matches are the
# nested tables' rows, and I want only the rows from the root table
puts (noko_root_table / "table > tr").length # => 2
puts (noko_root_table / ".//table/tr").length # => 2