@vanakenm
Created August 15, 2017 12:09
<h1>Header</h1>
<h2 id="subheader">Subheader</h2>
<p>Paragraph</p>
<h2 id="subheader-2">Subheader 2</h2>
<h3 id="subsubheader-1">Subsubheader 1</h3>
<p>Subsub paragraph</p>
<p>Subsub paragraph 2</p>
<h3 id="subsubheader-2">Subsubheader 2</h3>
<p>Subsub paragraph 3</p>
<h2 id="subheader-3">Subheader 3</h2>
<p>Paragraph</p>
<pre><code>&lt;html&gt;code&lt;/html&gt;
</code></pre>
<p>Paragraph</p>
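require "nokogiri"

html = File.read("./doc.html")
html_doc = Nokogiri::HTML(html)

# Select every h1, h2 and h3 in document order.
headings = html_doc.xpath("//*[self::h1 or self::h2 or self::h3]")

headings.each do |h|
  elements_inside = []
  current_element = h.next
  # Collect siblings until the next heading. Match the full tag name (h1..h6):
  # include?("h") would also stop at tags such as <th> or <hr>.
  while current_element && !current_element.name.match?(/\Ah[1-6]\z/)
    elements_inside << current_element
    current_element = current_element.next
  end
  # Whitespace between tags shows up in this list as "text" nodes.
  puts "For header #{h.name} we found #{elements_inside.map(&:name)}"
end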