@henrikj242
Created February 5, 2019 10:59
require 'nokogiri'
unordered_list = <<~END
<ul>
  <li>
    First item
  </li>
  <li>
    Second item
    <ul>
      <li>
        Second item's first item
        <ul>
          <li>
            Second item's first item's first item
          </li>
          <li>
            Second item's first item's second item
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>
END
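# Renders one <li> as a bullet line, followed by its nested <ul> (if any)
# one indent level deeper.
def parse_html_list_item(li, indents)
  # Use the first non-blank direct text node as the label; fall back to ''
  # so an <li> without direct text no longer raises NoMethodError on nil.
  label = li.children.find { |c| c.text? && !c.text.strip.empty? }&.text.to_s.strip
  line = "#{' ' * indents}• #{label}\n"
  ul = li.at_css('ul')
  ul ? line + parse_html_list(ul, indents + 1) : line
end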
# Renders every direct <li> child of the given <ul> at the given indent level.
def parse_html_list(ul, indents)
  ul.children.select { |l| l.name == 'li' }
    .map { |li| parse_html_list_item(li, indents) }
    .join
end
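# Parses an HTML fragment and renders its first <ul> as indented bullet text.
# Returns an empty string (rather than nil) when there is no <ul>.
def parse_html(content)
  ul = Nokogiri::HTML(content).at_css('ul')
  ul ? parse_html_list(ul, 0) : ''
end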
puts parse_html(unordered_list)
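
The recursion in the script can be tried without Nokogiri. The following is a minimal sketch, not part of the original gist: `ITEMS` and `render_items` are hypothetical names, and the `[label, children]` pair is an assumed stand-in for a parsed `<li>` and its nested `<ul>`. It mirrors the one-space-per-level indentation used above.

```ruby
# Hypothetical stand-in for the parsed list: each item is [label, children].
ITEMS = [
  ['First item', []],
  ['Second item', [
    ["Second item's first item", [
      ["Second item's first item's first item", []],
      ["Second item's first item's second item", []]
    ]]
  ]]
].freeze

# Render each item at the current depth, then recurse one level deeper
# into its children. An empty children array contributes an empty string.
def render_items(items, indents = 0)
  items.map do |label, children|
    "#{' ' * indents}• #{label}\n" + render_items(children, indents + 1)
  end.join
end

puts render_items(ITEMS)
```

Each nesting level adds one leading space before the bullet, so the deepest items in this sample are prefixed with two spaces.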