@koduki
Created November 5, 2017 16:34
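require 'nokogiri'

# Recursively convert a node's children into nested
# [name, child_count, children] triples.
def f(node)
  node.children.map do |x|
    [x.name, x.children.size, f(x)]
  end
end

# Render the nested triples as text: one node per line ("name, child_count"),
# indented with one tab per nesting level. Every line is preceded by "\n".
def nodes2text(nodes, level)
  nodes.map { |n|
    "\n#{"\t" * level}#{n[0]}, #{n[1]}" + nodes2text(n[2], level + 1)
  }.join
end

# Parse the input file; the block form closes the file handle afterwards.
html = File.open('hoge02.htm') { |io| Nokogiri::HTML(io) }
nodes = f(html)
text = nodes2text(nodes, 0)
File.open('hoge2.txt', 'w') { |out| out.puts text }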