@hanzochang
Created October 24, 2018 04:18
Parse HTML data, add <a name> anchors to its headings, and generate a table of contents
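```ruby
require 'nokogiri'

# Add <a name> anchors to the h3 and h4 headings in the body content.
# Assumes the enclosing class exposes `content` as an HTML string.
#
# @return [String] HTML with each h3/h4 heading's text wrapped in a named anchor
def anchored_content
  doc = Nokogiri::HTML.parse(content)
  # add_anchors modifies the document in place.
  add_anchors(doc, 'h3')
  add_anchors(doc, 'h4')
  # Serializes every top-level node of the parsed document.
  doc.children.to_html
end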
# Build the table of contents.
#
# @return [Array<Hash>] { tag: 'h3', label: node.text, path: "##{node.name}-#{h3_count}", children: [...] }
def table_of_contents
  toc = []
  # TODO: revise, make this grouping more abstract
  heading_list.each do |heading|
    if heading[:tag] == 'h3'
      toc << heading
    elsif toc.any?
      # An h4 that appears before the first h3 has no parent entry, so skip it.
      toc.last[:children] << heading
    end
  end
  toc
end
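private

# Replace the contents of each `tagname` element with an <a name> anchor
# that wraps the element's text (inline markup inside the heading is dropped).
#
# @param [Nokogiri::HTML::Document] doc parsed Nokogiri document
# @param [String] tagname tag name, e.g. 'h3'
# @return [Nokogiri::HTML::Document] the same document, modified in place
def add_anchors(doc, tagname)
  doc.css(tagname).each_with_index do |node, index|
    anchor_id = "#{tagname}-#{index + 1}"
    a = doc.create_element('a', node.text)
    a['name'] = anchor_id
    a['id'] = anchor_id
    node.children = a
  end
  doc
end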
# Parse content and return its h3 and h4 headings as an Array of hashes.
# The paths use the same numbering as the anchors set by add_anchors.
#
# @return [Array<Hash>]
def heading_list
  doc = Nokogiri::HTML.parse(content)
  h3_count = 0
  h4_count = 0
  doc.xpath('//h3|//h4').map do |node|
    if node.name == 'h3'
      h3_count += 1
      { tag: 'h3', label: node.text, path: "##{node.name}-#{h3_count}", children: [] }
    else
      h4_count += 1
      { tag: 'h4', label: node.text, path: "##{node.name}-#{h4_count}" }
    end
  end
end
```
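
The grouping step in `table_of_contents` can be tried without Nokogiri. The following is a minimal sketch in plain Ruby, assuming a flat list shaped like the hashes `heading_list` builds. `SAMPLE_HEADINGS` and `group_headings` are hypothetical names used only for illustration, and dropping an h4 that has no preceding h3 is one possible policy, not necessarily the author's intent.

```ruby
# Hypothetical sample data shaped like the hashes heading_list builds.
SAMPLE_HEADINGS = [
  { tag: 'h4', label: 'Orphan',  path: '#h4-1' },
  { tag: 'h3', label: 'Setup',   path: '#h3-1', children: [] },
  { tag: 'h4', label: 'Install', path: '#h4-2' },
  { tag: 'h4', label: 'Config',  path: '#h4-3' },
  { tag: 'h3', label: 'Usage',   path: '#h3-2', children: [] }
].freeze

# Nest each h4 under the most recent h3.
# An h4 with no preceding h3 is dropped (an assumption of this sketch).
def group_headings(headings)
  headings.each_with_object([]) do |heading, toc|
    if heading[:tag] == 'h3'
      # Copy the hash with a fresh children array so the input is not mutated.
      toc << heading.merge(children: [])
    elsif toc.any?
      toc.last[:children] << heading
    end
  end
end

toc = group_headings(SAMPLE_HEADINGS)
toc.each do |entry|
  puts "#{entry[:label]} (#{entry[:path]})"
  entry[:children].each { |child| puts "  #{child[:label]} (#{child[:path]})" }
end
```

Building the tree in a separate, side-effect-free method keeps the grouping rule easy to test apart from HTML parsing.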