@vzvu3k6k
Created March 14, 2013 16:51
Split OPML with Ruby, nokogiri
# -*- coding: utf-8 -*-
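# Livedoor Reader seems to accept only about 260 feeds per OPML import,
# so split the file into chunks of 250 feeds each.
# Writes 0.xml, 1.xml, ... to the current directory.
require 'nokogiri'

abort "usage: ruby #{$0} FILE.opml" if ARGV.empty?

xml = Nokogiri::XML.parse(File.read(ARGV[0]))
# Feed entries are the <outline> elements that carry a type attribute;
# folder outlines normally have none.
feeds = xml.search("//outline[@type]")
count = 0

# Drop duplicate feeds by htmlUrl (feeds lacking htmlUrl all share the nil key,
# so only the first of them is kept).
feeds = feeds.to_a.uniq { |feed| feed["htmlUrl"] }

feeds.each_slice(250) do |chunk|
  # Start each output file from an empty OPML skeleton.
  sliced_xml = Nokogiri::XML.parse(<<~XML)
    <?xml version="1.0" encoding="UTF-8"?>
    <opml version="1.0">
    <head>
    <title />
    </head>
    <body />
    </opml>
  XML

  # Group the chunk's feeds by parent folder title; nil marks top-level feeds.
  classifieds = chunk.group_by do |feed|
    if feed.parent.name == "body"
      nil
    else
      feed.parent["title"]
    end
  end

  sliced_xml.at_xpath("//title").content = count.to_s
  body = sliced_xml.at_xpath("//body")

  classifieds.each do |tag, members|
    if tag.nil?
      place = body
    else
      # Recreate the folder as an <outline> under <body>.
      place = Nokogiri::XML::Node.new("outline", sliced_xml)
      place["text"] = place["title"] = tag
      body << place
    end
    # Append copies so the source document is left untouched.
    members.each do |feed|
      place << feed.dup
    end
  end

  File.write("#{count}.xml", sliced_xml.to_xml(:indent => 5, :encoding => 'UTF-8'))
  count += 1
end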
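
The script above does three things to the feed list before writing any XML: it removes duplicates by `htmlUrl`, cuts the list into fixed-size chunks, and groups each chunk by parent folder. The sketch below shows only that pipeline on plain Ruby objects, without nokogiri. `Feed` and `split_feeds` are hypothetical names introduced for illustration and are not part of the gist, and the chunk size of 3 is chosen only to keep the sample small.

```ruby
# Hypothetical stand-in for an <outline> element: the feed's htmlUrl and the
# title of its parent folder (nil when the feed sits directly under <body>).
Feed = Struct.new(:html_url, :folder)

# Deduplicate by htmlUrl, cut into chunks, then group each chunk by folder.
def split_feeds(feeds, chunk_size)
  feeds
    .uniq { |feed| feed.html_url }              # keep the first feed per htmlUrl
    .each_slice(chunk_size)                     # consecutive chunks of at most chunk_size
    .map { |chunk| chunk.group_by(&:folder) }   # folder title => feeds in that chunk
end

feeds = [
  Feed.new("http://a.example/", "News"),
  Feed.new("http://b.example/", nil),
  Feed.new("http://a.example/", "Tech"),  # duplicate htmlUrl, dropped by uniq
  Feed.new("http://c.example/", "News"),
  Feed.new("http://d.example/", "Tech"),
]

chunks = split_feeds(feeds, 3)

chunks.each_with_index do |groups, index|
  groups.each do |folder, members|
    urls = members.map(&:html_url).join(", ")
    puts "#{index}.xml #{folder || '(top level)'}: #{urls}"
  end
end
```

Because slicing happens before grouping, a folder with many feeds can end up spread over several output files, each of which gets its own copy of the folder outline.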