#!/usr/bin/ruby
require 'rubygems'
require 'nokogiri'
require 'date' # DateTime is used below to parse each post's pubDate
puts 'parsing xml file'
# Read the WordPress export into memory so no file handle is left open.
parsed = Nokogiri::XML(File.read("./wordpress.2010-10-06.xml"))
puts 'pulling titles'
# Collect the text of every <title> under an <item>, in document order.
title = parsed.xpath('//item/title').map(&:text)
puts 'pulling dates'
# Collect the publication date string of every item.
date = parsed.xpath('//item/pubDate').map(&:text)
puts 'pulling content'
# Collect the post body; the content: prefix must be declared on the export's root element.
content = parsed.xpath('//item/content:encoded').map(&:text)
puts 'pulling name'
# Collect the post slug, used later as part of the output filename.
name = parsed.xpath('//item/wp:post_name').map(&:text)
puts 'muxing arrays'
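# Every post needs a title, date, body and slug; stop here if the lists do not line up,
# because the loop below depends on posts being set.
if title.length == date.length && date.length == content.length && content.length == name.length
  posts = [title, date, content, name]
else
  abort 'length broken!'
end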
puts 'printing'
# Create the output directory on first run so File.open does not fail.
Dir.mkdir('articles') unless Dir.exist?('articles')
title.length.times do |i|
  published = DateTime.parse(posts[1][i])
  filename = "articles/" + published.strftime("%Y-%m-%d") + "-" + posts[3][i] + ".txt"
  # puts "filename: " + filename
  # The block form of File.open closes each file once it has been written.
  File.open(filename, "w") do |file|
    file.puts "title: " + posts[0][i]
    file.puts "date: " + published.strftime("%Y/%m/%d")
    file.puts "author: weatheredwatcher"
    file.puts "\n"
    file.puts posts[2][i]
  end
end