@zanshin
Forked from melwin/wp-xml-import.rb
Created July 30, 2011 23:15
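require 'fileutils'
require 'date'
require 'yaml'
require 'rexml/document'
include REXML

# Categories to keep as-is; anything else is filed under "life".
KNOWN_CATEGORIES = [
  "diversions", "elsewhere", "family", "health", "life", "links",
  "meme", "nerdliness", "photography", "random", "relationships",
  "social issues"
]

# Usage: ruby wp-xml-import.rb <wordpress-export.xml>
doc = Document.new(File.read(ARGV[0]))
FileUtils.mkdir_p "_posts"

# Convert only published posts (skips drafts, pages, and attachments).
doc.elements.each("rss/channel/item[wp:status = 'publish' and wp:post_type = 'post']") do |e|
  post = e.elements
  slug = post['wp:post_name'].text
  date = DateTime.parse(post['wp:post_date'].text)
  # Zero-padded YYYY-MM-DD, used for both the filename and the front matter.
  date_string = date.strftime('%Y-%m-%d')
  name = "#{date_string}-#{slug}.markdown"

  # clean up years of category randomness in one fell swoop.
  # Only the first <category> element is read; a post without one gets "life".
  category = post['category'] && post['category'].text
  category_string = KNOWN_CATEGORIES.include?(category) ? category : "life"

  # .to_s guards against posts with an empty body (text returns nil).
  content = post['content:encoded'].text.to_s
  # Inline <code> spans become Markdown backtick spans.
  content = content.gsub(/<code>(.*?)<\/code>/, '`\1`')
  # <pre lang="..."> blocks are removed outright, contents included.
  content = content.gsub(/<pre lang="([^"]*)">(.*?)<\/pre>/m, '')
  # <h1> through <h3> become #, ##, ### headings.
  (1..3).each do |i|
    content = content.gsub(/<h#{i}>([^<]*)<\/h#{i}>/, ('#' * i) + ' \1')
  end

  puts "Converting: #{name}"

  # YAML front matter; nil or empty values are dropped before serializing.
  data = {
    'layout' => 'post',
    'title' => post['title'].text,
    'date' => date_string,
    'comments' => false, # boolean, so YAML emits false rather than the string 'false'
    'categories' => category_string,
    'author' => 'Mark'
  }.delete_if { |k, v| v.nil? || v == '' }.to_yaml

  # to_yaml already emits the opening "---"; the second "---" closes the front matter.
  File.open("_posts/#{name}", "w") do |f|
    f.puts data
    f.puts "---"
    f.puts content
  end
end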
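
The HTML-to-Markdown substitutions in the script can be tried on their own before running a full import. The sketch below wraps the same three `gsub` calls in a helper; the name `html_to_markdown` and the sample string are made up for illustration and are not part of the gist.

```ruby
# Hypothetical helper (not part of the gist): applies the same three
# substitutions the script performs on each post body.
def html_to_markdown(content)
  # Inline <code> spans become backtick spans.
  content = content.gsub(/<code>(.*?)<\/code>/, '`\1`')
  # <pre lang="..."> blocks are removed outright, contents included.
  content = content.gsub(/<pre lang="([^"]*)">(.*?)<\/pre>/m, '')
  # <h1> through <h3> become #, ##, ### headings.
  (1..3).each do |i|
    content = content.gsub(/<h#{i}>([^<]*)<\/h#{i}>/, ('#' * i) + ' \1')
  end
  content
end

# Made-up sample input, not taken from a real export.
sample = "<h2>Setup</h2>\nRun <code>gem install jekyll</code> first.\n<pre lang=\"ruby\">puts 1</pre>"
puts html_to_markdown(sample)
```

Note that the `<pre lang="...">` rule replaces the whole block with an empty string, so any code listing inside it does not survive the conversion. If those listings matter, that replacement string is the place to change.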
zanshin commented Aug 1, 2011

Modified gist to suit my needs. Added capture of WordPress category, and cleaned up categories into 12 tidy little piles, with "life" being the default category. Also capturing original posting date (not time) for posterity.
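
A minimal sketch of the two changes described in this comment, pulled out as standalone helpers. The names `TIDY_CATEGORIES`, `normalize_category`, and `posting_date` are hypothetical; the category list is copied from the script. The sample date string assumes the `YYYY-MM-DD HH:MM:SS` form that `wp:post_date` normally uses, and `strftime` is used here to zero-pad the result.

```ruby
require 'date'

# The twelve categories the script keeps (names of the constant and the
# helpers below are illustrative, not from the gist).
TIDY_CATEGORIES = [
  "diversions", "elsewhere", "family", "health", "life", "links",
  "meme", "nerdliness", "photography", "random", "relationships",
  "social issues"
].freeze

# Keep a known category; everything else (including nil) becomes "life".
def normalize_category(raw)
  TIDY_CATEGORIES.include?(raw) ? raw : "life"
end

# Keep the original posting date and drop the time of day.
def posting_date(wp_post_date)
  DateTime.parse(wp_post_date).strftime('%Y-%m-%d')
end

puts normalize_category("photography")
puts normalize_category("Uncategorized")
puts posting_date("2011-07-30 23:15:00")
```

A lookup list with a single default keeps the mapping easy to audit: adding a thirteenth pile is a one-line change, and any category not on the list still lands somewhere predictable.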
