@maektwain
Created January 12, 2019 11:47
Tumblr RB
# Put this file inside script/import_scripts.
#
# How to run:
# Export your blog from Tumblr; only the posts.xml file from the export is needed.
#   bundle exec ruby script/import_scripts/tumblr.rb script/import_scripts/posts.xml
#
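# Expected input: a rough sketch inferred from the XPath queries and attribute
# lookups in read_file, not from Tumblr's export documentation. The attribute
# values below are hypothetical placeholders.
#
#   <post id="123" url="https://example.tumblr.com/post/123" date="..."
#         slug="my-post" tumblelog="example">
#     <regular-title>My post</regular-title>
#     <regular-body>Post body</regular-body>
#   </post>
#
# Every <post> element in the file is read. A post without a <regular-body>
# child yields an empty body, because Nokogiri returns "" for the text of an
# empty node set.
#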
require "nokogiri"
require File.expand_path(File.dirname(__FILE__) + "/base.rb")

class ImportScripts::Tumblr < ImportScripts::Base
  def initialize
    super
    @tumblr_file = ARGV[0]
    raise ArgumentError.new('Tumblr file argument missing. Provide full path to tumblr xml file.') if @tumblr_file.blank?
    # Not sure if I want the latest activity.
  end

  # Entry point called by ImportScripts::Base#perform.
  def execute
    check_file_exist
    create_tumblr
    parse_file
  end

  private

  def check_file_exist
    raise ArgumentError.new("File does not exist: #{@tumblr_file}") unless File.exist?(@tumblr_file)
  end

  def parse_file
    puts "Parsing file..."
    read_file
  end
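
  def create_tumblr
    puts "Creating category"
    # Check for the same name we are about to create, so reruns skip creation.
    return if Category.find_by_name('Tumblr Blog')

    create_category({
      name: 'Tumblr Blog',
      user_id: Discourse::SYSTEM_USER_ID,
      description: "Articles from the Tumblr blog"
    }, nil)
  end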

  def read_file
    puts "Reading file..."
    doc = Nokogiri::XML(File.read(@tumblr_file))
    posts = []

    doc.xpath("//post").each do |node|
      title = node.xpath('regular-title').text
      slug = node.attr('slug')

      posts.push(
        id: node.attr('id'),
        user_id: Discourse::SYSTEM_USER_ID,
        raw: node.xpath('regular-body').text,
        created_at: node.attr('date'),
        # Prefer the post's own title; fall back to the slug when it is empty.
        title: title.presence || slug
      )
    end

    import_posts(posts)
  end

  def import_posts(posts)
    puts "Importing posts"
    create_posts(posts) do |post|
      puts "Creating post"
      {
        id: post[:id],
        user_id: Discourse::SYSTEM_USER_ID,
        category: 'Tumblr Blog',
        raw: post[:raw],
        # read_file stores the date under :created_at, not :post_date.
        created_at: post[:created_at],
        title: post[:title]
      }
    end
  end
end

ImportScripts::Tumblr.new.perform