# Gist by @posaunehm, last active January 2, 2016 09:38
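# Scrape a blog article series and convert it to EPUB/MOBI.
# Requires the nokogiri gem plus the pandoc and ebook-convert (Calibre) CLIs.
require 'nokogiri'
require 'open-uri'
require 'fileutils'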
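# Fetch the index page, which lists every article in the series.
html = Nokogiri::HTML(URI.open("http://peta.okechan.net/blog/llvm%E3%81%AB%E3%82%88%E3%82%8B%E3%83%97%E3%83%AD%E3%82%B0%E3%83%A9%E3%83%9F%E3%83%B3%E3%82%B0%E8%A8%80%E8%AA%9E%E3%81%AE%E5%AE%9F%E8%A3%85"))
# Collect the link to each article from the index.
url_list = html.css("article li a").collect { |a| a["href"] }
# Download each article and keep only its <article> element.
article_list = url_list.collect { |url|
  article_doc = Nokogiri::HTML(URI.open(url))
  article_doc.css("article").first
}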
# Download every image locally and point each <img> at the local copy.
article_list.each { |article_doc|
  article_doc.css("img").each { |ele|
    path = ele["src"]
    file_name = File.basename(path)
    dir = "img/"
    filepath = dir + file_name
    FileUtils.mkdir_p(dir) # no-op if the directory already exists
    File.open(filepath, 'wb') do |output|
      URI.open(path) do |data|
        output.write(data.read)
      end
    end
    ele["src"] = filepath
  }
}
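
# A minimal sketch, not part of the original script: the image loop assumes
# every img src is already an absolute URL. `absolute_url` is a hypothetical
# helper showing how a relative src could be resolved against the URL of the
# page it came from; wiring it in would also require keeping each article's
# URL alongside its parsed document.
require 'uri'

def absolute_url(base_url, src)
  # URI.join returns src unchanged when it is already absolute,
  # otherwise it resolves src relative to base_url.
  URI.join(base_url, src).to_s
end

# Example: absolute_url("http://example.com/blog/post", "img/a.png")
# resolves to "http://example.com/blog/img/a.png".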
# Concatenate all articles into a single HTML file for pandoc.
File.open("temp.html", "w") { |f|
  article_list.each { |article|
    f.puts(article)
  }
}
# Write a pandoc title block ("% title" / "% author") to a separate file.
title = html.title
author = "@peta_okechan"
File.open("temp_title.txt", "w") { |f|
  f.puts("% #{title}")
  f.puts("% #{author}")
}
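# Convert HTML -> Markdown -> EPUB -> MOBI. Passing arguments as a list
# bypasses the shell, so quotes in the title cannot break the command.
system("pandoc", "temp.html", "-o", "#{title}.md")
system("pandoc", "temp_title.txt", "#{title}.md", "-o", "#{title}.epub")
system("ebook-convert", "#{title}.epub", "#{title}.mobi")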