@hackugyo
Last active May 27, 2016 10:43
Get the title from an ISBN (10 or 13). The ISBN can be extracted with something like `sed -e 's/^.*:\([0-9A-Z\-]*\):.*$/\1/g' `.
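The sed expression in the description captures the run of digits, capital letters, and hyphens that sits between two colons. A rough Ruby equivalent might look like the sketch below. The helper name `extract_isbn` and the sample line are assumptions for illustration, not part of the gist; the sample line only mimics the `isbn:...:detail:small` shape that the script prints.

```ruby
# Hypothetical helper (not part of the gist): pull the value between two
# colons out of a line, mirroring the sed pattern in the description.
def extract_isbn(line)
  # Same character class as the sed expression: digits, A-Z, and hyphens.
  # Returns nil when the line has no such colon-delimited field.
  line.chomp[/\A.*:([0-9A-Z\-]*):.*\z/, 1]
end

# Assumed sample input, shaped like one output line of the script.
sample = "SF\tisbn:9784488754013:detail:small\tSome Title, 900 yen."
puts extract_isbn(sample)
```

As with the sed version, a colon inside the title would change which field gets captured, so this is only suitable for lines in the expected shape.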
vendor/bundle
.bundle/config
# A sample Gemfile
source "https://rubygems.org"
gem 'rakuten_web_service'
gem 'lisbn'
gem 'open_uri_redirections'
GEM
  remote: https://rubygems.org/
  specs:
    faraday (0.9.2)
      multipart-post (>= 1.2, < 3)
    faraday_middleware (0.10.0)
      faraday (>= 0.7.4, < 0.10)
    lisbn (0.2.2)
      nokogiri
      nori (~> 2.0)
    mini_portile2 (2.0.0)
    multipart-post (2.0.0)
    nokogiri (1.6.7.2)
      mini_portile2 (~> 2.0.0.rc2)
    nori (2.6.0)
    open_uri_redirections (0.2.1)
    rakuten_web_service (0.6.3)
      faraday (~> 0.9.0)
      faraday_middleware

PLATFORMS
  ruby

DEPENDENCIES
  lisbn
  open_uri_redirections
  rakuten_web_service

BUNDLED WITH
   1.11.2
# coding: utf-8
require 'rakuten_web_service'
require 'lisbn'
require 'open-uri'
require 'nokogiri'
require 'open_uri_redirections'
# Look up a book on Rakuten Books by ISBN and return a hash describing the first hit.
def get_detail(isbn)
  items = RakutenWebService::Books::Book.search(:isbn => isbn) # This returns an Enumerable object
  items.first(10).each do |item|
    # Fall back to the raw genre id if the genre lookup below fails.
    genre_name = item.books_genre_id
    begin
      genre_id = item.books_genre_id.split('/')[0] # some items contain a '/'-separated value; use the first part
      genre = RakutenWebService::Books::Genre.search(:booksGenreId => genre_id).first(1)[0]
      genre_level = genre['genreLevel'].to_i
      # For genres deeper than level 2, report the parent genre's name instead.
      genre_name = genre_level > 2 ? genre['parents'][0]['booksGenreName'] : genre.name
    rescue RakutenWebService::NotFound
      STDERR.puts "Genre Id #{item.books_genre_id} is not found."
    rescue RakutenWebService::WrongParameter
      STDERR.puts "Genre Id #{item.books_genre_id} is wrong."
    end
    # Return on the first hit; later hits are never visited.
    return {genre_name: genre_name, item: item, to_s: "#{genre_name}\tisbn:#{item.isbn}:detail:small\t#{item.title}, #{item.item_price} yen."}
  end
  nil # the search returned no items
end
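# Build an output line for a value that Rakuten Books could not resolve.
def get_unknown_isbn_detail(isbn_10_or_13)
  isbn10 = Lisbn.new(isbn_10_or_13).isbn10
  isbn_or_asin = "isbn"
  if isbn10.nil?
    # Lisbn could not produce an ISBN-10, so treat the value as an ASIN.
    isbn10 = isbn_10_or_13
    isbn_or_asin = "asin"
  end
  url = "https://www.amazon.co.jp/dp/#{isbn10}"
  # Prefer the page title; fall back to the URL when the page cannot be fetched.
  title = get_title_from(url)
  "unknown\t#{isbn_or_asin}:#{isbn_10_or_13}:detail:small\t#{title || url}"
end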
# Fetch a page, following redirects, and return its <title>, or nil on failure.
def get_title_from(url)
  begin
    res = OpenURI.open_uri(url, :allow_redirections => :all)
  rescue OpenURI::HTTPError => e
    STDERR.puts "cannot reach #{url}. #{e}"
    return nil
  end
  code, _message = res.status # res.status => ["200", "OK"]
  if code == '200'
    doc = Nokogiri::HTML.parse(res.read)
    doc.title
  else
    STDERR.puts "cannot reach #{url}. code: #{code}"
    nil
  end
end
RakutenWebService.configuration do |c|
  c.application_id = 'YOUR_APPLICATION_ID'
  c.affiliate_id = 'YOUR_AFFILIATE_ID'
end
isbns = ARGV
isbns
  .map { |isbn|
    # Strip hyphens and normalize valid ISBNs to ISBN-13; anything else (e.g. an ASIN) passes through.
    isbn = isbn.chomp.gsub('-', '')
    isbn_obj = Lisbn.new(isbn)
    isbn_obj.valid? ? isbn_obj.isbn13 : isbn
  }
  .map { |isbn|
    result = nil
    begin
      result = get_detail(isbn)
    rescue RakutenWebService::WrongParameter
      # Leave result as nil so the Amazon fallback below handles this value.
    end
    [result, isbn]
  }
  .sort_by { |result, _isbn|
    # Entries without a Rakuten hit have no genre id, so they sort first.
    result.is_a?(Hash) ? result[:item].books_genre_id.to_s : ""
  }
  .each { |result, isbn|
    if result.is_a?(Hash)
      puts result[:to_s]
    else
      puts get_unknown_isbn_detail(isbn)
      sleep 1
    end
  }
978-4488754013
4488722016
978-4122012714
@hackugyo
Author

$ cat sample_isbn.txt | xargs -I {} bundle exec ruby main.rb {}
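
Before querying Rakuten, main.rb strips hyphens and converts valid ISBNs to ISBN-13 through Lisbn. To make that step concrete, here is a dependency-free sketch of the conversion. The helper name `to_isbn13` is an assumption for illustration and is not part of the gist. Unlike Lisbn, it does not verify the incoming check digit; it only recomputes the ISBN-13 one.

```ruby
# Hypothetical helper (not part of the gist): normalize an ISBN string to
# ISBN-13 without external gems. Incoming check digits are NOT validated.
def to_isbn13(raw)
  digits = raw.strip.delete('-')
  return digits if digits =~ /\A\d{13}\z/       # already 13 digits
  return nil unless digits =~ /\A\d{9}[\dX]\z/  # not shaped like an ISBN-10

  body = "978" + digits[0, 9]                   # drop the ISBN-10 check digit
  # ISBN-13 check digit: weights alternate 1, 3 across the first 12 digits.
  sum = body.chars.each_with_index.sum { |ch, i| ch.to_i * (i.even? ? 1 : 3) }
  body + ((10 - sum % 10) % 10).to_s
end

# The three values from sample_isbn.txt.
%w[978-4488754013 4488722016 978-4122012714].each do |isbn|
  puts to_isbn13(isbn)
end
```

Values that are not ISBN-shaped return nil here, whereas the script passes them through unchanged and later treats them as ASINs.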
