@eggplants
Last active April 28, 2019 17:43
Retrieve a list (work_id, title, number of episodes) from d-anime store (https://anime.dmkt-sp.jp/animestore/)
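require 'open-uri'
require 'htmlentities'
require 'benchmark'
require 'csv'

result = Benchmark.realtime do
  coder = HTMLEntities.new
  data = []
  (10000..25000).each do |i|
    # Progress display: print the current ID, then move the cursor up one line
    # so the next ID overwrites it.
    puts "now:#{i}"
    print "\e[1A"
    $stdout.flush
    begin
      url = "https://anime.dmkt-sp.jp/animestore/ci_pc?workId=#{i}"
      begin
        # URI.open is required on Ruby 3.0+, where Kernel#open no longer accepts URLs.
        html = URI.open(url).read
      rescue OpenURI::HTTPError
        next # skip IDs that return an HTTP error
      end
      # String#[] with a capture index returns the captured text, or nil if nothing matches.
      raw_title = html[/<meta property=og:title content="(.*?)"/, 1]
      if raw_title.nil?
        data << [i, "PC非対応作品", 0] # placeholder title: "work not available on PC"
        next
      end
      title = coder.decode(raw_title)
      # An episode count appears in the title as "全N話" ("N episodes in total").
      m = title.match(/全([0-9]+)話/)
      if m
        # Remove the matched text; String#delete would strip individual characters instead.
        title = title.sub(m[0], "")
        episode = m[1].to_i
      else
        episode = 0
      end
      data << [i, title, episode]
    rescue SocketError, Errno::ENOENT, Errno::ENETUNREACH
      # Transient network failure: wait up to 9 seconds, then retry the same ID.
      sleep(rand(10))
      retry
    rescue Interrupt
      # Ctrl-C: stop crawling but still write what has been collected so far.
      puts "Interrupt!"
      break
    end
  end
  # Export a CSV file; the csv library quotes titles that contain commas.
  CSV.open("d_data#{Time.now.strftime('%Y%m%d_%H%M%S')}.csv", "w") do |csv|
    csv << %w[id title episode]
    data.each { |row| csv << row }
  end
end
puts result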
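
The title handling in the script above can be tried without touching the network. What follows is only a minimal sketch, assuming the episode count appears in the title as 全N話, as the script's regex suggests. The helper name `split_episode_count` and the sample title are hypothetical and not part of the gist; real og:title values on the site may look different.

```ruby
require 'csv'

# Hypothetical helper (not part of the gist): split an episode label such as
# "全12話" out of a title and return [title_without_label, episode_count].
# Titles without a label are returned unchanged with a count of 0.
def split_episode_count(title)
  m = title.match(/全([0-9]+)話/)
  return [title, 0] unless m

  [title.sub(m[0], ""), m[1].to_i]
end

# Sample title made up for illustration only.
title, episodes = split_episode_count("Sample Show 全12話")

# CSV.generate_line quotes fields that contain commas, so no manual escaping is needed.
puts CSV.generate_line([10000, title.strip, episodes])
```

Using `String#sub` with the matched text removes only the label itself, and `CSV.generate_line` keeps a title such as `A, B` in a single quoted column.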