@iurikura
Last active January 9, 2017 08:34
# Extracts Clips, based on sample5.rb. Clips live under the /username/ directory, so they are easier to extract than watched movies.
require 'nokogiri'
require 'anemone'
opts = {
  depth_limit: 2
}
URL = "https://filmarks.com/users/hogehoge" # Replace hogehoge with your username
Anemone.crawl(URL, opts) do |anemone|
  # Follow only links whose URL contains "clips".
  anemone.focus_crawl do |page|
    page.links.keep_if { |link| link.to_s.match(/clips/) }
  end
  anemone.on_every_page do |page|
    doc = Nokogiri::HTML.parse(page.body)
    # Absolute XPaths for each clip's title and score; they depend on the current page layout.
    clipscores = doc.xpath('//html/body/div[3]/div[3]/div[1]/div/h3/a/text()|//html/body/div[3]/div[3]/div[1]/div/div/div[3]/a/span/text()')
    clipscores.each do |clipscore|
      p clipscore.text
    end
  end
end
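
The `focus_crawl` filter above keeps any link whose URL contains "clips" anywhere, which could also match unrelated pages. A minimal sketch of a stricter check follows. It assumes the clips pages live at `/users/<username>/clips`, which the script does not confirm, and `clips_link?` is a hypothetical helper, not part of Anemone. It uses only the standard library and sample URLs, so it can be tried without crawling.

```ruby
require 'uri'

# Hypothetical helper (not part of Anemone): true when the link's path is
# the user's clips page or a sub-path of it.
# Assumption: clips live at /users/<username>/clips.
def clips_link?(link, username)
  path = URI(link.to_s).path
  path.match?(%r{\A/users/#{Regexp.escape(username)}/clips(?:/|\z)})
end

# Sample links for illustration only; they are not taken from a real crawl.
sample_links = [
  "https://filmarks.com/users/hogehoge/clips",
  "https://filmarks.com/users/hogehoge/clips?page=2",
  "https://filmarks.com/users/hogehoge/marks",
  "https://filmarks.com/movies/clips-of-something"
].map { |s| URI(s) }

kept = sample_links.select { |link| clips_link?(link, "hogehoge") }
puts kept
```

Inside the crawler, the same helper could replace the regex test, for example `page.links.select { |link| clips_link?(link, "hogehoge") }` as the body of the `focus_crawl` block.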