@ryosuke-endo
Last active September 5, 2016 12:37
Sample 2 for the parallel gem
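require 'open-uri'
require 'nokogiri'
require 'benchmark'
require 'parallel'

Benchmark.bm do |r|
  r.report do
    pages = {}
    # Guards writes to the shared hash; concurrent Hash writes are not
    # guaranteed to be safe on every Ruby implementation.
    lock = Mutex.new

    # Scrape search result pages 1..100 using four threads.
    Parallel.each([*1..100], in_threads: 4) do |no|
      url = "http://matome.naver.jp/search?q=%e5%90%8d%e8%a8%80&page=#{no}"
      # URI.open is needed on Ruby 3.0+, where Kernel#open no longer accepts URLs.
      doc = Nokogiri::HTML.parse(URI.open(url))
      witticisms = {}
      doc.css('.mdMTMTtlList03Txt').each_with_index do |node, index|
        # Strip all whitespace (and the "|" separator in the view count).
        text = node.css('.mdMTMTtlList03Ttl').text.gsub(/\s/, '')
        favorite = node.css('.mdSocialCountList01Li.mdSocialCountList01FV').text.gsub(/\s/, '')
        count = node.css('.mdSocialCountList01Li.mdSocialCountList01View').text.gsub(/\s|\|/, '')
        witticisms[index.to_s] = { text: text, favorite: favorite, count: count }
      end
      lock.synchronize { pages[no.to_s] = witticisms }
    end

    # Threads finish in arbitrary order, so sort by page number before writing.
    pages = pages.sort_by { |k, _| k.to_i }.to_h
    # Mode 'a' appends, so repeated runs accumulate in p_result.txt.
    File.open('p_result.txt', 'a') do |f|
      pages.each do |no, witticism|
        f.puts "#{no}, #{witticism}"
      end
    end
  end
end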
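
The script has four threads filling one shared `pages` hash, then sorts it by page number. The following is a minimal stdlib-only sketch of that pattern, not the parallel gem's actual implementation. It assumes a hypothetical `fake_scrape` helper in place of the Nokogiri scraping step, so it runs without network access, and a hypothetical `collect_pages` wrapper that guards hash writes with a `Mutex`.

```ruby
# Hypothetical stand-in for the scraping step: returns a hash shaped like
# the per-page `witticisms` hash, without touching the network.
def fake_scrape(no)
  { '0' => { text: "quote from page #{no}", favorite: '0', count: '0' } }
end

# Hypothetical helper: runs `thread_count` workers over the page numbers
# and returns the results sorted by page number.
def collect_pages(page_numbers, thread_count: 4)
  queue = Queue.new
  page_numbers.each { |no| queue << no }
  queue.close # once drained, pop returns nil and the workers stop

  pages = {}
  lock = Mutex.new

  workers = Array.new(thread_count) do
    Thread.new do
      while (no = queue.pop)
        result = fake_scrape(no)
        # Only one thread at a time may write to the shared hash.
        lock.synchronize { pages[no.to_s] = result }
      end
    end
  end
  workers.each(&:join)

  # Workers finish in arbitrary order, so sort by numeric page number.
  pages.sort_by { |key, _| key.to_i }.to_h
end

pages = collect_pages(1..10)
puts pages.keys.join(', ')
```

Keeping the slow work (`fake_scrape`) outside the `synchronize` block means the threads only serialize on the brief hash write, not on the fetch itself.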