@tkawachi
Created November 21, 2011 11:10
#!/usr/bin/env ruby
# Requires the safariwatir gem.
require 'safariwatir'
browser = Watir::Safari.new
browser.goto 'http://tr.twipple.jp/hotword/'
# browser.goto 'http://tr.twipple.jp/talent/' # this page would need different regexes
html = browser.html
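# Splits off one chunk: $1 is the text between " START : " and " END : ", $2 is the rest of the page.
chunk_regex = / START : (.+?) END : (.*)/m
# Entry with a plain link: $1 is the search query, $2 is the displayed title.
rank_regex = /<div class="rankTtl"><a href="[^"]+\?q=([^"]+)" target="_blank">([^<]+)<\/a>(.*)/m
# Entry whose link opens via window.open: $1 is the search query, $2 is the displayed title.
hash_regex = /<div class="rankTtl"><p [^>]+><a href="#" onclick="window.open\('[^=]+?q=([^']+)'\)"[^>]+>([^<]+)<\/a>(.*)/m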
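i = 0
# Consume the page one chunk at a time until no START/END pair is left.
while chunk_regex =~ html
  chunk = $1
  html = $2
  i += 1
  # puts i
  if rank_regex =~ chunk
    puts $2 # title
    puts $1 # query
  elsif hash_regex =~ chunk
    puts $2 # title
    puts $1 # query
  end
end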
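
The chunk-by-chunk loop above can also be tried without a browser. What follows is a minimal sketch, not the script's own method: it swaps the `while` loop and `$1`/`$2` globals for `String#scan`, and it only handles the plain-link case (the `hash_regex` case is left out). `SAMPLE_HTML` and `extract_ranks` are hypothetical names, and the sample markup is an assumption about how the page might look, not a copy of the real page.

```ruby
# Hypothetical sample markup; the real page layout is an assumption.
SAMPLE_HTML = <<~HTML
  <!-- START : item -->
  <div class="rankTtl"><a href="http://example.com/search?q=ruby" target="_blank">Ruby</a></div>
  <!-- END : item -->
  <!-- START : item -->
  <div class="rankTtl"><a href="http://example.com/search?q=watir" target="_blank">Watir</a></div>
  <!-- END : item -->
HTML

# Same delimiters as chunk_regex, minus the trailing "rest of page" group,
# because scan advances through the string on its own.
CHUNK_REGEX = / START : (.+?) END : /m
# Same pattern as rank_regex, minus the unused trailing group.
RANK_REGEX = /<div class="rankTtl"><a href="[^"]+\?q=([^"]+)" target="_blank">([^<]+)<\/a>/

# Returns an array of [title, query] pairs, one per chunk that matches RANK_REGEX.
def extract_ranks(html)
  pairs = html.scan(CHUNK_REGEX).map do |(chunk)|
    match = RANK_REGEX.match(chunk)
    [match[2], match[1]] if match
  end
  pairs.compact # drop chunks that did not contain a ranking entry
end

extract_ranks(SAMPLE_HTML).each do |title, query|
  puts title
  puts query
end
```

Returning pairs instead of printing inside the loop keeps the parsing step testable on a saved copy of the page, separate from the Safari automation.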