@naoto
Created October 26, 2009 09:00
Turn A (∀ Gundam) dialogue extraction
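#!/usr/local/bin/ruby
# Print one character's lines from the Turn A dialogue pages.
# Usage: script [portal_url] [character_name]
# Note: on Ruby 3.0+, open-uri needs URI.open instead of Kernel#open.
require 'rubygems'
require 'hpricot'
require 'open-uri'
require 'uri'
require 'kconv'

# Print each line on the given page whose speaker cell matches @char.
def getWord(page)
  # Resolve the relative page name against the portal URL before fetching.
  html = Hpricot(open(URI.parse(@uri) + page))
  html.search("tr") { |tr|
    td = tr.search("td")
    # Skip rows that lack a speaker cell or a dialogue cell.
    next if td[0].nil? || td[1].nil?
    # @char is interpolated as a regular expression.
    if td[0].inner_html.toutf8 =~ /#{@char}/
      # Print the first non-empty line of the dialogue cell.
      puts $1 if td[1].inner_text.toutf8 =~ /^(.+?)$/
    end
  }
end

@uri = ARGV[0] || "http://www.geocities.co.jp/AnimeComic-Pastel/3829/portal_TurnA.html"
@char = ARGV[1] || "ロラン" # default speaker: Loran

# Follow every link on the portal page whose href starts with "words".
html = Hpricot(open(@uri))
html.search("a") { |a|
  getWord $1 if a.attributes['href'] =~ /^(words.+?)$/
}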
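
The link-following step can be tried without network access or Hpricot. The sketch below uses only the standard library to show how hrefs are filtered and resolved against the portal URL. The href values and the helper name resolve_word_pages are hypothetical; the real page's link names are not shown in this gist.

```ruby
require 'uri'

# Portal URL taken from the script's default argument.
BASE = "http://www.geocities.co.jp/AnimeComic-Pastel/3829/portal_TurnA.html"

# Hypothetical href values, for illustration only.
hrefs = ["words01.html", "index.html", "words02.html"]

# Keep only hrefs that start with "words", as the script's regex does,
# then resolve each one relative to the portal URL.
def resolve_word_pages(base, hrefs)
  hrefs.select { |href| href =~ /^words.+$/ }
       .map { |href| (URI.parse(base) + href).to_s }
end

pages = resolve_word_pages(BASE, hrefs)
puts pages
```

Because URI#+ performs relative resolution, each matching href replaces the last path segment of the portal URL, which is why getWord can pass the bare href straight through.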