@gongo
Created October 26, 2009 08:49
# -*- coding: utf-8 -*-
require 'open-uri'
require 'kconv'
#= Extract ∀ Gundam (Turn A Gundam) dialogue per character
#
# Authors:: 宮國 渡
# Copyright:: Copyright (c) 2009 Wataru MIYAGUNI <gonngo@gmail.com>
# URL:: http://github.com/gongo
#
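#== Description
# Extracts, character by character, the lines of dialogue listed at
# http://www.geocities.co.jp/AnimeComic-Pastel/3829/portal_TurnA.html
#
#== Note
# Depending on the settings, this can fetch every episode in one go, but that
# is likely to put load on the server, so please show restraint.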
#
#== Example usage
#
# ruby turn_a_word.rb
# #=> lines spoken by ロラン (the default)
#
# ruby turn_a_word.rb ギンガナム
# #=> lines spoken by ギンガナム
#
#
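# URL template for each episode's dialogue page; %02d is the episode number.
uri_format = 'http://www.geocities.co.jp/AnimeComic-Pastel/3829/words%02d_TurnA.html'
# Matches one table row: $1 = speaker, $3 = the line between 「」 or 『』.
reg = Regexp.new('<tr><td class="td1">(.*)</td><td>(「|『)(.*)(」|』)</td></tr>')
# Character to extract; defaults to ロラン when no argument is given.
target = ARGV[0] || "ロラン"
# Speaker cell of a row that continues the previous speaker's dialogue.
cont = " "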
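# Opens the dialogue page for the given episode number.
def get_story(uri, number)
  URI.open(uri % number) # Kernel#open no longer opens URLs as of Ruby 3.0
end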
(1..50).each do |s|
  words = get_story(uri_format, s)
  prev = ""
  words.each_line do |row|
    if reg =~ row.toutf8
      if $1 == target || ($1 == cont && prev == target)
        puts $3
      end
      prev = $1 if $1 != cont
    end
  end
end
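
The matching logic above can be tried without touching the server. The following is a minimal sketch, not part of the original script: `extract_lines`, `ROW_PATTERN`, `CONT`, and `SAMPLE_ROWS` are hypothetical names, and the sample rows are made-up placeholder data whose shape is assumed from the regular expression, not copied from the real pages.

```ruby
# -*- coding: utf-8 -*-
# Offline sketch of the row-matching logic; no network access.
# SAMPLE_ROWS is placeholder data, assumed to resemble the real table rows.

ROW_PATTERN = Regexp.new('<tr><td class="td1">(.*)</td><td>(「|『)(.*)(」|』)</td></tr>')
CONT = " " # speaker cell of a continuation row, same value as in the script

SAMPLE_ROWS = [
  '<tr><td class="td1">ロラン</td><td>「line A」</td></tr>',
  '<tr><td class="td1"> </td><td>「line B」</td></tr>',
  '<tr><td class="td1">ギンガナム</td><td>「line C」</td></tr>',
  '<tr><td class="td1"> </td><td>『line D』</td></tr>',
  '<p>not a dialogue row</p>',
]

# Returns the lines spoken by +target+, including continuation rows
# that directly follow one of the target's rows.
def extract_lines(rows, target)
  result = []
  prev = ""
  rows.each do |row|
    m = ROW_PATTERN.match(row)
    next unless m
    speaker, text = m[1], m[3]
    result << text if speaker == target || (speaker == CONT && prev == target)
    prev = speaker unless speaker == CONT
  end
  result
end

p extract_lines(SAMPLE_ROWS, "ロラン")
p extract_lines(SAMPLE_ROWS, "ギンガナム")
```

Returning an array instead of calling `puts` inside the loop keeps the matching separate from fetching, so the continuation-row handling can be checked against small samples before any of the 50 pages is requested.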