@krtx
Created November 4, 2012 22:08
langrid scraper for back-translation
# coding: utf-8
require 'watir'
require 'pp'
# !!! EDIT HERE !!!
userid = ''
password = ''
language = 'ja' # 'ja' or 'en'
topic_id = nil # set this to the ID of the forum topic to scrape
# !!! END OF EDITABLE SECTION !!!
# Link and button labels for each UI language: [pre-edit link, translate button]
navi = {'ja' => ['原文編集', '翻訳'], 'en' => ['Pre-edit', 'Translate']}

# Log in to the langrid toolbox
browser = Watir::Browser.new
browser.goto 'http://langrid.org/tools/toolbox'
browser.text_field(:name => 'uname').set userid
browser.text_field(:name => 'pass').set password
browser.form(:name => 'login_form').submit

# Open the forum topic in the chosen language
browser.goto "http://langrid.org/tools/toolbox/modules/forum/?topicId=#{topic_id}&lang=#{language}&ml_lang=#{language}"
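
# Walk through every page of the topic
loop do
  # Count the pre-edit links on the current page
  links = browser.links(:text => navi[language][0]).length
  links.times do |i|
    # Open the i-th post in the pre-edit view and request a translation
    browser.link(:text => navi[language][0], :index => i).click
    browser.button(:text => navi[language][1], :index => 0).click
    # Poll every 2 seconds until the back-translation appears
    text = ''
    loop do
      sleep 2
      text = browser.div(:class => 'bbs-preview-back-translation-area').text
      break if text != ''
    end
    puts text
    browser.back
    sleep 2
  end
  # Go to the next page; stop when there is no 'Next >>' link
  begin
    browser.link(:text => 'Next >>').click
  rescue Watir::Exception::UnknownObjectException
    break
  end
  sleep 2
end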
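
The inner polling loop in the script waits forever if the back-translation never appears. One possible way to bound that wait is sketched below. The names `wait_for_text` and `PollTimeout` and the default `timeout` and `interval` values are hypothetical and not part of the original script; the helper uses only plain Ruby, so it does not depend on any particular Watir API.

```ruby
# Hypothetical helper (not in the original script): calls the given block
# repeatedly until it returns a non-empty string, and raises PollTimeout
# once `timeout` seconds have passed without a result.
class PollTimeout < StandardError; end

def wait_for_text(timeout: 60, interval: 2)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    value = yield
    text = value.to_s
    return text unless text.empty?
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise PollTimeout, "no text after #{timeout} seconds"
    end
    sleep interval
  end
end
```

Under these assumptions, the inner loop could become `text = wait_for_text { browser.div(:class => 'bbs-preview-back-translation-area').text }`, with a `rescue PollTimeout` around it to skip posts whose translation never arrives. Unlike the original loop, this version checks once before the first sleep.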