@yuya-matsushima
Created March 21, 2012 12:59
Download 広報ひがしあがつま (the Higashiagatsuma town public relations newsletter) from the site that publishes it
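# encoding: utf-8
require 'nokogiri'
require 'open-uri'

url  = 'http://www1.town.higashiagatsuma.gunma.jp/www/contents/1320623094013/index.html'
base = 'http://www1.town.higashiagatsuma.gunma.jp/'
name = '広報ひがしあがつま'

# URI.open is needed on Ruby 3.0+, where Kernel#open no longer accepts URLs
html = Nokogiri::HTML(URI.open(url))

html.css('li a').each do |pdf|
  # Get the title, converting full-width digits (U+FF10..U+FF19) and
  # parentheses (U+FF08, U+FF09) to their half-width ASCII forms
  title = pdf['title'].to_s.tr("\uFF10-\uFF19\uFF08\uFF09", '0-9()')
  # Resolve the link against the site root; pdf['href'] returns a plain String
  pdf_url = URI.join(base, pdf['href']).to_s

  # Save each PDF as "<title><name>.pdf" in the current directory
  File.open(title + name + '.pdf', 'wb') do |output|
    URI.open(pdf_url) do |data|
      output.write(data.read)
    end
  end
end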
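
The full-width to half-width conversion is the least obvious step in the script, so it may help to see it in isolation. The sketch below is only an illustration: `normalize_title` is a hypothetical helper name that does not appear in the original gist, and the sample string is made up rather than taken from the town's site. It assumes the intent is to map full-width digits and parentheses to ASCII using `String#tr`.

```ruby
# Hypothetical helper (not part of the original gist): converts full-width
# digits and parentheses in a link title to half-width ASCII characters.
FULL_WIDTH = "\uFF10-\uFF19\uFF08\uFF09" # full-width 0-9 and ( )
HALF_WIDTH = '0-9()'

def normalize_title(title)
  # to_s guards against a missing title attribute (nil becomes "")
  title.to_s.tr(FULL_WIDTH, HALF_WIDTH)
end

# Made-up sample input: full-width "2012(3)"
sample = "\uFF12\uFF10\uFF11\uFF12\uFF08\uFF13\uFF09"
puts normalize_title(sample)
```

Writing the full-width characters as `\u` escapes keeps the mapping intact even if the file passes through a tool that normalizes full-width characters to ASCII, which would otherwise turn the `tr` call into a no-op.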