@rsato
Last active March 14, 2020 07:23
A Groovy + Geb + Selenium script that saves the page images from tameshiyo.me, an e-book preview site. An explanation is available here: https://reisato.plala.jp/rsato/weblog/2016/03/15/1830.html
@Grab('org.gebish:geb-core')
@Grab('org.seleniumhq.selenium:selenium-firefox-driver')
@GrabExclude('org.codehaus.groovy:groovy-all')
import geb.Browser

Browser.drive {
    go args[0] // URL of the preview page
    sleep(2000) // give the viewer time to load

    // get the number of pages from the page indicator text
    def pages = Integer.parseInt($('div#page_indicator')?.text().split(' ')[2])

    // get the encoded images as data URIs
    def list = []
    (0..<pages).each { page ->
        // panels are selected by data-page-prefix; alternate between the pairs (0, 1) and (2, 3)
        list << $('div.page_pnl', 'data-page-prefix': "${page % 2 * 2}")?.children()[0].@src
        list << $('div.page_pnl', 'data-page-prefix': "${page % 2 * 2 + 1}")?.children()[0].@src
        $('img#tb_right').click()
        sleep(1000) // wait for the next panels to render
    }

    // decode the data and save the images
    list.eachWithIndex { url, index ->
        // skip entries with no src or with anything other than a "data:<header>,<payload>" URI
        if (url?.startsWith('data:') && url.split(',').size() == 2) {
            def encodedImage = url.split(',')[1]
            new File("page-${index}.jpg").bytes = Base64.getDecoder().decode(encodedImage)
        }
    }
}.close()
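
The decoding step at the end of the script can be tried on its own, without a browser. The following is a minimal sketch, assuming the payload is Base64-encoded as in the script above. `decodeDataUri` is a hypothetical helper name, not part of the original gist. It splits on the first comma only and returns null for anything that is not a Base64 data URI.

```groovy
// Hypothetical helper (not in the original gist): decode a Base64 data URI.
// Returns the decoded bytes, or null if the input is not a Base64 data URI.
byte[] decodeDataUri(String uri) {
    if (!uri?.startsWith('data:')) return null      // null or not a data URI
    int comma = uri.indexOf(',')                    // header/payload separator
    if (comma < 0) return null
    String header = uri.substring(5, comma)         // e.g. "image/jpeg;base64"
    if (!header.endsWith(';base64')) return null    // this sketch handles Base64 payloads only
    return Base64.getDecoder().decode(uri.substring(comma + 1))
}

// Usage: build a small sample data URI and decode it again.
def payload = Base64.getEncoder().encodeToString('hello'.getBytes('UTF-8'))
def sample = "data:text/plain;base64,${payload}".toString()
println new String(decodeDataUri(sample), 'UTF-8')
```

In the script, the result could be assigned to `new File(...).bytes` whenever it is not null. That would replace the `split(',').size() == 2` check.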