@amane-katagiri
Last active May 18, 2020 03:25
Discussion with images (画像で議論)
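#!/bin/bash -eu
# Download images listed on the rara.jp giron photo pages into DOWNLOAD_DIR.
# Pass -v as the first argument for verbose logging on stderr.
GIRON_URL='https://rara.jp/giron/photo'
DOWNLOAD_DIR="$HOME/Pictures/tmp"
MAX_PAGE=10
# The cursor file holds the first image URL listed by the previous run.
CURSOR="$DOWNLOAD_DIR/.last_download_url"
VERBOSE=0
CURL_OPT="-Ss"  # silent, but still show errors
if [ $# -gt 0 ] && [ "$1" = "-v" ]; then
    VERBOSE=1
    CURL_OPT=""
fi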
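CURL_TMP=$(mktemp)
# Remove the temporary URL list when the script exits.
trap 'rm -f "$CURL_TMP"' EXIT
# Print a message to stderr when verbose mode is on.
log() {
    if [ "$VERBOSE" -gt 0 ]; then
        echo "$1" 1>&2
    fi
}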
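# Make sure the download directory exists before touching the cursor file in it.
mkdir -p "$DOWNLOAD_DIR"
touch "$CURSOR"
if [ -s "$CURSOR" ]; then
    # Extract the numeric image ID from the stored URL (.../<hash>_<id>.<ext>).
    CURSOR_ID=$(grep -oP '(?<=_)[0-9]+(?=\.)' "$CURSOR")
else
    CURSOR_ID=0
fi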
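# Page 1 has no numeric suffix; later pages are GIRON_URL followed by 2, 3, ...
for i in '' $(seq 2 "$MAX_PAGE"); do
    log "start download page ${i:-1}."
    log "$GIRON_URL$i"
    # Collect the data-original URL of every <img>. The sed step rewrites "p_" to "_",
    # presumably mapping thumbnail names to full-size ones.
    # $CURL_OPT is left unquoted on purpose so that an empty value expands to nothing.
    curl $CURL_OPT "$GIRON_URL$i" | xmllint --xpath '//img/@data-original' --html - 2>/dev/null | grep -oP '(?<=")http[^"]+(?=")' | sed -e "s/p_/_/g" >> "$CURL_TMP"
    # Stop paging once the URL recorded by the previous run shows up (fixed-string match).
    if [ -s "$CURSOR" ] && grep -qF -- "$(cat "$CURSOR")" "$CURL_TMP"; then
        log 'found cursor image.'
        break
    fi
done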
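# Download every listed image whose numeric ID is greater than the cursor ID.
while IFS= read -r u; do
    ID=$(echo "$u" | grep -oP '(?<=_)[0-9]+(?=\.)')
    HASH=$(echo "$u" | grep -oP '(?<=/)[0-9a-f]+(?=_)')
    EXT=$(echo "$u" | grep -oP '\.[a-z]+$')
    if [ "$ID" -gt "$CURSOR_ID" ]; then
        log "downloading: $u"
        curl $CURL_OPT -o "$DOWNLOAD_DIR/${ID}_${HASH}${EXT}" "$u"
    else
        log "skipped: $u"
    fi
done < "$CURL_TMP"
# Remember the first listed URL for the next run; keep the old cursor if nothing was listed.
if [ -s "$CURL_TMP" ]; then
    head -n1 "$CURL_TMP" > "$CURSOR"
fi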
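
The script splits each image URL into ID, HASH and EXT with three `grep -oP` calls, which needs a grep built with PCRE support. As a possible alternative, the same split can be sketched with bash parameter expansion alone. This is only a sketch: the helper name `parse_image_url` and the example URL are hypothetical, and it assumes the file name always has the form `<hex-hash>_<numeric-id>.<ext>`, which is inferred from the regular expressions in the script rather than from any documentation of the site.

```shell
#!/bin/bash
# Hypothetical helper: split a URL ending in <hex-hash>_<numeric-id>.<ext>
# into the global variables ID, HASH and EXT without calling grep.
parse_image_url() {
    local url=$1
    local file=${url##*/}   # file name: everything after the last slash
    local stem=${file%.*}   # file name without its extension
    EXT=.${file##*.}        # extension with a leading dot, as in the script
    ID=${stem##*_}          # text after the last underscore
    HASH=${stem%_*}         # text before the last underscore
    # Succeed only if the parts match the assumed pattern.
    [[ $ID =~ ^[0-9]+$ && $HASH =~ ^[0-9a-f]+$ ]]
}

# Made-up URL, for illustration only.
if parse_image_url 'https://example.invalid/img/0a1b2c_12345.jpg'; then
    echo "${ID}_${HASH}${EXT}"   # prints 12345_0a1b2c.jpg
fi
```

Because the function returns a non-zero status for names that do not fit the pattern, a caller could skip such lines instead of letting the script abort under `-e`.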