Use wget to scrape all URLs from a sitemap.xml Usage:
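#!/bin/bash
# Assumption: the sitemap URL is passed as the first argument.
SITEMAP=$1
if [ "$SITEMAP" = "" ]; then
    echo "Usage: $0 <sitemap-url>"
    exit 1
fi
# Download the sitemap and extract the contents of every <loc> element.
XML=`wget -O - --quiet "$SITEMAP"`
URLS=`echo "$XML" | egrep -o "<loc>[^<>]*</loc>" | sed -e 's:</*loc>::g'`
# Request each URL once, discarding the response bodies.
echo "$URLS" | tr ' ' '\n' | wget -O /dev/null -i - --wait=1 --random-wait -nv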
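
The extraction step can be tried without touching the network. The following is a minimal sketch, not part of the original gist: `extract_locs` is a hypothetical helper name, the sample sitemap fragment and the `example.com` URLs are placeholders, and `grep -E` stands in for `egrep`, which newer GNU grep releases flag as obsolescent.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): read sitemap XML on
# standard input and print the contents of each <loc> element, one per line.
extract_locs() {
    grep -Eo '<loc>[^<>]*</loc>' | sed -e 's:</*loc>::g'
}

# Inline sample standing in for a downloaded sitemap; example.com is a placeholder.
SAMPLE='<urlset><url><loc>https://example.com/</loc></url><url><loc>https://example.com/about</loc></url></urlset>'

# Quoting "$SAMPLE" and "$URLS" keeps the shell from word-splitting or
# glob-expanding the XML and the extracted URLs.
URLS=$(printf '%s\n' "$SAMPLE" | extract_locs)
printf '%s\n' "$URLS"
```

Because the helper reads from standard input, the same function should also work on real data by piping the output of `wget -O - --quiet` into it.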

hanchiang commented Dec 1, 2018

Thanks a lot for this bash script! Saved me a lot of head banging 💯
