Load all URLs from a sitemap.xml file
#!/bin/bash
# Fetches /sitemap.xml for the given host and requests every URL listed in it, priming the cache.
# URLs on other hosts and URLs containing "jpg" are skipped.
# Prints the HTTP status code, total time in seconds, and final URL for each request.
# Usage: ./warm_cache.sh www.example.com
time wget --quiet "https://$1/sitemap.xml" --output-document - | \
grep -Eo "https?://[^<]+" | \
grep -F "$1" | \
grep -v "jpg" | \
xargs -I {} -d '\n' curl --output /dev/null --silent --write-out '%{http_code} %{time_total}s %{url_effective}\n' {}
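
The filtering stage of the pipeline can be checked offline before pointing the script at a live site. The sketch below wraps the same three grep filters in a function and feeds it a made-up sitemap. The function name `extract_urls` and the sample data are illustrative assumptions, not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): the same URL filter,
# reading sitemap XML on stdin so it can be tried without network access.
extract_urls() {
  local host="$1"
  grep -Eo "https?://[^<]+" | grep -F "$host" | grep -v "jpg"
}

# Made-up sample sitemap, for illustration only.
sample='<urlset>
<url><loc>https://www.example.com/</loc></url>
<url><loc>https://www.example.com/about</loc></url>
<url><loc>https://www.example.com/logo.jpg</loc></url>
<url><loc>https://cdn.example.net/page</loc></url>
</urlset>'

printf '%s\n' "$sample" | extract_urls www.example.com
# Prints:
# https://www.example.com/
# https://www.example.com/about
```

The image URL and the URL on the other host are dropped, so only the two pages on `www.example.com` would be passed on to curl.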