@juanibiapina
Created September 28, 2016 14:07
#!/usr/bin/env bash
#
# Download an index of posts from a blogger blog
#
# Example usage: ./blogger-index techblog.netflix.com
#
# Dependencies: xml2json, jq
set -e
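url="$1"

if [ -z "$url" ]; then
  echo "Usage: ./blogger-index <url>" >&2
  exit 1
fi

# Fetch the sitemap index and list the URL of each child sitemap.
# The sed step strips "$" from the JSON so the text key can be read as ".t" in jq.
sitemap_urls="$(curl -s "http://$url/sitemap.xml" | xml2json | sed -e 's/\$//g' | jq -r ".sitemapindex .sitemap | map(.loc) | map(.t) | .[]")"

# Fetch each child sitemap and append its post URLs to index.txt.
for sitemap_url in $sitemap_urls; do
  curl -s "$sitemap_url" | xml2json | sed -e 's/\$//g' | jq -r ".urlset .url | map(.loc) | map(.t) | .[]" >> index.txt
done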