@steverobbins
Last active April 1, 2020 18:42
Quick and dirty script to save a whole site to archive.org
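#!/bin/bash
# Usage: $0 DOMAIN [START_PATH] [EXCLUDE_DIRS]
# DOMAIN is a bare host name (e.g. example.com); wget mirrors it into ./DOMAIN.
URL=${1:?usage: $0 DOMAIN [START_PATH] [EXCLUDE_DIRS]}
# Not named PATH: assigning to PATH would replace the shell's command search path.
START_PATH=$2
EXCLUDE=$3
# Remove any previous mirror of this domain.
rm -rf -- "$URL"
echo "Gathering URLs..."
# -X is passed only when an exclude list was given.
wget -q -r -nc "$URL$START_PATH" -D "$URL" ${EXCLUDE:+-X "$EXCLUDE"}
echo "Sending to archive.org..."
# Each mirrored file path (DOMAIN/dir/file) doubles as the URL to save.
find "$URL" -type f -print0 | while IFS= read -r -d '' LINE; do
  echo "Saving $LINE"
  # Fire the request in the background; the sleep throttles to about one per second.
  curl -s "https://web.archive.org/save/$LINE" > /dev/null &
  sleep 1
done
echo "Done"
# Clean up the local mirror.
rm -rf -- "$URL"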
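
The script submits each mirrored file path as-is, so a site's front page is sent as example.com/index.html rather than example.com/. A rough sketch of one way to tidy that up is below. `save_url` is a hypothetical helper, not part of the original script, and it assumes wget's default of storing directory URLs as index.html.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): map a path from the wget
# mirror to the web.archive.org save URL that the script requests.
save_url() {
  local path=$1
  # Assumption: a file named index.html stands for its directory URL,
  # so submit the directory instead. Remove this case to keep the
  # original behavior of submitting the file path unchanged.
  case $path in
    */index.html) path=${path%index.html} ;;
  esac
  printf 'https://web.archive.org/save/%s\n' "$path"
}

save_url "example.com/index.html"
save_url "example.com/blog/post.html"
```

Inside the loop this would replace the hard-coded URL, e.g. curl -s "$(save_url "$LINE")".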