@dcosson
Created June 12, 2012 15:29
Archive a website (in this case, tourbie.com)
# Just a note to myself on how to archive a website
mkdir tourbie_archive
cd tourbie_archive
wget --mirror -p -nH -e robots=off --convert-links http://tourbie.com
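# --mirror mirrors the site: turns on recursion with infinite depth plus time-stamping (shorthand for -r -N -l inf --no-remove-listing)
# -p (--page-requisites) downloads the images, stylesheets, and other files needed to display each page
# --convert-links after the download finishes, rewrites links in the saved pages so they work for local viewing (links to downloaded files become relative)
# -e robots=off optional, ignores robots.txt (on tourbie.com, I had disallowed the static files directory in robots.txt)
# -nH (--no-host-directories) leaves out the domain name (won't put everything in a "tourbie.com" folder)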
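
The same steps can be wrapped in a small function so the URL and target directory become arguments. This is only a sketch, not part of the original note: the name `archive_site`, the `DRY_RUN` switch, and the added `--wait=1 --random-wait` options (to space out requests) are assumptions introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not from the original note).
# Usage: archive_site URL DIRECTORY
# Set DRY_RUN=1 to print the command instead of running it.
archive_site() {
    url=$1
    dir=$2
    # Same flags as the note above; --wait/--random-wait are an added
    # assumption to pause roughly a second between requests.
    set -- wget --mirror -p -nH -e robots=off --convert-links \
        --wait=1 --random-wait "$url"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "(cd $dir && $*)"
        return 0
    fi
    # Create the target directory, then run wget inside it in a subshell
    # so the caller's working directory is left unchanged.
    mkdir -p "$dir" && (cd "$dir" && "$@")
}
```

Running `DRY_RUN=1 archive_site http://tourbie.com tourbie_archive` only prints the command that would be executed, which is a cheap way to check the flags before starting a long crawl.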
@BradKML
BradKML commented Jun 6, 2022

Would you mind taking a look at the spoofers in https://gist.github.com/mullnerz/9fff80593d6b442d5c1b ?
