@stevemclaugh
Last active October 3, 2017 17:16

Download a list of URLs

wget --wait=0.2 --random-wait --no-check-certificate --page-requisites -e robots=off --tries=inf -c --user-agent="Mozilla/5.0 (Windows NT 6.1; WOW64; rv:41.0) Gecko/20100101 Firefox/41.0" -i /path/to/list_of_urls.txt
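
The `-i` option reads one URL per line from the given file. If the list was pasted together from several sources, it may help to tidy it first. The following is a minimal sketch, not part of the original gist; `clean_url_list` is a hypothetical helper name, and it assumes only http and https URLs are wanted.

```shell
#!/bin/sh
# clean_url_list (hypothetical helper): print a tidied copy of a URL list.
# - strips Windows carriage returns
# - trims leading and trailing whitespace
# - keeps only lines that start with http:// or https://
# - drops duplicate URLs while preserving the original order
clean_url_list() {
  tr -d '\r' < "$1" \
    | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
    | grep -E '^https?://' \
    | awk '!seen[$0]++'
}

# Example usage (paths are placeholders):
#   clean_url_list /path/to/list_of_urls.txt > /path/to/cleaned_urls.txt
#   wget ... -i /path/to/cleaned_urls.txt
```

The cleaned file can then be passed to `-i` in place of the original list.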

Recursively download a full website

wget -r --wait=0.2 --random-wait --no-check-certificate --page-requisites -e robots=off --tries=inf -c --user-agent="Mozilla/5.0 (Windows NT 6.1; WOW64; rv:41.0) Gecko/20100101 Firefox/41.0" http://principalhand.org
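
For readers who want to see what each flag does, here is the same recursive command spelled out one option per line, using wget's long option names where they exist. This is only a sketch: `mirror_site` and the `WGET` override variable are hypothetical names introduced for illustration, not part of wget or the original gist.

```shell
#!/bin/sh
# mirror_site (hypothetical helper): run the recursive download with each
# flag on its own line. Setting WGET=echo prints the arguments instead of
# downloading, which is handy for reviewing the command first.
mirror_site() {
  url=$1
  set --                                # start with an empty argument list
  set -- "$@" --recursive               # -r: follow links and fetch the whole site
  set -- "$@" --wait=0.2                # pause 0.2 seconds between requests
  set -- "$@" --random-wait             # vary that pause so requests are less regular
  set -- "$@" --no-check-certificate    # skip TLS certificate validation
  set -- "$@" --page-requisites         # also fetch images, CSS, and other page assets
  set -- "$@" -e robots=off             # ignore robots.txt
  set -- "$@" --tries=inf               # retry failed downloads indefinitely
  set -- "$@" --continue                # -c: resume partially downloaded files
  set -- "$@" --user-agent="Mozilla/5.0 (Windows NT 6.1; WOW64; rv:41.0) Gecko/20100101 Firefox/41.0"
  set -- "$@" "$url"                    # starting URL
  "${WGET:-wget}" "$@"
}

# Dry run: print the arguments that would be passed to wget.
WGET=echo mirror_site "http://principalhand.org"
```

Calling `mirror_site` without `WGET=echo` (and with `WGET` unset) runs the real download.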

Note: These commands ignore robots.txt and send a browser user-agent string, so they are appropriate for backing up your own website. For any other site, check its terms of service first.
