@vancetran
Last active January 5, 2020 10:26
Wget recipes

Make a static copy of a dynamic site, including images

via Stanford

wget -P /destination/ -mpck --user-agent="" -e robots=off --random-wait -E http://example.com/
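The short flags in that recipe are terse, so here is a minimal sketch that spells each one out with its long-option name. The function name `mirror_cmd` is hypothetical, and the destination and URL are the same placeholders used above. It only prints the command so it can be reviewed before anything is downloaded.

```shell
#!/bin/sh
# Dry-run sketch: builds the long-option form of the recipe and prints it.
# mirror_cmd is a hypothetical helper name; both arguments are placeholders.
mirror_cmd() {
    dest=$1
    url=$2
    set -- wget
    set -- "$@" "--directory-prefix=$dest"  # -P: save everything under this directory
    set -- "$@" --mirror                    # -m: recursion and timestamping, unlimited depth
    set -- "$@" --page-requisites           # -p: fetch images, CSS, and scripts the pages need
    set -- "$@" --continue                  # -c: resume partially downloaded files
    set -- "$@" --convert-links             # -k: rewrite links so the copy works locally
    set -- "$@" --user-agent=               # --user-agent="": send no User-Agent header
    set -- "$@" --execute robots=off        # -e robots=off: do not honor robots.txt
    set -- "$@" --random-wait               # vary the delay between requests
    set -- "$@" --adjust-extension          # -E: append .html to HTML files that lack it
    set -- "$@" "$url"
    printf '%s\n' "$*"
}

mirror_cmd /destination/ http://example.com/
```

Replacing the final `printf` line with `"$@"` should run the command instead of printing it.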

Without images, PPT, or PDF files

source

wget -P /destination/ -mpck --user-agent="" -e robots=off --random-wait -R gif,jpg,jpeg,png,pdf,ppt,GIF,JPG,JPEG,PNG,PDF,PPT -E http://example.com/
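The `-R` list above repeats every extension in lower and upper case. As a small sketch, the list can be generated from the lowercase names alone. The helper name `reject_list` is hypothetical, not part of wget.

```shell
#!/bin/sh
# reject_list is a hypothetical helper: given a comma-separated list of
# extensions, it prints the lowercase list followed by the uppercase list,
# which is the shape of the -R argument in the recipe above.
reject_list() {
    lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    upper=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    printf '%s,%s\n' "$lower" "$upper"
}

reject_list gif,jpg,jpeg,png,pdf,ppt
```

The output can be passed as `-R "$(reject_list gif,jpg,jpeg,png,pdf,ppt)"`. The wget manual also describes an `--ignore-case` option that applies to `-R`, which may make the uppercase half unnecessary on versions that support it.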

Wget fun at work

With a 30-second wait between requests:

wget -mkEpnp --wait=30 -U "Mozilla/5.0 (X11; U; Linux; en-US; rv:1.9.1.16) Gecko/20110929 Firefox/3.5.16" https://join-mosaic.squarespace.com
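A note on timing: the first two recipes pass `--random-wait` without `--wait`. According to the wget manual, `--random-wait` varies the delay between 0.5 and 1.5 times the `--wait` value, so it appears to need `--wait` to have any effect. The sketch below only does that arithmetic; `random_wait_range` is a hypothetical helper and it assumes a whole number of seconds.

```shell
#!/bin/sh
# random_wait_range is a hypothetical helper: for a --wait value in whole
# seconds, it prints the shortest and longest delay that --random-wait
# would pick, using the 0.5x to 1.5x range given in the wget manual.
random_wait_range() {
    secs=$1
    lo=$((secs * 5))     # 0.5 * secs, in tenths of a second
    hi=$((secs * 15))    # 1.5 * secs, in tenths of a second
    printf '%d.%d %d.%d\n' $((lo / 10)) $((lo % 10)) $((hi / 10)) $((hi % 10))
}

random_wait_range 30
random_wait_range 5
```

So combining `--wait=30` with `--random-wait` would space requests roughly 15 to 45 seconds apart rather than exactly 30.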

Basics

http://www.linuxjournal.com/content/downloading-entire-web-site-wget

Wait x seconds between requests

https://wiki.hackzine.org/scripts/wget-grab-website.html

Another guide, from Stanford

https://swsblog.stanford.edu/blog/creating-static-copy-website

wget -mpck -e robots=off --wait 5 -E https://join-mosaic.squarespace.com
