@brutuscat
Created October 4, 2011 09:09
wget command to scrape just a portion of a website's HTML
# This way you can test your scripts on localhost. It's like using the
# --mirror option without downloading all the needed resources, just the HTML.
# Add "-e robots=off" to ignore robots.txt (to be evil)
# Add --continue to resume partially downloaded files instead of restarting
# them (-N already skips files that have not changed on the server)
wget -r -N -E --convert-links --random-wait -l 3 --wait=2 --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:6.0.1) Gecko/20100101 Firefox/6.0.1" "URL-WEBSITE?and=1&params=2"
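
The same command can be wrapped in a small function so the depth and the robots.txt switch are easy to toggle. This is only a sketch: the names `scrape_html`, `WGET`, `IGNORE_ROBOTS` and `UA` are made up for illustration and are not part of wget. Setting `WGET=echo` prints the command line instead of running it.

```shell
# Hypothetical wrapper around the wget call above (POSIX sh).
# UA is the user agent string sent with every request.
UA="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:6.0.1) Gecko/20100101 Firefox/6.0.1"

scrape_html() {
  url=$1          # start URL, quote it if it contains "&" or "?"
  depth=${2:-3}   # recursion depth, defaults to 3 like the original command

  # Same options as the one-liner, in long form where wget has one.
  set -- --recursive "--level=$depth" --timestamping -E --convert-links \
         --wait=2 --random-wait "--user-agent=$UA"

  # Opt-in only: ignore robots.txt when IGNORE_ROBOTS=1.
  if [ "${IGNORE_ROBOTS:-0}" = 1 ]; then
    set -- "$@" -e robots=off
  fi

  # WGET can be overridden (e.g. WGET=echo) for a dry run.
  ${WGET:-wget} "$@" "$url"
}

# Usage (assumed example URL):
#   scrape_html "http://example.com/?and=1&params=2" 2
```

Keeping the robots.txt override behind an explicit variable means the polite behaviour stays the default.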