@simonw
Created December 9, 2016 06:38
Recursive wget ignoring robots
$ wget -e robots=off -r -np 'http://example.com/folder/'
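  • -e robots=off runs the .wgetrc-style command "robots=off", so wget ignores robots.txt for the whole run
  • -r makes it recursive: wget follows the links in each page it downloads and fetches those too (up to 5 levels deep by default)
  • -np = no parents, so it never follows links up above the starting folder (here, /folder/)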
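If you run this often, the flags above can be wrapped in a small shell function. This is only a sketch: `mirror_folder` and the `DRY_RUN` variable are hypothetical names made up for illustration, and `--wait=1` and `-P` are extra wget options that the original command does not use (a one-second pause between requests, and a download directory).

```shell
# mirror_folder URL [DEST]
# Hypothetical wrapper around the wget command above (not part of wget itself).
# The body runs in a subshell ( ... ) so its variables do not leak out.
mirror_folder() (
  url="$1"
  dest="${2:-.}"   # assumed default: download into the current directory

  # Build the wget argument list in the positional parameters:
  #   -e robots=off  ignore robots.txt
  #   -r             recurse into linked pages
  #   -np            never ascend to the parent folder
  #   --wait=1       assumption: pause 1 second between requests
  #   -P "$dest"     save everything under DEST
  set -- -e robots=off -r -np --wait=1 -P "$dest" "$url"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it
    printf '%s\n' "wget $*"
  else
    wget "$@"
  fi
)
```

Running `DRY_RUN=1 mirror_folder 'http://example.com/folder/' downloads` prints the command without touching the network, which is a cheap way to check the flags before starting a large crawl.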
@thewhitegrizzli

not fixed

@jimsy3
jimsy3 commented Dec 6, 2023

what is the recursive thing?