Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
Use wget to spider a site as a logged-in user.
http://addictivecode.org/FrequentlyAskedQuestions
To spider a site as a logged-in user:
1. post the form data (_every_ input with a name in the form, even if it doesn't have a value) required to log in (--post-data).
2. save the cookies that get generated (--save-cookies), including session cookies (--keep-session-cookies), which are not saved when --save-cookies alone is specified.
2. load the cookies, continue saving the session cookies, and recursively (-r) spider (--spider) the site, ignoring (-R) /logout.
# log in and save the cookies
wget --post-data='username=my_username&password=my_password&next=' --save-cookies=cookies.txt --keep-session-cookies https://foobar.com/login
# spider the site from this user's perspective, skipping /logout since hitting that page will log you out
wget -R logout -r --spider --load-cookies=cookies.txt --save-cookies=cookies.txt --keep-session-cookies https://foobar.com/home
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.