httrack command for mirroring a site archived on archive.org (Wayback Machine)
# Useful for copying snapshotted sites from archive.org.
# Copied from http://superuser.com/questions/532036/trouble-using-wget-or-httrack-to-mirror-archived-website
# Replace ${wayback_url} with the full snapshot URL, e.g. http://web.archive.org/web/20020705161639/http://kict.iiu.edu.my/
# Replace ${domain_name} with the domain name of the site you're mirroring, without the 'http://' prefix, e.g. kict.iiu.edu.my
httrack \
  "${wayback_url}" \
  '-*' \
  "+*/${domain_name}/*" \
  -N1005 \
  --advanced-progressinfo \
  --can-go-up-and-down \
  --display \
  --keep-alive \
  --mirror \
  --robots=0 \
  --user-agent='Mozilla/5.0 (X11;U; Linux i686; en-GB; rv:1.9.1) Gecko/20090624 Ubuntu/9.04 (jaunty) Firefox/3.5' \
  --verbose
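
Instead of typing the domain name by hand, it can be derived from the snapshot URL. The following is a minimal sketch, not part of the original answer: `wayback_domain` is a hypothetical helper name, and it assumes the snapshot URL has the usual `.../web/<timestamp>/<original URL>` layout shown in the example above. It only uses POSIX parameter expansion and does not call httrack itself.

```shell
#!/bin/sh
# Hypothetical helper (not from the original gist): print the host name of
# the archived site, given a Wayback Machine snapshot URL.
# Assumes the layout http://web.archive.org/web/<timestamp>/<original URL>.
wayback_domain() {
  rest="${1#*/web/*/}"          # drop the prefix up to and including "/web/<timestamp>/"
  rest="${rest#*://}"           # drop the scheme of the archived URL, if present
  printf '%s\n' "${rest%%/*}"   # keep only the host part, before the first "/"
}

wayback_url='http://web.archive.org/web/20020705161639/http://kict.iiu.edu.my/'
domain_name="$(wayback_domain "$wayback_url")"
echo "$domain_name"   # prints: kict.iiu.edu.my
```

With `wayback_url` and `domain_name` set this way, both values can be substituted into the httrack command; the derived host is what the `+*/${domain_name}/*` filter is meant to match.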