@adliwahid
Created April 17, 2016 10:27
httrack for mirroring a site on archive.org (Wayback Machine)
#This is useful for copying sites snapshotted at archive.org.
#Copied from http://superuser.com/questions/532036/trouble-using-wget-or-httrack-to-mirror-archived-website
#Replace ${wayback_url} with the full snapshot URL, e.g. http://web.archive.org/web/20020705161639/http://kict.iiu.edu.my/
#Replace ${domain_name} with the domain name of the site you're mirroring, without the 'http://', e.g. kict.iiu.edu.my
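#Each continuation backslash needs a space before it, otherwise the shell joins the arguments into one word.
#The filter is double-quoted so that ${domain_name} also expands if it is set as a shell variable.
httrack \
  "${wayback_url}" \
  '-*' \
  "+*/${domain_name}/*" \
  -N1005 \
  --advanced-progressinfo \
  --can-go-up-and-down \
  --display \
  --keep-alive \
  --mirror \
  --robots=0 \
  --user-agent='Mozilla/5.0 (X11;U; Linux i686; en-GB; rv:1.9.1) Gecko/20090624 Ubuntu/9.04 (jaunty) Firefox/3.5' \
  --verbose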
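
Instead of editing the two placeholders by hand, the domain name can be derived from the snapshot URL with plain shell parameter expansion. This is only a sketch: it assumes the usual `http://web.archive.org/web/<timestamp>/<original URL>` layout, and the names `original_url`, `timestamp` and `filter` are illustrative and not part of the original gist. It prints the derived values and does not run httrack itself.

```shell
#!/bin/sh
# Example snapshot URL from the comments above; substitute your own.
wayback_url='http://web.archive.org/web/20020705161639/http://kict.iiu.edu.my/'

# Drop the "http://web.archive.org/web/<timestamp>/" prefix (assumed layout).
original_url=${wayback_url#*/web/*/}

# Keep only the <timestamp> path segment that follows "/web/".
timestamp=${wayback_url#*/web/}
timestamp=${timestamp%%/*}

# Strip the scheme, then everything from the first "/" onwards.
domain_name=${original_url#*://}
domain_name=${domain_name%%/*}

# The include filter that would be passed to httrack.
filter="+*/${domain_name}/*"

printf 'original_url=%s\n' "$original_url"
printf 'timestamp=%s\n' "$timestamp"
printf 'domain_name=%s\n' "$domain_name"
printf 'filter=%s\n' "$filter"
```

With `wayback_url` and `domain_name` set this way, the httrack command above can be run unchanged, provided its filter argument is double-quoted so the shell expands the variable.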
@jabiertxof

Sorry, it does not work for me :(

@danielbair

It downloads all the archive.org snapshots, and I want only one snapshot date.
