# One liner

```shell
wget --recursive --page-requisites --adjust-extension --span-hosts --convert-links --restrict-file-names=windows --domains yoursite.com --no-parent yoursite.com
```

The same command, flag by flag (the inline comments are explanatory only; remove them before running, because Bash does not allow a comment after a line continuation):

```shell
wget \
    --recursive \                    # Download the whole site.
    --page-requisites \              # Get all assets/elements (CSS/JS/images).
    --adjust-extension \             # Save files with .html on the end.
    --span-hosts \                   # Include necessary assets from offsite as well.
    --convert-links \                # Update links to still work in the static version.
    --restrict-file-names=windows \  # Modify filenames to work in Windows as well.
    --domains yoursite.com \         # Do not follow links outside this domain.
    --no-parent \                    # Don't follow links outside the directory you pass in.
    yoursite.com/whatever/path       # The URL to download.
```
This is all just wget; look up how to use wget and you'll find plenty of examples online.
Either way, you need to make sure wget is installed first:
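A quick sketch of installing and verifying wget (the package-manager commands vary by platform):

```shell
# Pick the line that matches your system:
#   sudo apt-get install wget    # Debian/Ubuntu
#   brew install wget            # macOS (Homebrew)
# Then confirm wget is on your PATH:
wget --version
```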
Here are some usage examples to download an entire site:
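For instance (example.org is a placeholder; the short flags -r, -np, -p, -k, and -E are the equivalents of --recursive, --no-parent, --page-requisites, --convert-links, and --adjust-extension):

```shell
# Recursive download of a site, staying below the start URL
wget --recursive --no-parent http://example.org/

# The same, plus page assets and link rewriting for offline browsing
wget -r -np -p -k -E http://example.org/
```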
One more example to download an entire site with wget:
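A command along those lines might look like this (example.org stands in for the real site):

```shell
wget --mirror --convert-links --adjust-extension --page-requisites --no-parent http://example.org
```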
Explanation of the various flags:
--mirror – Makes (among other things) the download recursive.
Alternatively, the command above may be shortened:
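Presumably by using the single-letter equivalents (-m for --mirror, -k for --convert-links, -E for --adjust-extension, -p for --page-requisites, -np for --no-parent), giving something like:

```shell
wget -mkEpnp http://example.org
```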
If you still insist on running this script: it is a Bash script, so first make it executable, then run it. If it still won't run, check that its first line is a bash shebang. Also, you need to edit the script to specify the site you want to download; at this point you are really better off just using wget outright.
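The steps above can be sketched as follows (download-site.sh is a made-up name; substitute the real script, and skip the first command, which only fabricates a stand-in script for the demo):

```shell
# Stand-in script for demonstration only; use the real script instead.
printf '%s\n' '#!/bin/bash' 'echo "downloading..."' > download-site.sh

chmod +x download-site.sh   # 1. mark the script as executable
./download-site.sh          # 2. run it
```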
Hi! You can virtually shoot me if I am wrong, but the no-parent option may have been hit by a typo: when I tried `----no-parent` (four dashes) wget did not recognize it, but after a little surgery I ended up with `--no-parent` and it worked. If I am right, cool; if I am wrong, I am sorry.
As quoted from the docs:

--no-parent – Do not ever ascend to the parent directory when retrieving recursively.
Does anyone know how I'd go about downloading all the GET requests a site makes? My wget mirror only picks up the links present in the site's HTML, not the requests the site fires at runtime (probably from within the JS?). Should I set up some kind of proxy and wget the links extracted from that?