@mikewlange
Last active July 29, 2017 15:45
wget, curl, httrack crap
HTTRACK -- BEST
// BEST
sudo apt install httrack
httrack -w website.com
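Unless told otherwise, httrack writes the mirror and its logs into the current directory. A minimal sketch for keeping each mirror in its own folder follows. `httrack_to_dir` is a hypothetical helper, `./mirror` is an assumed default, and `DRY_RUN` is an illustrative variable that makes the function print the command instead of running it. `-O` (output path) and `-v` (log on screen) are standard httrack options.

```shell
# Hypothetical helper, not part of the gist: mirror a site into its own folder.
# With DRY_RUN unset or 1 the command is only printed; set DRY_RUN=0 to run it.
httrack_to_dir() {
  url="$1"
  out_dir="${2:-./mirror}"   # assumed default output folder
  set -- httrack -w "$url" -O "$out_dir" -v
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

httrack_to_dir website.com ./website-mirror
```

Printing first makes it easy to check the target folder before a long crawl starts.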
WGET
Download a web page with all assets – like stylesheets and inline images – that are required to properly display the web page offline.
wget --page-requisites --span-hosts --convert-links --adjust-extension https://www.couponcabin.com/coupons
Download an entire website, including all linked pages and files. Note that robots=off tells wget to ignore the site's robots.txt.
wget --execute robots=off --recursive --no-parent --continue --no-clobber https://www.couponcabin.com/
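A recursive wget run saves its files under a directory named after the host, unless `-nH` is given. A small sketch for locating the result of the command above: `site_dir` is a hypothetical helper that uses only shell parameter expansion and assumes a plain URL without a port or credentials.

```shell
# Hypothetical helper: print the directory name wget -r uses for a URL by default.
site_dir() {
  host="${1#*://}"     # drop the scheme, e.g. https://
  host="${host%%/*}"   # drop the path after the host
  echo "$host"
}

site_dir https://www.couponcabin.com/
```

Once the download finishes, `ls "$(site_dir https://www.couponcabin.com/)"` lists the mirrored files.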
WGET - Mirror
wget -r -N -l inf -nr -k https://www.couponcabin.com/
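The -r -N -l inf -nr part is exactly what -m (--mirror) expands to, so a shorter form, here with a rate limit and page requisites added, is: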
wget -m --limit-rate=200k -p www.objgen.com
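The mirror options above can be bundled into one reusable function. This is a sketch under assumptions, not the gist author's method: `polite_mirror` is a hypothetical name, the `--wait=1` pause between requests is an addition of this example, and `DRY_RUN` is an illustrative variable that prints the command instead of running it. The rate limit defaults to the 200k used above.

```shell
# Hypothetical wrapper: mirror one site with a rate limit and a pause between requests.
# With DRY_RUN unset or 1 the command is only printed; set DRY_RUN=0 to run it.
polite_mirror() {
  url="$1"
  rate="${2:-200k}"   # same limit as in the gist; pass a second argument to change it
  set -- wget --mirror --page-requisites --convert-links --adjust-extension \
         --no-parent --wait=1 --limit-rate="$rate" "$url"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

polite_mirror www.objgen.com
```

Building the argument list with `set --` and running it as `"$@"` keeps the URL as a single argument even if it contains characters the shell would otherwise split on.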