Wget usage examples
https://www.rationallyparanoid.com/articles/wget.html
Wget is a command-line tool for retrieving files using HTTP, HTTPS or FTP.
Retrieve /images/pic01.jpg from host example.com:
wget http://example.com/images/pic01.jpg
Retrieve /images/pic01.jpg from host example.com, sending the user agent string of Internet Explorer 7 on Windows XP SP2 instead of the default "Wget/1.xx":
wget --user-agent="Mozilla/4.0 (Windows; MSIE 7.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)" http://example.com/images/pic01.jpg
Retrieve /images/pic01.jpg from host example.com, specifying "http://anothersite.example.com/search?hl=en&q=pictures" as the referer:
wget --referer="http://anothersite.example.com/search?hl=en&q=pictures" http://example.com/images/pic01.jpg
Resume a partially completed download of /images/pic01.jpg from host example.com:
wget -c http://example.com/images/pic01.jpg
Mirror host example.com. By default, recursive retrieval stays on the starting host, so links to external sites are not followed:
wget -m http://example.com/
Mirror host example.com, but only retrieve files with extensions .jpg and .gif:
wget -m --accept=jpg,gif http://example.com/
Only retrieve files with extensions .jpg and .gif from example.com. Do not create any directories:
wget -m --accept=jpg,gif -nd http://example.com/
Retrieve all files except those with extension .html from example.com. Do not create any directories. Wget still downloads HTML pages to extract their links, then deletes them:
wget -m --reject=html -nd http://example.com/
Download all files with extension .pdf from FTP site ftp.example.com:
wget -m --accept=pdf -nd ftp://ftp.example.com
Mirror host example.com. Wait exactly 5 seconds between retrievals:
wget -m --wait=5 http://example.com/
Mirror host example.com. Vary the wait between retrievals at random from 0.5 to 1.5 times the wait value (i.e. with a wait value of 5 seconds, the wait period varies between 2.5 and 7.5 seconds; older Wget releases documented a range of 0 to 2 times):
wget -m --wait=5 --random-wait http://example.com/
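The range can be worked out for any wait value. This is a minimal sketch, assuming the 0.5 and 1.5 factors given in the current GNU Wget manual for --random-wait. The name wait_range is a hypothetical helper, not a wget feature:

```shell
# wait_range is a hypothetical helper, not part of wget.
# $1: the value passed to --wait, in seconds.
# Assumes the 0.5x and 1.5x factors from the current GNU Wget manual.
wait_range() {
    LC_ALL=C awk -v w="$1" 'BEGIN { printf "%.1f to %.1f seconds\n", 0.5 * w, 1.5 * w }'
}

wait_range 5
```

If your wget version documents different factors, adjust the two multipliers accordingly.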
Mirror host example.com, ignoring the rules in the robots.txt file (use this with caution), and limit the download rate to 50 KB/s:
wget -m -e robots=off --limit-rate=50k http://example.com/
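The wait, random-wait, and rate-limit options shown so far can be combined into one gentler mirror command that leaves robots.txt handling at its default. The wrapper below is only a sketch: polite_mirror and the DRY_RUN variable are hypothetical names introduced here, and the option values are the ones used in the examples above.

```shell
# polite_mirror is a hypothetical helper name, not part of wget.
# $1: URL of the host to mirror.
# Set DRY_RUN to any non-empty value to print the command instead of running it.
polite_mirror() {
    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is set and non-empty.
    ${DRY_RUN:+echo} wget -m --wait=5 --random-wait --limit-rate=50k "$1"
}

# Print the command that would be run, without contacting the host.
DRY_RUN=1 polite_mirror http://example.com/
```

Calling polite_mirror without DRY_RUN set runs the wget command itself.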
Mirror the hosts specified in the file input_file (enter one URL per line):
wget -m -i input_file
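One way to prepare such a file is with a here-document. This is a sketch only: the two URLs are placeholders, and the count at the end is just a quick check that the file holds one URL per non-empty line.

```shell
# Write one URL per line to input_file (both URLs are placeholders).
cat > input_file <<'EOF'
http://example.com/
http://example.org/
EOF

# Count the non-empty lines, i.e. the URLs wget would read.
url_count=$(grep -c . input_file)
echo "input_file lists $url_count URLs"
```

Passing the file to wget -m -i input_file then mirrors each listed host in turn.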
Mirror the hosts listed in input_file as before, but run wget in the background after startup. Output is written to wget-log unless a log file is given with -o:
wget -b -m -i input_file