@kumekay
Forked from anonymous/download.sh
Last active January 13, 2016 08:48
Wget download data in background with cookies

Download large files in background with cookies for auth using wget

Sometimes you need to download large files to a remote server (for example, datasets for Kaggle competitions), but the download is available only to authenticated users. You can run the download as a background task, using cookies from your browser for authentication. This note mostly follows the wget manual: https://www.gnu.org/software/wget/manual/wget.html

Cookies

First, create a cookies.txt file containing a copy of all your cookies for the required site. If you use Chrome, this extension is useful: https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg
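Before starting a long download, it can help to check that the exported file really contains cookies for the site. The sketch below assumes the file is in the Netscape cookies.txt format that `--load-cookies` reads (seven tab-separated fields per line, domain first). The function name `count_cookies` is hypothetical, and lines starting with `#` are simply treated as comments here.

```shell
# Hypothetical helper: count cookie entries in a Netscape-format cookies.txt
# whose domain is the given domain or one of its subdomains.
count_cookies() {
    file=$1
    domain=$2
    awk -F '\t' -v d="$domain" '
        /^#/ { next }                      # skip comment lines
        NF == 7 {                          # a cookie line has exactly 7 fields
            host = $1
            sub(/^\./, "", host)           # ".example.com" -> "example.com"
            suffix = substr(host, length(host) - length(d))
            if (host == d || suffix == ("." d)) n++
        }
        END { print n + 0 }                # print 0 when nothing matched
    ' "$file"
}
```

For example, `count_cookies ./cookies.txt example.com` prints the number of matching entries. A result of 0 suggests the export was made for the wrong site or in a different format.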

Example download command:

wget --load-cookies ./cookies.txt -x -nH -bqc --cut-dirs=3 https://www.example.com/very-large.zip

Used wget options:

  • -x Force creation of directories. For example, wget -x http://example.com/some/files.txt saves the downloaded file to example.com/some/files.txt.
  • -nH Disable generation of host-prefixed directories. By default, invoking wget with -r http://example.com/ creates a directory structure beginning with example.com. This option disables that behavior.
  • --cut-dirs=NUMBER Ignore NUMBER remote directory components
  • -b Go to background after startup
  • -q Quiet
  • -c Resume getting a partially-downloaded file
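To see how -x, -nH and --cut-dirs combine, the following sketch mimics the mapping from a URL to a local path. It is only an approximation for illustration, not wget's own logic: the function name `local_path` and the URL with directories are hypothetical, and query strings, special characters and URLs without a path are not handled.

```shell
# Hypothetical helper: approximate the local path that
# `wget -x -nH --cut-dirs=N URL` would use for a simple URL.
local_path() {
    url=$1
    cut=$2
    path=${url#*://}     # drop the scheme, e.g. "https://"
    path=${path#*/}      # drop the host part (effect of -nH)
    # Drop up to $cut leading directory components (effect of --cut-dirs).
    while [ "$cut" -gt 0 ]; do
        case $path in
            */*) path=${path#*/} ;;
            *)   break ;;            # only the file name is left
        esac
        cut=$((cut - 1))
    done
    printf '%s\n' "$path"
}

local_path https://www.example.com/a/b/c/file.zip 3   # prints file.zip
local_path https://www.example.com/a/b/c/file.zip 2   # prints c/file.zip
local_path https://www.example.com/very-large.zip 3   # prints very-large.zip
```

When the URL has fewer directory components than the --cut-dirs value, as in the last call, only the file name remains.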

If you need just one file, the command is shorter:

wget --load-cookies ./cookies.txt -bqc https://www.example.com/very-large.zip
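Because -b and -q together leave nothing on the terminal, one rough way to check on the download is to watch whether the output file is still growing. This is a minimal sketch: the function name `is_growing` and the default 5-second interval are assumptions, and a stalled download looks the same as a finished one.

```shell
# Hypothetical helper: succeed if the file grows during the given interval.
is_growing() {
    file=$1
    interval=${2:-5}                       # seconds to wait between checks
    before=$(($(wc -c < "$file")))         # size in bytes before waiting
    sleep "$interval"
    after=$(($(wc -c < "$file")))          # size in bytes after waiting
    [ "$after" -gt "$before" ]
}
```

Usage: `is_growing very-large.zip && echo "still downloading"`. If the file has stopped growing before reaching its expected size, running the same wget command again with -c resumes the partial download.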