@dbaynard
Created May 31, 2018 10:59
Downloading purchases (PDF books) from Humble Bundle

Super hacky, original method

Requires

  • jq
  • pandoc
  • Browser with dev tools (much easier than navigating source)
  • aria2c for batch downloading
  • text editor with find and replace
  1. Go to purchase page.

  2. In the browser dev tools, select the HTML element containing the download links (usually a div with class whitebox-redux).

  3. Copy to new file (e.g. html/humble.html).

  4. Repeat steps 1–3 for every purchase page, appending to the same file in step 3.

  5. Convert html file to pandoc json.

    pandoc -r html -w json -o html/humble.json html/humble.html 
    
  6. Filter the json with jq (quicker than writing a pandoc filter for this purpose).

    This is hacky but glorious.

    Either redirect the output to a file (with > or tee) or copy it from the terminal.

    jq '..?|select(.t=="Link")?|..?|select(contains(".pdf"))?' html/humble.json
    
  7. Strip the double quote marks flanking each link.

  8. Replace the html-encoded entities with their actual values.

    I’ve just used a regex replace in vim, here for &amp;:

    :%s/\v\&amp;/\&/g
    
  9. Run aria2c from the directory where the files should end up.

    aria2c -i file_with_list
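Steps 7 and 8 can also be done in the shell instead of a text editor. The following is only a minimal sketch: the function name `clean_links` is made up for illustration and is not part of the original method. It assumes that each line of the jq output is one double-quoted URL and that `&amp;` is the only entity that needs decoding.

```shell
#!/bin/sh
# clean_links: hypothetical helper, not part of the original gist.
# Reads the jq output on stdin and writes cleaned URLs to stdout.
clean_links() {
    # 1st and 2nd expressions: drop the leading and trailing double quote (step 7).
    # 3rd expression: turn the entity &amp; back into a literal & (step 8);
    # in a sed replacement a bare & means "the whole match", so it is escaped.
    sed -e 's/^"//' -e 's/"$//' -e 's/&amp;/\&/g'
}

# Possible usage, assuming the file names used in the steps above:
#   jq '..?|select(.t=="Link")?|..?|select(contains(".pdf"))?' html/humble.json \
#     | clean_links > file_with_list
```

Running jq with `-r` (raw output) should make the quote stripping unnecessary, but the entity replacement would still be needed. Other entities, if any turn up in the links, would each need their own expression.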
    