Requires

- jq
- pandoc
- Browser with dev tools (much easier than navigating source)
- aria2c for batch downloading
- text editor with find and replace

1. Go to the purchase page.
2. Select the html containing the download links (usually `div .whitebox-redux`).
3. Copy it to a new file (e.g. `html/humble.html`).
4. Repeat 1–3 for all pages, appending to the same file in step 3.
5. Convert the html file to pandoc json: `pandoc -r html -w json -o html/humble.json html/humble.html`
6. Filter the json with jq (quicker than writing a pandoc filter for this purpose). This is hacky but glorious. Either direct the output to a file (with `>` or `tee`) or copy it from the terminal: `jq '..?|select(.t=="Link")?|..?|select(contains(".pdf"))?' html/humble.json`
7. Strip the flanking quote marks.
8. Replace the html-encoded entities with their actual values. I’ve just used regex replace in vim: `:%s/\v\&amp;/\&/g`
9. Run aria2 from the directory where the files should end up: `aria2c -i file_with_list`
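
The quote-stripping and entity-replacement steps can also be done in the shell instead of a text editor. This is only a rough sketch: it assumes the jq output has one double-quoted URL per line and that `&amp;` is the only html entity in the links. The function name `clean_links` is made up for illustration.

```shell
#!/bin/sh
# clean_links: hypothetical helper, not part of jq, pandoc or aria2.
# Reads jq's output on stdin and writes bare URLs to stdout.
clean_links() {
  # 1st/2nd expression: drop one leading and one trailing double quote.
  # 3rd expression: turn the entity &amp; back into a literal &
  #   (in a sed replacement a bare & means "the match", so it is escaped).
  sed -e 's/^"//' -e 's/"$//' -e 's/&amp;/\&/g'
}
```

Assuming the function is defined in the current shell, something like `jq '..?|select(.t=="Link")?|..?|select(contains(".pdf"))?' html/humble.json | clean_links > file_with_list` should leave a list that `aria2c -i file_with_list` can read. Passing `-r` to jq makes it print raw strings without the surrounding quotes, in which case only the entity replacement is still needed.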