@EtherTyper
Created June 3, 2022 09:31
Mises Scraper Script
#!/usr/bin/env bash
URL='https://mises.org/library/austrian'
# Read the total article count from the first "of N" string on the listing page.
ARTICLES=$(curl -s "$URL" | grep -oE 'of [0-9]+' | head -n 1 | sed 's/of //')
PAGES=$((ARTICLES/10))
rm -f files.txt
for ((i = 0; i <= PAGES; i++))
do
  # Collect every PDF link on this listing page.
  curl -s "$URL?page=$i" | grep -oE 'href="[^"]*\.pdf' | sed 's/^href="//' >> files.txt
done
# De-duplicate the collected links, then download each one (-nc skips files that already exist).
sort -u files.txt | while read -r pdf
do
  wget -nc "$pdf"
done
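
The script derives the page range by dividing the article count by 10. The sketch below isolates that arithmetic, assuming the listing is zero-indexed and shows 10 articles per page, as the script implies. The `last_page` helper is hypothetical and not part of the original script. It rounds so that an exact multiple of the page size does not request an extra, empty page.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not in the original script): print the index of the
# last zero-based page for a given item count and page size.
last_page() {
  local total=$1 per_page=$2
  if (( total <= 0 )); then
    # No items: only page 0 is worth requesting.
    echo 0
  else
    # Subtracting 1 first keeps exact multiples on the previous page.
    echo $(( (total - 1) / per_page ))
  fi
}
```

For example, `last_page 25 10` prints 2 (pages 0, 1, 2), and `last_page 30 10` also prints 2, whereas plain `30/10` would yield 3 and fetch one empty page.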