@traek
Last active May 20, 2024 14:26
Simple script to download files from single web page
#!/usr/bin/env bash
# Download directory: first argument, or the current directory if none is given
destination=${1:-.}
# Check for required commands
required=(awk grep lynx wget); missing=()
for cmd in "${required[@]}"; do
    hash "$cmd" 2>/dev/null || missing+=("$cmd")
done
if (( ${#missing[@]} > 0 )); then
    echo "[FATAL] could not find command(s): ${missing[*]}. Exiting!" >&2
    exit 1
fi
# Edit these two variables as needed (example usage for 'The MagPi' magazine issues)
sourcepath="https://www.raspberrypi.org/magpi-issues"
# Escape the dot so it matches a literal '.', and anchor the end of the name
pattern='^MagPi[0-9]*\.pdf$'
echo -n "[GET] Reading source links... "
# Read one filename per line into the array (mapfile requires bash 4 or newer)
mapfile -t file < <(lynx -listonly -dump "$sourcepath" | awk -F'/' '{print $NF}' | grep -- "$pattern")
echo "DONE"
for dl in "${file[@]}"; do
    if [[ ! -f "$destination/$dl" ]]; then
        echo "[MISSING] Downloading to $destination/$dl"
        wget -q -P "$destination" --show-progress "$sourcepath/$dl"
    else
        echo "[FOUND] Skipping $destination/$dl"
    fi
done
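
When adapting the two variables to another page, it can help to try the filtering stage on its own before downloading anything. The following is a minimal sketch, not part of the gist: `filter_links` is a hypothetical helper wrapping the same `awk` and `grep` stages, and the sample URLs under `example.com` are made up to imitate `lynx -listonly -dump` output. The pattern used here escapes the dot, so a name such as `MagPi03xpdf` is rejected.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the original script): reduce URLs read from
# stdin to their last path component and keep only names matching a regex.
filter_links() {
    local pattern=$1
    awk -F'/' '{print $NF}' | grep -- "$pattern"
}

# Made-up sample input imitating the numbered list that lynx prints.
sample='   1. https://example.com/issues/MagPi01.pdf
   2. https://example.com/issues/MagPi02.pdf
   3. https://example.com/issues/cover.png
   4. https://example.com/issues/MagPi03xpdf'

# Prints only MagPi01.pdf and MagPi02.pdf for the sample above.
printf '%s\n' "$sample" | filter_links '^MagPi[0-9]*\.pdf$'
```

Once the output lists exactly the files wanted, the same pattern can be assigned to `pattern` in the script, with `sourcepath` set to the page being read.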
traek commented Jul 15, 2022

This method no longer works for The MagPi (it was broken quite some time ago) but is still useful for similar pages. I made another attempt that specifically downloads any available Raspberry Pi Press publications here: https://github.com/traek/pi-tools/blob/main/raspipress.py

jclack2 commented Sep 26, 2023

This is cool - thank you!
