@frafra
Created February 26, 2020 13:02
Extract a table from Wikipedia (example)
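#!/bin/bash -x
# Download one section of a Wikipedia article and convert its table to an ODS spreadsheet.
# Requires curl, jq, sed and LibreOffice.
set -eo pipefail
title="List_of_most-viewed_YouTube_channels"
host="en.wikipedia.org"
tmp="$title.$(date +'%Y%m%d').html"
# Delete the intermediate HTML file when the script exits.
cleanup() { rm -f "$tmp"; }
trap cleanup EXIT
# Fetch the page, take the HTML of the first remaining section,
# make root-relative ("/...") and protocol-relative ("//...") links absolute,
# and drop the first two lines of that HTML.
curl -fL "https://$host/api/rest_v1/page/mobile-sections/$title" |
jq -r '.remaining.sections[0].text' |
sed -e "s;\"/\([^/]\);\"https://$host/\1;g" -e 's;"//;"https://;g' |
tail -n +3 > "$tmp"
# LibreOffice writes the .ods file to the current directory.
libreoffice --headless --calc --convert-to ods "$tmp"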
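
The link-rewriting step can be tried on its own, without calling the API. The following is a minimal sketch, not part of the original gist: the function name `absolutize_links` and the sample HTML are hypothetical, and it assumes that attribute values are double-quoted and that URLs begin either with `/` (root-relative) or `//` (protocol-relative).

```shell
#!/bin/bash
# Hypothetical helper: turn root-relative and protocol-relative URLs
# inside double-quoted attributes into absolute https:// URLs.
host="en.wikipedia.org"

absolutize_links() {
  # 1st expression: "/x...  ->  "https://$host/x...  (x is any character except "/")
  # 2nd expression: "//...  ->  "https://...
  sed -e "s;\"/\([^/]\);\"https://$host/\1;g" \
      -e 's;"//;"https://;g'
}

# Made-up sample input with one link of each kind.
sample='<a href="/wiki/YouTube"><img src="//upload.wikimedia.org/x.png"></a>'
printf '%s\n' "$sample" | absolutize_links
```

Handling the two URL forms in separate expressions keeps a protocol-relative URL from being prefixed with the host name as well.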