@realeroberto
Created March 2, 2021 06:29
Download summaries of all Guardian articles for a given keyword
#!/bin/sh
# Download summaries of all Guardian articles for a given keyword
TAG=${1:-covid}
APIKEY=test
APIROOT=https://content.guardianapis.com
# LOOP 1: get all tags containing $TAG
for tag in $(
    curl -s "$APIROOT/tags?q=$TAG&api-key=$APIKEY" \
        | jq -r '.response.results[].id'
)
do
    # LOOP 2: get the articles for each tag
    # (the API paginates its results; only the first page is fetched here)
    for url in $(
        curl -s "$APIROOT/search?tag=$tag&api-key=$APIKEY" \
            | jq -r '.response.results[].apiUrl'
    )
    do
        # LOOP 3: download each article's summary
        # (jq filters are quoted so the shell never glob-expands the [])
        curl -s "$url?show-blocks=body&api-key=$APIKEY" \
            | jq -r '.response.content.blocks.body[].bodyTextSummary'
    done
done
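The script above makes a single search request per tag, so it only sees the first page of results. A minimal sketch of walking every page might look like the following. It assumes the search endpoint accepts a `page` parameter and that the response reports the total page count in `.response.pages`; the helper names `page_url` and `all_article_urls` are hypothetical and not part of the original script.

```shell
# Sketch only: assumes a `page` query parameter and a `.response.pages` field.
APIKEY=${APIKEY:-test}
APIROOT=${APIROOT:-https://content.guardianapis.com}

# page_url TAG PAGE: build the search URL for one page of results.
page_url() {
    printf '%s/search?tag=%s&page=%s&api-key=%s\n' \
        "$APIROOT" "$1" "$2" "$APIKEY"
}

# all_article_urls TAG: print the apiUrl of every article across all pages.
all_article_urls() {
    page=1
    pages=1
    while [ "$page" -le "$pages" ]; do
        body=$(curl -s "$(page_url "$1" "$page")")
        # Refresh the page count from the response; fall back to 1 if missing.
        pages=$(printf '%s' "$body" | jq -r '.response.pages // 1')
        pages=${pages:-1}
        # The trailing ? keeps jq quiet if the response carries no results.
        printf '%s' "$body" | jq -r '.response.results[]?.apiUrl'
        page=$((page + 1))
    done
}
```

With these helpers, LOOP 2 could iterate over `$(all_article_urls "$tag")` in place of the single `curl` call. The `test` key may be rate-limited, so fetching many pages could need a registered key.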