@edsu
Created March 20, 2020 16:03
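#!/usr/bin/env python3
# Export the seeds of an Archive-It collection to a CSV file.
import csv

import requests

url = 'https://partner.archive-it.org/api/seed'
params = {
    "collection": 13529,
    "limit": 100,
    "offset": 0
}

# newline='' keeps the csv module from emitting blank rows on Windows;
# the with block makes sure the file is flushed and closed.
with open('archiveit-covid19.csv', 'w', newline='') as fh:
    out = csv.writer(fh)
    out.writerow(['Url', 'Title', 'Last Update'])

    # Page through the seed list until the API returns an empty page.
    while True:
        resp = requests.get(url, params=params)
        resp.raise_for_status()  # stop on HTTP errors instead of parsing an error body
        seeds = resp.json()
        if len(seeds) == 0:
            break

        for seed in seeds:
            # Not every seed has a Title in its metadata.
            if "Title" in seed["metadata"]:
                title = seed["metadata"]["Title"][0]["value"]
            else:
                title = ""
            out.writerow([
                seed['url'],
                title,
                seed["last_updated_date"]
            ])

        # Advance to the next page.
        params['offset'] += params['limit']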
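
The row-building step in the loop above can be pulled out into a small function and checked without touching the network. This is only a sketch: `seed_row` is a hypothetical helper name, and the `sample` record is made up to match the fields the script reads (`url`, `metadata.Title[0].value`, `last_updated_date`), not taken from the real API response.

```python
def seed_row(seed):
    """Return [url, title, last_updated_date] for one seed record.

    Assumes the record shape the script above relies on; the title
    falls back to an empty string when no Title metadata is present.
    """
    titles = seed.get("metadata", {}).get("Title", [])
    title = titles[0]["value"] if titles else ""
    return [seed["url"], title, seed["last_updated_date"]]


# Made-up records for illustration only (not real Archive-It data).
sample = {
    "url": "https://example.org/",
    "metadata": {"Title": [{"value": "Example site"}]},
    "last_updated_date": "2020-03-20T16:03:00Z",
}
untitled = {
    "url": "https://example.com/",
    "metadata": {},
    "last_updated_date": "2020-03-19T00:00:00Z",
}

print(seed_row(sample))
print(seed_row(untitled))
```

With a helper like this, the loop body would reduce to `out.writerow(seed_row(seed))`.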