@stitchinthyme
Last active August 29, 2015 14:19
Script to query OpenDOAR for English-language science repositories that have an OAI URL and allow free access to their metadata; parses the XML response into JSON.
import urllib2
import xml.etree.ElementTree as etree
import json
# Query for sites in English that have science data, an OAI URL, and allow free access to metadata.
resp = urllib2.urlopen('http://opendoar.org/api13.php?la=en&oai=y&pograde=11,12&subject=Ca,Ce,Cur,Cuv')
data = resp.read()
tree = etree.fromstring(data)
l = []
for child in tree.iter('repository'):
    d = {}
    for item in child:
        if item.tag in ['rUrl', 'rOaiBaseUrl', 'rName']:
            d[item.tag] = item.text
    l.append(d)
# Prints a list of all returned repositories' names, URLs, OAI URLs in JSON format (list of dicts)
print json.dumps(l)
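
The script above targets Python 2 (urllib2, print statement). As a rough sketch only, the same two steps (building the query URL and filtering each repository's child tags) might look like this in Python 3. The helper names build_query_url and extract_repositories, and the SAMPLE_XML document, are hypothetical and not part of the original gist; the sample only mimics the tag names the script reads, so the real OpenDOAR response may be shaped differently. No network request is made here.

```python
import json
import xml.etree.ElementTree as etree
from urllib.parse import urlencode

# Endpoint and tag names are copied from the script above.
API_URL = 'http://opendoar.org/api13.php'
WANTED_TAGS = ('rName', 'rUrl', 'rOaiBaseUrl')


def build_query_url(params):
    """Hypothetical helper: append query parameters to API_URL.

    Commas are kept literal so list-valued parameters such as
    'pograde=11,12' look the same as in the original URL.
    """
    return API_URL + '?' + urlencode(params, safe=',')


def extract_repositories(xml_text):
    """Hypothetical helper: one dict per <repository>, keeping only WANTED_TAGS."""
    tree = etree.fromstring(xml_text)
    repos = []
    for repo in tree.iter('repository'):
        repos.append({item.tag: item.text
                      for item in repo
                      if item.tag in WANTED_TAGS})
    return repos


# Made-up sample document (an assumption, not a real OpenDOAR response).
SAMPLE_XML = """<OpenDOAR>
  <repositories>
    <repository>
      <rName>Example Archive</rName>
      <rUrl>http://example.org/</rUrl>
      <rOaiBaseUrl>http://example.org/oai</rOaiBaseUrl>
      <rDescription>Dropped by the tag filter</rDescription>
    </repository>
  </repositories>
</OpenDOAR>"""

query = {'la': 'en', 'oai': 'y', 'pograde': '11,12', 'subject': 'Ca,Ce,Cur,Cuv'}
print(build_query_url(query))
print(json.dumps(extract_repositories(SAMPLE_XML)))
```

To run this against the live service, the response body from urllib.request.urlopen(build_query_url(query)).read() could be passed to extract_repositories in place of SAMPLE_XML, assuming the endpoint still responds in the format the original script expected.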