Last active August 29, 2015 14:19
Gist stitchinthyme/dfeac2c8579bbd2d2fb0
Script to query OpenDOAR for English-language science repositories that have an OAI URL and allow free access to their metadata; it parses the XML response and prints the results as JSON.
import urllib2
import xml.etree.ElementTree as etree
import json

# Query for sites in English that have science data, an OAI URL, and allow free access to metadata.
resp = urllib2.urlopen('http://opendoar.org/api13.php?la=en&oai=y&pograde=11,12&subject=Ca,Ce,Cur,Cuv')
data = resp.read()
tree = etree.fromstring(data)

l = []
for child in tree.iter('repository'):
    d = {}
    for item in child:
        if item.tag in ['rUrl', 'rOaiBaseUrl', 'rName']:
            d[item.tag] = item.text
    l.append(d)

# Prints a list of all returned repositories' names, URLs, and OAI URLs in JSON format (list of dicts)
print json.dumps(l)
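The script above targets Python 2 (`urllib2`, the `print` statement). What follows is a minimal Python 3 sketch of the same steps, not the author's code. The function names (`build_query_url`, `parse_repositories`, `fetch_repositories`) and the `SAMPLE_XML` document are hypothetical and introduced only for illustration. The sample's wrapper elements are an assumption about the response layout; only the `repository`, `rName`, `rUrl`, and `rOaiBaseUrl` tags come from the script. The endpoint and query parameters are copied from the script, and whether that endpoint still responds has not been checked here.

```python
import json
import xml.etree.ElementTree as etree
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint and tag names are copied from the original script.
API_URL = 'http://opendoar.org/api13.php'
WANTED_TAGS = ('rUrl', 'rOaiBaseUrl', 'rName')


def build_query_url(base=API_URL):
    """Build the query URL from the parameters used in the original script."""
    params = {
        'la': 'en',
        'oai': 'y',
        'pograde': '11,12',
        'subject': 'Ca,Ce,Cur,Cuv',
    }
    # safe=',' keeps the comma-separated lists readable instead of %2C.
    return base + '?' + urlencode(params, safe=',')


def parse_repositories(xml_bytes):
    """Return one dict per <repository> element, keeping only WANTED_TAGS."""
    tree = etree.fromstring(xml_bytes)
    repos = []
    for repo in tree.iter('repository'):
        repos.append({item.tag: item.text
                      for item in repo if item.tag in WANTED_TAGS})
    return repos


def fetch_repositories(url=None):
    """Download and parse the response (requires network access)."""
    with urlopen(url or build_query_url()) as resp:
        return parse_repositories(resp.read())


# Hypothetical sample; the wrapper elements and values are made up.
SAMPLE_XML = b"""<OpenDOAR>
  <repositories>
    <repository>
      <rName>Example Repository</rName>
      <rUrl>http://repo.example.org/</rUrl>
      <rOaiBaseUrl>http://repo.example.org/oai</rOaiBaseUrl>
      <rDescription>Ignored by the parser</rDescription>
    </repository>
  </repositories>
</OpenDOAR>"""

if __name__ == '__main__':
    # Parse the offline sample; swap in fetch_repositories() for a live query.
    print(json.dumps(parse_repositories(SAMPLE_XML)))
```

Keeping the parsing in its own function means it can be exercised against a saved response without touching the network.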