maría a. matienzo anarchivist

{
"@context": {
"@vocab": "http://purl.org/dc/elements/1.1/",
"dcterms": "http://purl.org/dc/terms/",
"dpla": "http://dp.la/terms/",
"edm": "http://www.europeana.eu/schemas/edm/",
"geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
"lcsh": "http://id.loc.gov/authorities/subjects/",
"ore": "http://www.openarchives.org/ore/terms/",
"skos": "http://www.w3.org/2004/02/skos/core#",
{
"@id": "http://dp.la/api/items/ce6ceb0e0273486ce791bf9bc36389eb",
"@context": "http://rawgithub.com/anarchivist/8718510/raw/dpla-mapv3-context.json",
"@type": "oreAggregation",
"aggregatedCHO": "#sourceResource",
"score": 13.814219,
"provider": "Mountain West Digital Library",
"dataProvider": "Utah State Historical Society",
"isShownAt": "http://thoth.library.utah.edu:1701/primo_library/libweb/action/dlDisplay.do?vid=MWDL&afterPDS=true&docId=digcoll_uuu_11USHS_Olyleg/51",
"object": "http://content.lib.utah.edu/utils/getthumbnail/collection/USHS_Olyleg/id/51",
# before logging in in the browser
halfsour:~ mark$ curl -I http://dev.dp.la/exhibitions/exhibits/show/my-kittens-exhibition
HTTP/1.1 404 Not Found
Server: nginx/1.1.19
Date: Thu, 27 Feb 2014 21:50:17 GMT
Content-Type: text/html; charset=utf-8
Connection: keep-alive
X-Powered-By: PHP/5.3.10-1ubuntu3.9
Set-Cookie: b074a3e09d770cb04ecc862508f729f0=3cst5il0nifde5jb62r7pki6e1; path=/; HttpOnly
@anarchivist
anarchivist / mapv3.schema.json
Last active August 29, 2015 13:57
Draft JSON Schema for DPLA MAPv3
{
"type": "object",
"$schema": "http://json-schema.org/draft-04/schema#",
"version": 3,
"title": "DPLA Metadata Application Profile v3 JSON Schema",
"description": "Experimental JSON Schema for use in validating records against the DPLA MAPv3",
"id": "http://dp.la/mapv3.schema",
"required": [
"@context",
"@id",
{
"id": "http://api.dp.la/schemas/mapv3#",
"type": "object",
"$schema": "http://json-schema.org/draft-04/schema#",
"title": "DPLA Metadata Application Profile v3 JSON Schema",
"oneOf": [
{
"$ref": "#/definitions/item"
},
{
anarchivist / gist:9496151
Last active August 29, 2015 13:57
DPLA RDF Application Profile - draft use case

Overview

DPLA maintains an access portal to digitized cultural heritage objects held by libraries, archives, museums, and historical societies throughout the United States, and provides bulk and programmatic access to this data. The DPLA Metadata Application Profile version 3 (MAPv3) builds on the Europeana Data Model. As such, our use case is somewhat similar to the EDM and the [http://wiki.dublincore.org/index.php/DDB-EDM DDB-EDM] use cases.

We harvest data using several methods (file transfer, OAI-PMH, site-specific APIs, etc.) and process it in several formats (MODS, MARCXML, qualified and unqualified DC, and site-specific serializations). DPLA augments and normalizes data received from partners (content hubs and service hubs) through an enrichment pipeline that is part of our ingestion process. While MAPv3 builds on EDM, we currently use JSON-LD as our sol

anarchivist / gist:10349071
Last active August 29, 2015 13:58
Postprocessing QA reports from CouchDB using jq
# This works much better, if you've got the dumped reports already
# use three tabs as separator in case there are any tabs in the data...
$ jq -r '.rows[] | "\"\(.key[1])\"\t\t\t\(.value)"' preprocessed-reports/subject_count.json | \
sort | awk -F"\t\t\t" '{a[$1]+=$2;}END{for(i in a)print i"\t"a[i];}' | \
sort --field-separator=$'\t' --key=2 -gr |awk -F"\t" '{print $1","$2;}' > subject_count.csv
# NOTE: this is only really OK for small reports
$ curl "http://repo-proxy:5984/dpla/_design/qa_reports/_view/type_count?group=true" | jq -r '
"term,count",
(
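For readers without jq and awk at hand, the first pipeline in the preview above (sum `.value` per `.key[1]`, sort by count descending, write CSV) can be approximated in Python. This is a minimal sketch, not the gist author's method: the function names and the sample rows are hypothetical, and it assumes that `.key` is an array whose second element is the term and that `.value` is an integer count.

```python
import csv
import io
import json
from collections import defaultdict


def aggregate_counts(view):
    """Sum the value of every row that shares the same key[1] term."""
    totals = defaultdict(int)
    for row in view["rows"]:
        totals[row["key"][1]] += row["value"]
    # Highest count first; ties are broken by term so the order is stable.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


def to_csv(pairs):
    """Render (term, count) pairs as CSV with quoted terms and bare counts."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    writer.writerows(pairs)
    return buf.getvalue()


# Hypothetical sample shaped like a grouped CouchDB view response;
# the first key element ("hub-a", "hub-b") is an invented placeholder.
sample = json.loads("""
{"rows": [
  {"key": ["hub-a", "Maps"], "value": 3},
  {"key": ["hub-b", "Maps"], "value": 4},
  {"key": ["hub-a", "Letters, personal"], "value": 2}
]}
""")

print(to_csv(aggregate_counts(sample)), end="")
```

Unlike the shell version, the `csv` module escapes quote characters embedded in a term, so no unusual field separator is needed to protect the data.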
anarchivist / gist:85b79293cdf4f1d5bb7f
Created May 10, 2014 21:06
Streamtools demo to harvest and parse data from an OAI-PMH endpoint
{
"Connections": [
{
"ToRoute": "in",
"ToId": "17",
"FromId": "18",
"Id": "19"
},
{
"ToRoute": "in",
anarchivist / gist:9f4188940dd2e3a01fc2
Last active August 29, 2015 14:01
Streamtools demo to begin pulling CouchDB changes
{
"Connections": [
{
"ToRoute": "in",
"ToId": "2",
"FromId": "1",
"Id": "3"
},
{
"ToRoute": "in",
anarchivist / dpla-mapv3-collection.jsonld
Last active August 29, 2015 14:01
Updated DPLA @context mapping
{
"@context": {
"dc": "http://purl.org/dc/elements/1.1/",
"dcmitype": "http://purl.org/dc/dcmitype/",
"dcterms": "http://purl.org/dc/terms/",
"@vocab": "http://purl.org/dc/elements/1.1/",
"admin": null,
"id": null,
"_id": null,
"ingestionSequence": null,