Download Artworks.json and Artists.json from https://github.com/MuseumofModernArt/collection.
First, pre-process Artworks.json with jq:
- Convert Artworks.json to newline-delimited JSON records:
jq -c '.[]' Artworks.json > Artworks.ndjson
- Add a MediachainWKI field to each record, based on the ObjectID field. The original ObjectID field is unchanged:
jq -c '.MediachainWKI = (.ObjectID | tostring | "moma:artwork:" + .)'
- Add an ArtistMediachainWKIs field that maps ConstituentID entries to Mediachain WKIs:
jq -c '.ArtistMediachainWKIs = (.ConstituentID | map(tostring | "moma:artists:" + .))'
Those three steps can be combined into a single command, to get the whole file ready for ingestion:
jq -c '.[] | .MediachainWKI = (.ObjectID | tostring | ("moma:artwork:" + .)) | .ArtistMediachainWKIs = (.ConstituentID | map(tostring | "moma:artists:" + .))' Artworks.json > Artworks-Mediachain.ndjson
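To see what the combined filter produces, it can be tried on a small hand-made file first. The sketch below is illustrative only: the file names sample-artworks.json and sample-artworks.ndjson and the record values are made up, and real MoMA records carry many more fields.

```shell
# Two made-up records shaped like entries in Artworks.json (values are illustrative only).
cat > sample-artworks.json <<'EOF'
[
  {"ObjectID": 2, "Title": "Sample work A", "ConstituentID": [6210]},
  {"ObjectID": 3, "Title": "Sample work B", "ConstituentID": [7470, 8100]}
]
EOF

# Same filter as the combined command above, applied to the sample file.
jq -c '.[] | .MediachainWKI = (.ObjectID | tostring | ("moma:artwork:" + .)) | .ArtistMediachainWKIs = (.ConstituentID | map(tostring | "moma:artists:" + .))' sample-artworks.json > sample-artworks.ndjson

# One record per line; print only the generated fields for inspection.
jq -c '{MediachainWKI, ArtistMediachainWKIs}' sample-artworks.ndjson
# {"MediachainWKI":"moma:artwork:2","ArtistMediachainWKIs":["moma:artists:6210"]}
# {"MediachainWKI":"moma:artwork:3","ArtistMediachainWKIs":["moma:artists:7470","moma:artists:8100"]}
```

Note that map fails on a null input, so a record whose ConstituentID is missing or null would make jq report an error for that record; whether such records occur in the real data is not something this sketch checks.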
Next, pass the Artworks-Mediachain.ndjson file to the mcclient command:
mcclient publish --idSelector MediachainWKI images.moma Artworks-Mediachain.ndjson
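Judging by its name, --idSelector MediachainWKI tells mcclient which field to read each record's identifier from (an assumption; the exact behavior is not described here). Under that assumption, it may be worth checking, before running a publish like the one above, that every record has the field and that no identifier repeats. A minimal sketch using a made-up stand-in file, sample-check.ndjson, in place of Artworks-Mediachain.ndjson:

```shell
# Stand-in for Artworks-Mediachain.ndjson: one good record, one duplicate ID,
# and one record missing the field (all values are made up).
printf '%s\n' \
  '{"ObjectID":2,"MediachainWKI":"moma:artwork:2"}' \
  '{"ObjectID":2,"MediachainWKI":"moma:artwork:2"}' \
  '{"ObjectID":4}' > sample-check.ndjson

# Count records whose MediachainWKI is absent or null.
missing=$(jq -c 'select(.MediachainWKI == null)' sample-check.ndjson | wc -l | tr -d ' ')

# List identifiers that occur more than once.
duplicates=$(jq -r '.MediachainWKI // empty' sample-check.ndjson | sort | uniq -d)

echo "records without MediachainWKI: $missing"
echo "duplicate MediachainWKIs: $duplicates"
```

On the real file, a count of zero and an empty duplicates list would suggest the records are ready to publish.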
Do a similar jq preprocessing step for Artists.json:
jq -c '.[] | .MediachainWKI = (.ConstituentID | tostring | "moma:artists:" + .)' ./Artists.json > Artists-Mediachain.ndjson
Then publish Artists-Mediachain.ndjson:
mcclient publish --idSelector MediachainWKI images.moma.artists Artists-Mediachain.ndjson
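Because each artwork refers to its artists through ArtistMediachainWKIs, those values only resolve if they match the MediachainWKI of a record in the artists file exactly. One way to check this is sketched below. The stand-in files and their contents are made up; on real data the two jq commands would read Artworks-Mediachain.ndjson and Artists-Mediachain.ndjson instead.

```shell
# Stand-ins for the two generated files (made-up records).
printf '%s\n' \
  '{"MediachainWKI":"moma:artwork:2","ArtistMediachainWKIs":["moma:artists:10","moma:artists:11"]}' \
  > sample-artworks-xref.ndjson
printf '%s\n' \
  '{"ConstituentID":10,"MediachainWKI":"moma:artists:10"}' \
  > sample-artists-xref.ndjson

# Every artist WKI referenced by an artwork, de-duplicated and sorted.
jq -r '.ArtistMediachainWKIs[]?' sample-artworks-xref.ndjson | sort -u > referenced.txt
# Every artist WKI actually present in the artists file.
jq -r '.MediachainWKI' sample-artists-xref.ndjson | sort -u > published.txt

# comm -23 keeps lines found only in the first file: referenced but never defined.
unresolved=$(comm -23 referenced.txt published.txt)
echo "unresolved artist WKIs: $unresolved"
# unresolved artist WKIs: moma:artists:11
```

An empty result on the real files would indicate that the prefixes and IDs in both files line up.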