| "MARC is concise as a physical format (something that is less important today)" | |
| ^--- I know it used to be a lot MORE important, but it doesn't feel unimportant when I'm extracting 6.5 million records from one system and transferring them to another server! (And I'm aware there are more concise ways to express data that we often see expressed in XML.) | |
| -=-=-=-=-=-=-=-=-=- | |
| Ease of transforming/exporting/analyzing | |
| -=-=-=-=-=-=-=-=-=- | |
| I work with big batches of MARC (in MARC-binary and MARC-XML) and other metadata formats (DC, EAD, DDI, MODS, Oracle Endeca format for indexed data) expressed in XML. | |
| I whip up XSLT to do stuff when I have to. But Terry Reese gave us MarcEdit, which makes it so easy to do fairly complex transformations across a set of MARC records, or just to get an overview of which fields are in a record set and how frequently each occurs. |
| **Default 'usable date' range is 500 to current year + 6** | |
| MarcToArgot | |
| MarcToArgot::Macros::Shared::PublicationYear | |
| usable_date? - determine whether a given date value is usable for deriving a reasonable single value for sorting/filtering | |
| 1997 is usable - 4 digits, in usable range | |
| 688 is usable - fewer than 4 digits, but in usable range | |
| 9999 is usable - gets translated into current year + 1 | |
| 499 is NOT usable - out of usable range | |
| 6754 is NOT usable - 4 digits, but out of usable range |
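
The rules listed above can be sketched in a few lines. This is only an illustration in Python, not the MarcToArgot implementation (which is Ruby); the function name `usable_year`, the `current_year` parameter, and the choice to treat both ends of the range as inclusive are assumptions made for the example.

```python
from datetime import date

MIN_YEAR = 500        # lower bound of the default usable range
MAX_YEAR_OFFSET = 6   # upper bound is current year + 6


def usable_year(value, current_year=None):
    """Return a normalized year if `value` is usable, otherwise None.

    Hypothetical helper: the name, the signature, and the inclusive
    bounds are assumptions, not taken from MarcToArgot.
    """
    if current_year is None:
        current_year = date.today().year
    text = str(value).strip()
    # Accept only 1 to 4 ASCII digits.
    if not (text.isascii() and text.isdigit()) or len(text) > 4:
        return None
    year = int(text)
    # 9999 is treated as a placeholder and mapped to current year + 1.
    if year == 9999:
        return current_year + 1
    if MIN_YEAR <= year <= current_year + MAX_YEAR_OFFSET:
        return year
    return None
```

Passing `current_year` explicitly keeps the check reproducible in tests; leaving it out falls back to today's date.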
| [ | |
| { | |
| "op": "core/column-removal", | |
| "description": "Remove column date of birth", | |
| "columnName": "date of birth" | |
| }, | |
| { | |
| "op": "core/column-removal", | |
| "description": "Remove column date of death", | |
| "columnName": "date of death" |
| <?xml version="1.0" encoding="UTF-8"?> | |
| <document name="persons"> | |
| <persons_common> | |
| <personTermGroupList> | |
| <personTermGroup> | |
| <termDisplayName>John Mellon; J. T. Mellon</termDisplayName> | |
| <termType>urn:cspace:core.collectionspace.org:vocabularies:name(persontermtype):item:name(descriptor)'descriptor'</termType> | |
| <termSourceID>123</termSourceID> | |
| <termSourceDetail>detail text</termSourceDetail> | |
| <surName>Mellon</surName> |
| rake spec SPEC=spec/collectionspace/converter/core/concept_spec.rb | |
| /Users/spurgin/.rvm/rubies/ruby-2.5.3/bin/ruby -I/Users/spurgin/.rvm/gems/ruby-2.5.3/gems/rspec-core-3.9.0/lib:/Users/spurgin/.rvm/gems/ruby-2.5.3/gems/rspec-support-3.9.0/lib /Users/spurgin/.rvm/gems/ruby-2.5.3/gems/rspec-core-3.9.0/exe/rspec spec/collectionspace/converter/core/concept_spec.rb | |
| CORE CONCEPT: | |
| <?xml version="1.0" encoding="UTF-8"?> | |
| <document name="concepts"> | |
| <ns2:concepts_common xmlns:ns2="http://collectionspace.org/services/concept" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> | |
| <conceptTermGroupList> | |
| <conceptTermGroup> |
| <?xml version="1.0" encoding="UTF-8"?> | |
| <mods xmlns="http://www.loc.gov/mods/v3" xmlns:drs="info://lyrasis/drs-admin/v1" xmlns:dc="http:://purl.org/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dwc="http://rs.tdwg.org/dwc/terms/" xmlns:edm="http://pro.europeana.eu/edm-documentation" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> | |
| <titleInfo usage="primary"> | |
| <title>Boris the Guineafowl</title> | |
| <subTitle>Portrait of a birb</subTitle> | |
| </titleInfo> | |
| <titleInfo type="abbreviated"> | |
| <title>Boris</title> | |
| </titleInfo> | |
| <titleInfo> |
| { | |
| "structuredDate": { | |
| "fields": { | |
| "dateDisplayDate": { | |
| "[config]": { | |
| "extensionName": "structuredDate" | |
| } | |
| }, | |
| "dateAssociation": { | |
| "[config]": { |
| { "accessionDateGroup": { | |
| "[config]": { | |
| "dataType": "DATA_TYPE_STRUCTURED_DATE", | |
| "messages": { | |
| "name": { | |
| "id": "field.acquisitions_common.accessionDateGroup.name", | |
| "defaultMessage": "Accession date" | |
| } | |
| }, | |
| "searchView": { |
| I use a little awk one-liner derived from https://www.datafix.com.au/cookbook/structure1.html | |
| to verify the structure of client-supplied TSVs, or of CSVs that I convert to TSVs. One client's | |
| table of object data provided as TSV used CRLF row endings, AND included TAB, CRLF, CR, and LF | |
| characters inside individual fields to format multiline notes. | |
| The result of my check on this ONE FILE was as follows: | |
| 292 rows are broken into 82 columns | |
| 606 rows are broken into 1 columns | |
| 486 rows are broken into 0 columns |
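
For readers who don't use awk, the same kind of check can be sketched in Python. This is not the one-liner from the linked cookbook page, just an assumed equivalent: it counts delimiter-separated fields on each LF-terminated line and tallies how many lines have each count. The function name `field_count_tally` is made up for the example.

```python
from collections import Counter


def field_count_tally(lines, delimiter="\t"):
    """Tally how many lines contain each number of fields.

    Hypothetical helper: like awk's NF, an empty line counts as 0
    fields. Only the trailing LF is stripped, so stray CR characters
    stay visible as field content instead of being normalized away.
    """
    tally = Counter()
    for line in lines:
        line = line.rstrip("\n")
        count = 0 if line == "" else line.count(delimiter) + 1
        tally[count] += 1
    return tally


def tally_file(path, delimiter="\t"):
    """Run the tally over a file, splitting lines on LF only."""
    # newline="\n" disables universal-newline translation, so embedded
    # CR and CRLF sequences are not silently turned into line breaks.
    with open(path, encoding="utf-8", newline="\n") as handle:
        return field_count_tally(handle, delimiter)
```

A clean table produces a single entry in the tally; more than one entry, as in the result above, means some rows are being split by line breaks or delimiters embedded in fields.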