Created November 8, 2011 22:35
Transform CSV data from DOAJ into MARC suitable for ingestion into Koha
#!/bin/bash
# doaj2koha.sh
#
# Get the DOAJ (http://www.doaj.org/) data in CSV format here:
# http://www.doaj.org/doaj?func=csv
# Save the data as doaj.csv
#
# If you are only interested in some of the journals in DOAJ, you
# can probably use the grep command to extract only the lines that
# contain words that are relevant to you. Make sure you keep the
# first line with the column definitions, though!
#
# Get csvtomarc.pl and csvutils.pm from here:
# http://git.catalyst.net.nz/gw?p=koha.git;a=tree;f=import/csv;h=82facb6056faf0d70cab1264c121b67ef4c00c1f;hb=refs/heads/import_branch
#
# Put all of these files in the same directory as this script.
#
# Adjust these values:
KOHACONF=/home/magnus/sites/kohanor32-dev/etc/koha-conf.xml
KOHALIBS=/home/magnus/scripts/kohanor32/
#
# Run as ./doaj2koha.sh and you will get the MARC records in a file
# called out.mrc
#
# FIXME
# There is an encoding problem that results in some garbled characters
perl ./csvtomarc.pl \
    --kohaconf "$KOHACONF" \
    --kohalibs "$KOHALIBS" \
    -m 'Title=title' \
    -m 'ISSN=biblioitems.issn?' \
    -m 'EISSN=biblioitems.issn?' \
    -m 'Title Alternative=marc:245_b?' \
    -m 'Country=marc:260_a?' \
    -m 'Publisher=marc:260_b?' \
    -m 'Further Information=notes?' \
    -m 'func:prefix:Language(s): :Language=notes?' \
    -m 'Subjects=marc:653_a?' \
    -m 'Keyword=marc:653_a?' \
    -m 'Identifier=biblioitems.url!' \
    -o out.mrc \
    -i doaj.csv \
    --format usmarc
# TODO
# Columns that still need to be mapped:
# Start Year
# End Year
#
# This did not work:
# -m 'func:combine:append::Start Year:append: - :append::End Year=notes?' \
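The script's header comments suggest using grep to keep only the journals you care about, while preserving the first line with the column definitions. A minimal sketch of that step might look like the following. The function name `filter_doaj` and the output file name are hypothetical and not part of the original script, and the sketch assumes every record sits on a single line (a quoted CSV field containing a newline would be split).

```shell
#!/bin/bash
# filter_doaj PATTERN INPUT OUTPUT
# Hypothetical helper, not part of doaj2koha.sh: copies the header row of
# INPUT to OUTPUT, then appends every data row that matches PATTERN.
filter_doaj() {
    local pattern=$1 input=$2 output=$3

    # Line 1 holds the column definitions that the -m mappings refer to.
    head -n 1 "$input" > "$output"

    # Search only the data rows (line 2 onwards), case-insensitively, so
    # the header is never written twice. grep exits with status 1 when
    # nothing matches; "|| true" keeps that from aborting a caller that
    # runs under "set -e".
    tail -n +2 "$input" | grep -i -e "$pattern" >> "$output" || true
}

# Example call (assumed file names):
#   filter_doaj 'medicine' doaj.csv doaj-medicine.csv
```

To use the filtered file, point the `-i` option of the csvtomarc.pl call at it instead of doaj.csv. Because grep matches against the whole line, a word that appears in any column (title, publisher, keywords) will select the row.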