@MagnusEnger
Created November 8, 2011 22:35
Transform CSV data from DOAJ into MARC suitable for ingestion into Koha
#!/bin/bash
# doaj2koha.sh
#
# Get the DOAJ (http://www.doaj.org/) data in CSV format here:
# http://www.doaj.org/doaj?func=csv
# Save the data as doaj.csv
#
# If you are only interested in some of the journals in DOAJ, you
# can probably use the grep command to extract only the lines that
# contain words that are relevant to you. Make sure you keep the
# first line with the column definitions, though!
#
# Get csvtomarc.pl and csvutils.pm from here:
# http://git.catalyst.net.nz/gw?p=koha.git;a=tree;f=import/csv;h=82facb6056faf0d70cab1264c121b67ef4c00c1f;hb=refs/heads/import_branch
#
# Put all of these files in the same directory as this script.
#
# Adjust these values:
KOHACONF=/home/magnus/sites/kohanor32-dev/etc/koha-conf.xml
KOHALIBS=/home/magnus/scripts/kohanor32/
#
# Run as ./doaj2koha.sh and you will get the MARC records in a file
# called out.mrc
#
# FIXME
# There is an encoding problem that results in some garbled characters
# in the output.
perl ./csvtomarc.pl \
--kohaconf "$KOHACONF" \
--kohalibs "$KOHALIBS" \
-m 'Title=title' \
-m 'ISSN=biblioitems.issn?' \
-m 'EISSN=biblioitems.issn?' \
-m 'Title Alternative=marc:245_b?' \
-m 'Country=marc:260_a?' \
-m 'Publisher=marc:260_b?' \
-m 'Further Information=notes?' \
-m 'func:prefix:Language(s): :Language=notes?' \
-m 'Subjects=marc:653_a?' \
-m 'Keyword=marc:653_a?' \
-m 'Identifier=biblioitems.url!' \
-o out.mrc \
-i doaj.csv \
--format usmarc
# TODO
# Columns that still need to be mapped:
# Start Year
# End Year
#
# This did not work:
# -m 'func:combine:append::Start Year:append: - :append::End Year=notes?' \
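The grep-based filtering mentioned in the header comments could look roughly like the sketch below. The function name filter_csv, the search term and the output file name doaj-subset.csv are made up for illustration and are not part of the original script. The sketch assumes every CSV record sits on a single physical line; a quoted field containing a line break would be split incorrectly.

```shell
#!/bin/bash
# Hypothetical helper (not part of doaj2koha.sh): print the header line
# of a CSV file, followed by every later line that matches a pattern.
# Assumption: one record per physical line.
filter_csv() {
    local pattern=$1 infile=$2
    # Always keep the first line with the column definitions.
    head -n 1 "$infile"
    # Search only the data lines, ignoring case.
    tail -n +2 "$infile" | grep -i -- "$pattern"
}

# Example (the search term is only an illustration):
#   filter_csv 'Norway' doaj.csv > doaj-subset.csv
```

To use the subset, pass doaj-subset.csv to -i instead of doaj.csv. Note that grep matches anywhere in the line, so a term that appears in a title or a URL will also be picked up.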