@fauxneticien
Last active June 26, 2017 07:13
Vertically bind all .eaf files in a folder into a single XML file
#!/bin/bash
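# Folder to search recursively for .eaf files
export SEARCH_FOLDER=/git-repos/asr-daan/komnzo_text
# For each .eaf file (up to 4 at a time): drop its XML declaration and tag the
# root <ANNOTATION_DOCUMENT> element with a SRC attribute holding the file path.
# GNU parallel groups output per job, so files are not interleaved, but their
# order in all.xml is not guaranteed. The sed pattern assumes the opening
# <ANNOTATION_DOCUMENT ...> tag sits on a single line.
time find "$SEARCH_FOLDER" -name "*.eaf" |
parallel -j 4 '
grep -v "^<?xml" {} |
sed "s%<ANNOTATION_DOCUMENT \(.*\)>%<ANNOTATION_DOCUMENT \1 SRC=\"{}\">%g"
' |
# Wrap everything in one XML declaration and a single <EAFS_DB> root element
awk 'BEGIN{print "<?xml version=\"1.0\" encoding=\"UTF-8\"?><EAFS_DB>"}
{print $0}
END{print "</EAFS_DB>"}' > all.xml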
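
A quick way to check what the pipeline does is to run the same grep/sed/awk-style steps on a couple of throwaway files. The sketch below is only an illustration, not part of the original script. It assumes a scratch folder from `mktemp -d`, two made-up minimal .eaf files, and a plain `while` loop in place of GNU parallel so that it runs without parallel installed. The names `demo_dir`, `doc_count` and `src_count` are hypothetical.

```shell
#!/bin/bash
# Sketch: same per-file steps as the script above, with a plain loop
# instead of GNU parallel, run on two made-up sample files.
set -euo pipefail

demo_dir=$(mktemp -d)   # hypothetical scratch folder, not the real corpus
for name in a b; do
  printf '%s\n' '<?xml version="1.0" encoding="UTF-8"?>' \
    '<ANNOTATION_DOCUMENT AUTHOR="" FORMAT="2.8">' \
    '</ANNOTATION_DOCUMENT>' > "$demo_dir/$name.eaf"
done

{
  # One XML declaration and one wrapping root element for the whole bundle
  echo '<?xml version="1.0" encoding="UTF-8"?><EAFS_DB>'
  find "$demo_dir" -name "*.eaf" | sort | while IFS= read -r f; do
    # Drop the per-file XML declaration, then add SRC="<path>" to the root tag
    grep -v "^<?xml" "$f" |
      sed "s%<ANNOTATION_DOCUMENT \(.*\)>%<ANNOTATION_DOCUMENT \1 SRC=\"$f\">%"
  done
  echo '</EAFS_DB>'
} > "$demo_dir/all.xml"

# Every opening root tag should now carry a SRC attribute
doc_count=$(grep -c '<ANNOTATION_DOCUMENT ' "$demo_dir/all.xml")
src_count=$(grep -c ' SRC="' "$demo_dir/all.xml")
echo "documents: $doc_count, with SRC: $src_count"
```

If the two counts differ on real data, the likely cause is an opening `<ANNOTATION_DOCUMENT ...>` tag that spans several lines, or a file path containing `%` or `&`, which the sed substitution would misread.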