Last active
August 29, 2015 14:17
Converts text content of XML from beta-code Greek to UTF-8 while preserving markup
/*
Uses Grape to grab all dependencies needed to convert textual data in
XML texts of Greek, such as the ancient Greek texts from Perseus, from the
beta code representation of Greek to the polytonic Greek range of Unicode
in UTF-8 while preserving markup.
Writes a UTF-8 version of the XML file to standard output.
*/
String usage = "Usage: groovy betaToUtf8Xml.groovy <FILENAME>"
// Both transcoder and greekutils are available from beta.hpcc.uh.edu:
@GrabResolver(name='beta', root='http://beta.hpcc.uh.edu/nexus/content/repositories/releases')
@Grab(group='edu.unc.epidoc', module='transcoder', version='1.2-SNAPSHOT')
@Grab(group='edu.harvard.chs', module='greekutils', version='0.8.9')
import edu.harvard.chs.f1k.*
import edu.unc.epidoc.transcoder.TransCoder
// Require exactly one argument: the XML file to convert.
if (args.size() != 1) {
    println usage
    System.exit(-1)
}
File f = new File(args[0])
String betaStr = f.getText()
// Wrap the XML in a GreekNode, declare its text content to be beta code,
// and print the transcoded document with its markup intact.
GreekNode gn = new GreekNode(betaStr)
gn.setCharEnc("beta")
println gn.transcodeXml("UTF8")
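
The idea behind the script, transcoding text content while leaving tags alone, can be illustrated with a toy sketch. It does not use or reproduce the transcoder or greekutils libraries: the names betaToGreek, transcodeText and transcodeMarkup are hypothetical, only unaccented lowercase letters are handled, and splitting on tags with a regular expression is an assumption that ignores entities, comments and CDATA sections.

```groovy
// Toy illustration only: NOT the edu.unc.epidoc transcoder or greekutils API.
// Maps unaccented lowercase beta-code letters to Greek; anything else passes through.
Map<String, String> betaToGreek = [
    'a': 'α', 'b': 'β', 'g': 'γ', 'd': 'δ', 'e': 'ε', 'z': 'ζ', 'h': 'η', 'q': 'θ',
    'i': 'ι', 'k': 'κ', 'l': 'λ', 'm': 'μ', 'n': 'ν', 'c': 'ξ', 'o': 'ο', 'p': 'π',
    'r': 'ρ', 's': 'σ', 't': 'τ', 'u': 'υ', 'f': 'φ', 'x': 'χ', 'y': 'ψ', 'w': 'ω'
]

// Hypothetical helper: transcode one run of text, character by character.
def transcodeText = { String text ->
    String greek = text.collect { String ch -> betaToGreek[ch] ?: ch }.join()
    // A sigma that is not followed by another letter takes its final form.
    greek.replaceAll(/σ(?!\p{L})/, 'ς')
}

// Hypothetical helper: tokens that start with '<' are tags and are copied
// unchanged; every other token is text content and is transcoded.
def transcodeMarkup = { String xml ->
    xml.replaceAll(/<[^>]*>|[^<]+/) { String token ->
        token.startsWith('<') ? token : transcodeText(token)
    }
}

println transcodeMarkup('<w lang="greek">logos</w>')
```

Note that the attribute value lang="greek" survives untouched because it sits inside a tag, which is the property the full script relies on GreekNode to provide for real documents.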