@neelsmith
Last active August 29, 2015 14:17
Converts text content of XML from beta-code Greek to UTF-8 while preserving markup
/*
Uses Grape to fetch the dependencies needed to transcode the text content of
Greek XML documents, such as the ancient Greek texts from Perseus, from their
beta code representation to the polytonic Greek range of Unicode, while
preserving markup.
Writes a UTF-8 version of the XML file to standard output.
*/
String usage = "Usage: groovy betaToUtf8Xml.groovy <FILENAME>"
// Both transcoder and greekutils are available from beta.hpcc.uh.edu:
@GrabResolver(name='beta', root='http://beta.hpcc.uh.edu/nexus/content/repositories/releases')
@Grab(group='edu.unc.epidoc', module='transcoder', version='1.2-SNAPSHOT')
@Grab(group='edu.harvard.chs', module='greekutils', version='0.8.9')
import edu.harvard.chs.f1k.*
import edu.unc.epidoc.transcoder.TransCoder
if (args.size() != 1) {
    println usage
    System.exit(-1)
}
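// Read the whole XML file into a string.
File f = new File(args[0])
String betaStr = f.getText()

// Declare the source encoding as beta code, then transcode the text content
// to Unicode while leaving the markup untouched.
GreekNode gn = new GreekNode(betaStr)
gn.setCharEnc("beta")
println gn.transcodeXml("UTF8")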
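
To illustrate what "transcoding text content while preserving markup" means, here is a minimal, self-contained sketch in plain Groovy. It is not how greekutils or the EpiDoc transcoder work internally. The names `letters`, `marks`, `transcodeText`, and `transcodeXml` are hypothetical, and the tables cover only lowercase letters and the common diacritics. The sketch assumes the script file is read as UTF-8, and it does not handle capitals (`*`), entities such as `&amp;`, comments, or CDATA.

```groovy
import java.text.Normalizer

// Hypothetical, deliberately small tables: lowercase beta-code letters only.
def letters = [a:'α', b:'β', g:'γ', d:'δ', e:'ε', z:'ζ', h:'η', q:'θ',
               i:'ι', k:'κ', l:'λ', m:'μ', n:'ν', c:'ξ', o:'ο', p:'π',
               r:'ρ', s:'σ', t:'τ', u:'υ', f:'φ', x:'χ', y:'ψ', w:'ω']

// Beta-code diacritics mapped to Unicode combining characters.
def marks = [')':'\u0313',   // smooth breathing
             '(':'\u0314',   // rough breathing
             '/':'\u0301',   // acute
             '\\':'\u0300',  // grave
             '=':'\u0342',   // circumflex
             '|':'\u0345',   // iota subscript
             '+':'\u0308']   // diaeresis

// Transcode one run of beta-code text (no markup) to precomposed Unicode.
def transcodeText = { String beta ->
    StringBuilder sb = new StringBuilder()
    for (char c : beta.toCharArray()) {
        String ch = String.valueOf(c)
        // Unknown characters (spaces, digits, punctuation) pass through.
        sb.append(letters[ch] ?: marks[ch] ?: ch)
    }
    // NFC composes base letter + combining marks into polytonic code points.
    String composed = Normalizer.normalize(sb.toString(), Normalizer.Form.NFC)
    // A sigma not followed by another letter or mark becomes final sigma.
    composed.replaceAll(/σ(?![\p{L}\p{M}])/, 'ς')
}

// Copy tags through verbatim and transcode only the text between them.
def transcodeXml = { String xml ->
    xml.replaceAll(/(<[^>]*>)|([^<]+)/) { all, tag, text ->
        tag != null ? tag : transcodeText(text)
    }
}

println transcodeXml('<l n="1">mh=nin a)/eide</l>')
// prints: <l n="1">μῆνιν ἄειδε</l>
```

The design point is the split between tags and text: the `n="1"` attribute survives untouched because only the runs between tags are passed to `transcodeText`. The script above delegates the same job to `GreekNode.transcodeXml`.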