Last active August 29, 2015 13:56
WICS Knowledge Transfer
First, you can go and watch an intro video on ITS here:
Here's the ITS 2.0 specification: I recommend reading sections 1 and 2 straightaway.
The ITS Interest Group mailing archive:
You can join and ask good questions. You'll find lots from me. But do make sure they are good questions about ITS.
You can go and watch an intro video on WICS here:
Here are the GitHub repositories. The first two are managed by Logrus, and I have a commit bit to the converter one. The third repo is managed by us and is a fork of the converter repo managed by Logrus (please use it and not the Logrus one). It has the added functionality of a WICS "project", as well as some bug fixes:
I have written lots of documentation for the converters, including a programming guide and an end-user guide. The docs are located in docs/html. Another very important document is docs/ That has my honest opinions about lots of stuff, as well as design options that were not taken and possible future work. There is also a PDF report for the whole WICS project written by Renat. The converters get only a very short mention in that document, but it is important for understanding the viewer functionality, too.
Please read all of the documents I mentioned in the last paragraph. That should catch you up on ITS and the WICS converters, or at least help to.
Quick overview: we want to make the ITS markup in an XML, HTML5, or XLIFF document viewable in a web browser, because there's no way you can expect a human to read that stuff raw. There were two different approaches to this originally. The first was to convert a document into static HTML, with all of the highlighting, coloring, toolboxing or whatever baked straight in. The second was a "dynamic" approach, where some JavaScript/CSS code would be linked to an HTML5 page that had regular ITS markup in it, and the code would apply styling to make the ITS info visible. The converters I wrote were to take ITS-decorated content that wasn't in HTML5 and re-render it as HTML5. That could then be viewed with the dynamic viewer.
In the end neither of the approaches apparently worked out. Logrus produced a "hybrid" viewer that required pre-processing into static HTML, and then also added some JS/CSS code to make it slightly interactive. My opinion is that this is all static and no dynamic and should not be called "hybrid". But whatever, that's what they did, and they said that the static converter didn't work out even though that's clearly what they made.
The trouble with this is that it requires this annoying pre-processing, which is written in C#. That's fine really for some situations, but I think it's a dead-end. We need the dynamic viewer. We could host the JS/CSS online somewhere and let all of Wikipedia or CNN or whoever link to it for a special text worker view of their website. Translation companies could use it for translating at massive scale. It would really help the cause of efficient internationalization and localization.
So, the dynamic viewer. We should fix that. There is a demo version of it that displays 4 ITS data categories. It's in the viewer repository in code/dynamic_convert, though I also put a copy in the byutrg converter repo in the share directory for output in WICS projects. I would make a new repo just for the dynamic converter (Logrus didn't use a VCS, so there is almost no history).
Then look at the top of wics.js. You can see the four supported categories (with only two supported rule types). I don't understand much about this code, but I suppose you could start off by trying to add support for, say, termRule.
Besides supporting almost no data categories, this script is completely broken because it simply passes XPath selectors to jQuery. jQuery's $() expects CSS selectors and does not understand XPath, so while some expressions work by chance, most will not. The next step for you is to use document.evaluate instead of just $() to get document elements. Test that out with some things like "html/body/p[3]/i[2]" and see if it works. If so, great! Note that to make document.evaluate work on all browsers, you may need to use Wicked Good XPath from Google.
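To make the XPath step concrete, here is a minimal sketch of a helper that collects the nodes matched by an XPath expression into a plain array. The name `selectByXPath` and its parameters are made up for illustration and are not part of wics.js; the only real API used is the standard DOM `evaluate` method and its snapshot result.

```javascript
// ORDERED_NODE_SNAPSHOT_TYPE is 7 in the DOM XPath API; in a browser you
// could write XPathResult.ORDERED_NODE_SNAPSHOT_TYPE instead.
const ORDERED_NODE_SNAPSHOT_TYPE = 7;

// Hypothetical helper (not part of wics.js): returns the nodes matching an
// XPath selector, in document order, as a plain array.
function selectByXPath(doc, xpath, contextNode) {
  const result = doc.evaluate(
    xpath,
    contextNode || doc, // evaluate relative to the document by default
    null,               // no namespace resolver; prefixed names would need one
    ORDERED_NODE_SNAPSHOT_TYPE,
    null
  );
  const nodes = [];
  for (let i = 0; i < result.snapshotLength; i++) {
    nodes.push(result.snapshotItem(i));
  }
  return nodes;
}
```

In a browser, something like `$(selectByXPath(document, "/html/body/p[3]/i[2]"))` should hand the matched elements back to jQuery, so the existing styling code could stay mostly as it is.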
After this there are plenty more things to do. You'll need to go through all of the data categories in the ITS spec (section 8) one at a time and make sure the full functionality is implemented, making some sort of styling or display for each one. A rough outline of the functionality you need:
* local attributes
* global rules/selectors
* global rules with relative selectors
* global rules with relative text selectors (ID value)
* global rules with relative URI selectors (termInfoRef, etc.; have to load up an iframe to display external stuff)
* inheritance, for both local and global markup (ugh!)
* some categories inherit and others don't (ugh!!)
* see (and overview document)
* also see (and overview document)
* possibly default values (not sure if those will ever need displaying but you never know)
* standoff markup
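The precedence and inheritance items in the list above can be sketched roughly as follows. This is only an illustration under several assumptions: nodes are plain objects with hypothetical `parent`, `local`, and `global` fields (a real viewer would read these from the DOM and from the rules), only two data categories are shown, and the tables should be checked against the ITS 2.0 spec before you rely on them. The order used is: explicit local markup, then global rules, then an inherited value, then the default.

```javascript
// Whether a category passes its value down to child elements. Only two
// categories are listed here; fill in the rest from the ITS 2.0 spec.
const INHERITS = { translate: true, term: false };

// Default values for elements when nothing else applies.
const DEFAULTS = { translate: "yes", term: "no" };

// Value set directly on this node: local markup wins over global rules.
// `local` and `global` are hypothetical fields holding already-parsed values.
function explicitValue(node, category) {
  if (node.local && category in node.local) return node.local[category];
  if (node.global && category in node.global) return node.global[category];
  return undefined;
}

// Resolve the effective value of a data category for a node.
function resolveCategory(node, category) {
  const own = explicitValue(node, category);
  if (own !== undefined) return own;
  if (INHERITS[category]) {
    // Walk up to the nearest ancestor that sets the category explicitly.
    for (let a = node.parent; a; a = a.parent) {
      const inherited = explicitValue(a, category);
      if (inherited !== undefined) return inherited;
    }
  }
  return DEFAULTS[category];
}
```

Keeping the inheritance table as data means each data category can be added and tested one at a time, which fits going through section 8 of the spec category by category.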
There's a lot there. And it can all be very crazy. Please PLEASE test everything. Use the Karma framework and have it running on multiple web browsers, because I guarantee that they will all be different. I recommend James Shore's automatopia to start out with: You'll have to install NodeJS. It's worth it.
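As a starting point for the multi-browser testing, a Karma config could look something like the sketch below. It assumes `karma`, `karma-mocha`, `karma-chrome-launcher`, and `karma-firefox-launcher` are installed through npm, and the file paths are placeholders, not the actual layout of any of the repos. Mocha is just one possible test framework here, not a requirement.

```javascript
// karma.conf.js (sketch). The plugin choice and file globs are assumptions;
// adjust them to whatever the new viewer repo ends up looking like.
function karmaConfig(config) {
  config.set({
    frameworks: ["mocha"],                   // needs the karma-mocha plugin
    files: ["src/**/*.js", "test/**/*.js"],  // placeholder paths
    browsers: ["Chrome", "Firefox"],         // each needs its launcher plugin
    singleRun: true                          // run once and exit
  });
}

module.exports = karmaConfig;
```

Adding another browser should mostly be a matter of installing its launcher plugin and appending its name to `browsers`.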
I just remembered, I did a demo to get us the WICS contract, and it had tests. It was different because I just loaded up an XML document into the HTML body and did ITS stuff from there, which is NOT how it will work with real HTML5/ITS combos. I'll send the code along to you.
Also, there's the XLIFF2HTML converter. That had only demo functionality implemented. It has a TODO list a mile long. I don't know if that will get funded, but I think I documented it and how much is left to do pretty well.
Call or email if you need more! Or even comment on this gist and I can add stuff.