<?xml version="1.0" encoding="UTF-8"?>
<!--
Pipeline for converting pairs of raw Android SDK values*/strings.xml
files into TMX (for translation reference, etc.).
Load this pipeline into Okapi Rainbow and set the input files, e.g.:
Input List 1: $ANDROID_HOME/.../values/strings.xml
Input List 2: $ANDROID_HOME/.../values-ja/strings.xml
Use the okf_xml@AndroidStringsImproved.fprm filter config included
m — 記号 em dash
ん – 記号 en dash
おー ō アルファベット
おー Ō アルファベット
うー ū アルファベット
うー Ū アルファベット
;; Increase default font size.
(set-face-attribute 'default nil :height 180)
;; Set decent default fonts for Japanese and Chinese,
;; but *only* if in a graphical context.
;; Set Japanese second so that Japanese glyphs override Chinese
;; when both charsets cover the same codepoints.
(if (fboundp 'set-fontset-font)
    (progn
      (set-fontset-font
'''
Count the number of characters in Journey to the West
'''
import urllib2
URL = 'http://www.sdmz.net/xy/%03d.htm'
CHAPTERS = 100
def do_count():
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- Convert Apple *.lg glossaries to TMX. Usage:
xsltproc [-o <output file>] [-\-stringparam srclang <source lang>] lg2tmx.xsl <input file>
If not specified, the TMX header's srclang attribute defaults to "*all*".
Get glossaries at https://developer.apple.com/downloads/?name=glossaries -->
<xsl:output method="xml" indent="yes" encoding="UTF-8" />
<xsl:param name="srclang" select="'*all*'" />
<xsl:template match="/">
<tmx version="1.4">
'''
Created on Dec 19, 2013
A script to convert TMXs into parallel corpora for machine
translation (e.g. Moses: http://www.statmt.org/moses/) training.
Pass in either paths to TMX files, or directories containing TMX files.
The script will recursively traverse directories and process all TMXs.
To perform tokenization or to filter the output, use the convert() method
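The preview above is cut off before any code, so the following is only a minimal sketch of the general idea (pulling aligned source/target segments out of a TMX), not the script's actual implementation. The helper name `tmx_to_pairs` and its parameters are hypothetical and are not the `convert()` method mentioned in the docstring.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this namespaced key.
XML_LANG = '{http://www.w3.org/XML/1998/namespace}lang'


def tmx_to_pairs(tmx_text, src_lang, tgt_lang):
    """Return (source, target) segment pairs found in a TMX document string.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter('tu'):
        segs = {}
        for tuv in tu.iter('tuv'):
            # TMX 1.4 uses xml:lang; older files may use a plain lang attribute.
            lang = tuv.get(XML_LANG) or tuv.get('lang') or ''
            seg = tuv.find('seg')
            if seg is not None:
                # itertext() also pulls in text nested inside inline elements.
                segs[lang.lower()] = ''.join(seg.itertext()).strip()
        src = segs.get(src_lang.lower())
        tgt = segs.get(tgt_lang.lower())
        # Keep only units that have non-empty text in both languages.
        if src and tgt:
            pairs.append((src, tgt))
    return pairs
```

Writing the first element of each pair to one file and the second to another, one segment per line, would give the line-aligned layout that parallel-corpus tools generally expect.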
/**************************************************************************
Public Domain
To the extent possible under law, Aaron Madlon-Kay has waived all
copyright and related or neighboring rights to this work.
This work is published from: Japan
**************************************************************************/
package org.amk;
import java.util.ArrayList;
'''
TMXTimekeeper.py
Analyze a TMX file and try to figure out how much time
has been spent translating. Assume a minimum of 5 minutes
for translating "sessions".
Created on 2013/02/17
@author: Aaron Madlon-Kay
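The preview stops before the code, so the following is a rough sketch of one way such an estimate could work, not the author's method. It assumes timestamps come from the `creationdate` and `changedate` attributes of `tu` and `tuv` elements, and that a pause longer than 30 minutes starts a new session. The 30-minute threshold and both function names are assumptions made for illustration; only the 5-minute minimum comes from the docstring.

```python
from datetime import datetime, timedelta
import xml.etree.ElementTree as ET

TMX_DATE_FORMAT = '%Y%m%dT%H%M%SZ'   # TMX date format, e.g. 20130217T101500Z
MIN_SESSION = timedelta(minutes=5)   # minimum per session, from the docstring
MAX_GAP = timedelta(minutes=30)      # assumption: a longer pause starts a new session


def collect_timestamps(tmx_text):
    """Collect creation/change dates from all tu and tuv elements (hypothetical helper)."""
    root = ET.fromstring(tmx_text)
    stamps = []
    for elem in root.iter():
        if elem.tag in ('tu', 'tuv'):
            for attr in ('creationdate', 'changedate'):
                value = elem.get(attr)
                if value:
                    stamps.append(datetime.strptime(value, TMX_DATE_FORMAT))
    return sorted(stamps)


def estimate_time(stamps):
    """Sum session lengths, counting each session as at least MIN_SESSION."""
    stamps = sorted(stamps)
    total = timedelta()
    if not stamps:
        return total
    start = prev = stamps[0]
    for stamp in stamps[1:]:
        if stamp - prev > MAX_GAP:
            # Close the current session and begin a new one at this timestamp.
            total += max(prev - start, MIN_SESSION)
            start = stamp
        prev = stamp
    # Close the final session.
    total += max(prev - start, MIN_SESSION)
    return total
```

With this grouping, an isolated timestamp still counts as a full 5-minute session, which is one plausible reading of the "minimum of 5 minutes" rule.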