@ssut
Created February 9, 2016 07:06
Simplest way to sort Japanese in Python
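import icu
import romkan
from unihandecode import Unihandecoder


def main():
    # Unihandecoder romanizes the text using Japanese readings of the kanji.
    d = Unihandecoder(lang='ja')
    # ICU collator that orders strings by Japanese collation rules.
    collator = icu.Collator.createInstance(icu.Locale('ja_JP.UTF-8'))

    # The trailing number is the position each title should take after sorting.
    table = [
        u"女言葉の消失",  # 2
        u"キセキ",  # 3
        u"ふしぎなくすり",  # 4
        u"恋愛観測",  # 5
        u"嘘憑きとサルヴァドール",  # 1
        u"愛と勇気の三度笠ポン太",  # 0
    ]

    # Romanize each title, then convert the romaji to hiragana.
    corresponds = [romkan.to_hiragana(d.decode(item)) for item in table]

    # Sort the titles by the collation key of their hiragana reading.
    result = sorted(zip(table, corresponds),
                    key=lambda x: collator.getSortKey(x[1]))
    for title, _ in result:
        print(title)


if __name__ == '__main__':
    main()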
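
The idea behind the script (sort on a kana reading of each title rather than on the raw kanji) can be tried without any third-party packages. The sketch below is only an approximation under stated assumptions: the `READINGS` table is hand-written for illustration and stands in for what a kanji-to-kana converter would produce, `katakana_to_hiragana` and `sort_by_reading` are hypothetical helper names, and plain code point order is used in place of a real Japanese collator, so voiced kana, small kana, and long-vowel marks may not sort the way ICU would sort them.

```python
# Hypothetical, hand-written readings standing in for converter output.
# They are illustrative assumptions, not the result of any library call.
READINGS = {
    "女言葉の消失": "おんなことばのしょうしつ",
    "キセキ": "キセキ",
    "ふしぎなくすり": "ふしぎなくすり",
    "恋愛観測": "れんあいかんそく",
    "嘘憑きとサルヴァドール": "うそつきとさるゔぁどーる",
    "愛と勇気の三度笠ポン太": "あいとゆうきのさんどがさぽんた",
}


def katakana_to_hiragana(text):
    """Shift katakana (U+30A1..U+30F6) down by 0x60 onto the hiragana block."""
    return "".join(
        chr(ord(ch) - 0x60) if "\u30a1" <= ch <= "\u30f6" else ch
        for ch in text
    )


def sort_by_reading(readings):
    """Return the titles ordered by their hiragana-normalized reading.

    Ordering is by Unicode code point, which only approximates
    dictionary order for Japanese.
    """
    return sorted(
        readings,
        key=lambda title: katakana_to_hiragana(readings[title]),
    )


if __name__ == "__main__":
    for title in sort_by_reading(READINGS):
        print(title)
```

Normalizing katakana to hiragana before comparing keeps a title such as キセキ among the き entries instead of after all hiragana, which is where raw code point order would place it.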