@korakot
Created July 13, 2017 09:57
Thai Sort
import icu
thkey = icu.Collator.createInstance(icu.Locale('th_TH')).getSortKey
words = 'ไก่ ไข่ ก ฮา'.split()
print(sorted(words, key=thkey)) # ['ก', 'ไก่', 'ไข่', 'ฮา']

korakot commented Jul 19, 2017

Before this will run, you need to install icu and pyicu:
conda install icu
and
pip install pyicu


korakot commented Feb 7, 2018

Rewritten in pure Python:

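import re

def thkey(word):
    # Flags are omitted: re.sub's fourth positional argument is count, not flags,
    # and str patterns are already Unicode-aware in Python 3.
    cv = re.sub('[็-์]', '', word)  # remove tone marks (U+0E47 to U+0E4C)
    cv = re.sub('([เ-ไ])([ก-ฮ])', r'\2\1', cv)  # move a leading vowel after its consonant
    tone = re.sub('[^็-์]', ' ', word)  # keep only the tone marks, as a tie-breaker
    return cv + tone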
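
A quick sanity check, using the same four words as the ICU example at the top: the pure-Python key should give the same order. This is only a sketch. `thkey` is restated here so the block runs on its own, with two adjustments that are assumptions of this sketch rather than part of the gist: the `re.U` argument is dropped (passed positionally it is read as `count`), and the character ranges are written as `\u` escapes to avoid copy-paste problems with combining marks.

```python
import re

def thkey(word):
    # U+0E47..U+0E4C: maitaikhu, the four tone marks, thanthakhat
    cv = re.sub('[\u0e47-\u0e4c]', '', word)
    # U+0E40..U+0E44 are the leading vowels; U+0E01..U+0E2E are the consonants.
    # Swap them so the consonant comes first in the key.
    cv = re.sub('([\u0e40-\u0e44])([\u0e01-\u0e2e])', r'\2\1', cv)
    # Blank out everything except the marks; appended last, they only break ties.
    tone = re.sub('[^\u0e47-\u0e4c]', ' ', word)
    return cv + tone

words = 'ไก่ ไข่ ก ฮา'.split()
result = sorted(words, key=thkey)
print(result)  # ['ก', 'ไก่', 'ไข่', 'ฮา']
```

Four words agreeing with ICU says nothing about the general case. For anything beyond simple word lists the ICU collator is probably the safer choice.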
