@cllu
Last active January 11, 2016 18:47
Rime Custom Schema
# Rime dictionary
# encoding: utf-8
---
name: cllu_pinyin
version: "2014.12.24"
sort: by_weight
use_preset_vocabulary: true
# import dict from luna_pinyin.dict.yaml
import_tables:
- luna_pinyin
...
# table begins
# columns are tab-separated: text, code (syllables separated by spaces), weight
吕春良	lv chun liang	10000
吕春良	lcl	10000
# Rime schema
# encoding: utf-8
schema:
  schema_id: cllu_pinyin
  name: cllu
  version: "0.22"
  author:
    - Chunliang Lyu <hi@chunlianglyu.com>
  # description in English: Luna Pinyin, simplified-character output mode.
  description: |
    朙月拼音,簡化字輸出模式。
switches:
  - name: ascii_mode
    reset: 0
    states: [ 中文, 西文 ]
  - name: full_shape
    states: [ 半角, 全角 ]
  - name: zh_simp
    reset: 1
    states: [ 漢字, 汉字 ]
  - name: ascii_punct
    states: [ 。,, ., ]
engine:
  processors:
    - ascii_composer
    - recognizer
    - key_binder
    - speller
    - punctuator
    - selector
    - navigator
    - express_editor
  segmentors:
    - ascii_segmentor
    - matcher
    - abc_segmentor
    - punct_segmentor
    - fallback_segmentor
  translators:
    - punct_translator
    - table_translator@custom_phrase
    - script_translator
  filters:
    - simplifier
    - uniquifier
speller:
  alphabet: zyxwvutsrqponmlkjihgfedcba
  delimiter: " '"
  algebra:
    - erase/^xx$/
    - abbrev/^([a-z]).+$/$1/
    - abbrev/^([zcs]h).+$/$1/
    - derive/^([nl])ve$/$1ue/
    - derive/^([jqxy])u/$1v/
    - derive/un$/uen/
    - derive/ui$/uei/
    - derive/iu$/iou/
    - derive/([aeiou])ng$/$1gn/
    - derive/([dtngkhrzcs])o(u|ng)$/$1o/
    - derive/ong$/on/
    - derive/ao$/oa/
    - derive/([iu])a(o|ng?)$/a$1$2/
translator:
  dictionary: cllu_pinyin
  prism: luna_pinyin_simp
  preedit_format:
    - xform/([nl])v/$1ü/
    - xform/([nl])ue/$1üe/
    - xform/([jqxy])v/$1u/
custom_phrase:
  dictionary: ""
  user_dict: custom_phrase
  db_class: stabledb
  enable_completion: false
  enable_sentence: false
  initial_quality: 1
simplifier:
  option_name: zh_simp
punctuator:
  import_preset: default
key_binder:
  import_preset: default
  bindings:
    - { when: always, accept: Control+Shift+4, toggle: zh_simp }
    - { when: always, accept: Control+Shift+dollar, toggle: zh_simp }
recognizer:
  import_preset: default
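The `speller/algebra` rules in the schema are easier to follow with a concrete expansion. The following is a minimal sketch, not Rime's implementation: it assumes that `derive` and `abbrev` keep the original spelling and add a rewritten one, that `erase` drops a spelling matching the whole pattern, and that `xform` rewrites in place. The helper names `apply_rule` and `expand` are hypothetical, and the sketch ignores the lower priority Rime gives to abbreviations.

```python
import re


def apply_rule(spellings, rule):
    """Apply one Rime-style algebra rule to a list of spellings (simplified)."""
    # Naive split on '/': enough for the rules in this schema,
    # whose patterns contain no '/' characters.
    op, pattern, replacement = (rule.split('/') + [''])[:3]
    # Rime writes back-references as $1; Python's re module expects \g<1>.
    replacement = re.sub(r'\$(\d)', r'\\g<\1>', replacement)
    out = []
    for s in spellings:
        if op == 'erase':
            # Assumption: erase removes spellings matched in full.
            if not re.fullmatch(pattern, s):
                out.append(s)
        elif op in ('derive', 'abbrev'):
            # Keep the original and add the rewritten form if it differs.
            out.append(s)
            derived = re.sub(pattern, replacement, s)
            if derived != s:
                out.append(derived)
        elif op == 'xform':
            out.append(re.sub(pattern, replacement, s))
        else:
            raise ValueError('unsupported rule: ' + rule)
    # Drop duplicates while preserving order.
    return list(dict.fromkeys(out))


def expand(syllable, rules):
    """Run every rule in order over the growing list of spellings."""
    spellings = [syllable]
    for rule in rules:
        spellings = apply_rule(spellings, rule)
    return spellings


# A subset of the rules from the schema above.
RULES = [
    'erase/^xx$/',
    'abbrev/^([a-z]).+$/$1/',
    'abbrev/^([zcs]h).+$/$1/',
    'derive/un$/uen/',
]

print(expand('chun', RULES))  # ['chun', 'chuen', 'ch', 'c']
```

Under these assumptions, the syllable `chun` can be typed in full, as `chuen`, or abbreviated to `ch` or `c`.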
# Rime patch (by Rime convention this belongs in default.custom.yaml)
patch:
  schema_list:
    - schema: cllu_pinyin
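#!/usr/bin/env python3
"""Collect contact names containing Chinese characters from contacts.csv into names.txt."""
import csv
import re

# Matches a run of characters in the basic CJK Unified Ideographs range.
chinese_re = re.compile('[\u4e00-\u9fa5]+')

names = []
# contacts.csv is read as UTF-16 and must have a 'Name' column.
with open('contacts.csv', encoding='utf-16', newline='') as f:
    for row in csv.DictReader(f):
        name = row['Name']
        if name and chinese_re.search(name):
            names.append(name)

# Write one name per line; UTF-8 is set explicitly so the output
# does not depend on the platform's default encoding.
with open('names.txt', 'w', encoding='utf-8') as f:
    for name in names:
        f.write('%s\n' % name)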
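The script above only collects names. Turning them into dictionary rows like the two entries in the dictionary table still needs a pinyin reading for each name. The following is a minimal sketch, assuming the readings are supplied by hand: the `readings` mapping and the `dict_rows` helper are hypothetical and not part of the gist. It emits one row with the full pinyin and one with the initials, with the columns separated by tabs as Rime dictionaries expect.

```python
def dict_rows(name, syllables, weight=10000):
    """Return two tab-separated dictionary rows: full pinyin and initials."""
    full = ' '.join(syllables)                  # e.g. 'lv chun liang'
    initials = ''.join(s[0] for s in syllables)  # e.g. 'lcl'
    return [
        '%s\t%s\t%d' % (name, full, weight),
        '%s\t%s\t%d' % (name, initials, weight),
    ]


# Hypothetical hand-written readings; the gist does not say how pinyin was obtained.
readings = {
    '吕春良': ['lv', 'chun', 'liang'],
}

for name, syllables in readings.items():
    for row in dict_rows(name, syllables):
        print(row)
```

The printed rows can be pasted below the `# table begins` line of the dictionary file.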