Created January 2, 2018 23:44
Print kanji (Chinese characters) with 15 or fewer strokes
#!/usr/bin/env ruby
# kanji-by-kakusu.rb : print kanji (Chinese characters) with 15 or fewer strokes
# by takehikom
# Prints every kanji of 15 strokes or fewer taught in elementary school
# (other stroke counts, and no elementary-school restriction, are also supported).
# Stroke counts are fetched from Kanji Jiten Online (漢字辞典オンライン), http://kanji.jitenon.jp/
require "open-uri"
require "kconv"
class KanjiByKakusu
  def initialize(opt = {:el => true})
    @flag_elementary_only = opt[:el] # true: only kanji taught in elementary school
  end
  def find_kanji(kakusu, quiet: false)
    kakusu = kakusu.to_i # stroke count (integer)
    raise ArgumentError, "stroke count must be a positive integer" if kakusu <= 0
    # Read the cached file if it exists;
    # otherwise fetch the page and save it to the file.
    filename = "kakusu%02d.html" % kakusu
    if File.file?(filename)
      puts "open #{filename}" if $DEBUG
      html = File.read(filename)
    else
      url = "http://kanji.jitenon.jp/cat/" + filename
      puts "access to #{url}" if $DEBUG
      html = OpenURI.open_uri(url).read
      File.write(filename, html) # writes and closes the cache file
      sleep 2 # pause after each download
    end
    # Scan the HTML line by line, picking up the school grade and the kanji.
    kanji_a = []
    kanken_class = ""
    grade = nil
    html.each_line do |line|
      case line.toutf8
      when /<th .*?nowrap>(.+?)</
        # A new header cell starts: forget the previous grade.
        kanken_class = $1
        grade = nil
      when /^(小学校.*年生)<\/th>/
        # "小学校N年生" = elementary school, grade N
        grade = $1
      when />(.)<\/a><\/td>$/
        kanji = $1
        if !@flag_elementary_only || grade
          print kanji if !quiet
          kanji_a << kanji
        end
      end
      break if line.index('<div class="kanrenads2">')
    end
    puts if !kanji_a.empty? && !quiet
    kanji_a
  end
end
if __FILE__ == $0
  if ARGV.empty?
    # With no arguments, print the kanji of 15 or fewer strokes, one line per stroke count.
    kbk = KanjiByKakusu.new
    1.upto(15) do |kakusu|
      print "%2d: " % kakusu
      kanji_a = kbk.find_kanji(kakusu, quiet: true)
      puts "#{kanji_a.join} (#{kanji_a.length}字)" # 字 = "characters"
    end
  else
    # With an argument, print only that stroke count.
    kanji_a = KanjiByKakusu.new.find_kanji(ARGV.first, quiet: true)
    puts kanji_a.join
  end
end
__END__
1: 一 (1字)
2: 九七十人二入八力刀丁 (10字)
3: 下口三山子女小上夕千川大土丸弓工才万士久干己寸亡 (24字)
4: 円王火月犬五手水中天日木文六引牛今元戸午公止少心切太内父分方毛友区化反予欠氏不夫支比仏尺収仁片 (47字)
5: 玉左四出正生石田白本目右立外兄古広矢市台冬半母北用央去号皿仕写主申世他打代皮氷平由礼以加功札史司失必付辺包末未民令圧永可刊旧句示犯布弁穴冊処庁幼 (72字)
6: 休気糸字耳先早竹虫年百名羽会回交光考行合寺自色西多池地当同肉米毎安曲血向死次式守州全有羊両列衣印各共好成争仲兆伝灯老因仮件再在舌団任宇灰危机吸后至存宅 (75字)
7: 花貝見車赤足村男町何角汽近形言谷作社図声走体弟売麦来里医究局君決住助身対投豆坂返役位囲改完希求芸告材児初臣折束低努兵別利良冷労応快技均災志似序状条判防余我系孝困私否批忘卵乱 (86字)
8: 雨学空金青林画岩京国姉知長直店東歩妹明門夜委育泳岸苦具幸始使事実者昔取受所注定波板表服物放味命油和英果芽官季泣協径固刷参治周松卒底的典毒念府法牧例易往価河居券効妻枝舎述承招性制版肥非武沿延拡供呼刻若宗垂担宙忠届乳拝並宝枚 (110字)
9: 音草科海活計後室首秋春食星前茶昼点南風思屋界客急級係県研指持拾重昭乗神相送待炭柱追度畑発美秒品負面洋胃栄紀軍型建昨祝省信浅単飛変便約勇要逆限故厚査政祖則退独保迷律映革看巻皇紅砂姿城専宣染泉洗奏段派肺背 (100字)
10: 校家夏記帰原高紙時弱書通馬院員荷起宮庫根酒消真息速庭島配倍病勉旅流案害挙訓郡候航差殺残借笑席倉孫帯徒特梅粉脈浴料連益桜恩格個耕財師修素造能破俵容留株胸降骨座蚕射従純除将針値展党討納俳班秘陛朗 (95字)
11: 魚教強黄黒細週雪船組鳥野理悪球祭習終宿章商進深族第帳笛転都動部問貨械救健康菜産唱清巣側停堂得敗票副望陸移液眼規基寄許経険現混採授術常情責設接断張貧婦務率略異域郷済視捨推盛窓探著頂脳閉訪密訳郵翌欲 (97字)
12: 森雲絵間場晴朝答道買番飲運温開階寒期軽湖港歯集暑勝植短着等登湯童悲筆遊葉陽落街覚喜給極景結最散順焼象隊達貯然博飯費無満量営過賀検減証税絶測属貸程提統備評富復報貿割揮貴筋勤敬裁策詞衆就善装創尊痛晩補棒 (99字)
13: 遠園楽新数電話暗意感漢業詩想鉄農福路愛塩試辞照節戦続置腸働解幹義禁群鉱罪資飼準勢損墓豊夢預絹源署傷蒸誠聖暖賃腹幕盟裏 (58字)
14: 歌語算読聞鳴駅銀鼻様緑練関管旗察種静説漁歴演慣境構際雑酸製精銭総増像態適銅徳複綿領閣疑誤穀誌磁障層認暮模 (52字)
15: 線横談調箱億課器賞選熱標養輪確潔賛質敵導編暴遺劇権熟諸蔵誕潮論 (31字)
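
The grade filter in find_kanji works as a small line-by-line state machine: a header cell clears the current grade, a line beginning with 小学校…年生 sets it, and a kanji link cell is kept only while a grade is set (or when the elementary-only flag is off). The sketch below replays that logic offline. The method name `extract_kanji` and the `SAMPLE_HTML` markup are hypothetical: the markup is shaped only to satisfy the three patterns used in the script, the class labels in it are placeholders, and the real page layout may differ.

```ruby
# Hypothetical markup, written only to match the three patterns from the
# script above. It is not copied from the site, and the class labels are
# placeholders rather than real classifications.
SAMPLE_HTML = <<~HTML
  <tr><th class="k" nowrap>10級<br>
  小学校1年生</th>
  <td><a href="/kanji/a">一</a></td>
  <tr><th class="k" nowrap>2級<br>
  高校卒業程度</th>
  <td><a href="/kanji/b">乙</a></td>
  <div class="kanrenads2">
  <td><a href="/kanji/c">二</a></td>
HTML

# Hypothetical helper mirroring the scan loop in find_kanji, without any
# network access, caching, or printing.
def extract_kanji(html, elementary_only: true)
  kanji = []
  grade = nil
  html.each_line do |line|
    case line
    when /<th .*?nowrap>(.+?)</
      grade = nil                      # new header cell: forget the grade
    when /^(小学校.*年生)<\/th>/
      grade = $1                       # elementary-school grade label
    when />(.)<\/a><\/td>$/
      kanji << $1 if !elementary_only || grade
    end
    # Stop at the marker the script treats as the end of the listing.
    break if line.include?('<div class="kanrenads2">')
  end
  kanji
end

p extract_kanji(SAMPLE_HTML)                          # returns ["一"]
p extract_kanji(SAMPLE_HTML, elementary_only: false)  # returns ["一", "乙"]
```

In this sample the second kanji is dropped by default because no 小学校…年生 line follows its header cell, and the third is never reached because the scan stops at the `kanrenads2` marker.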