Skip to content

Instantly share code, notes, and snippets.

@glowinthedark
Last active August 9, 2023 19:09
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save glowinthedark/4acc24b22ce0eb8050390f28a0a83002 to your computer and use it in GitHub Desktop.
Save glowinthedark/4acc24b22ce0eb8050390f28a0a83002 to your computer and use it in GitHub Desktop.
Chinese Simplified/Traditional characters detection
import java.util.regex.Pattern;
public class SimpTradDetect {
// data from https://github.com/tsroten/zhon/tree/main/src/zhon/cedict
// with shared characters between simplified and traditional removed
static final String TRAD = "㠯㵎㼝䠶䰾䳘丟並亂亙亞亾佀佇佈佔併來侖侶侷俁係俠俲俻倀倆倉個們倖倫偉偪側偵偽傌傑傖傘備傚傭傯傳傴債傷傾僂僅僉僑僕僞僥僨僱價儀儂億儈儉儐儔儕儗儘償優儲儷儸儺儻儼兇兌兒兗內兩冊冪凈凍凜凱別刦刧刪刼剄則剉剋剎剛剝剮剴創剷劃劄劇劉劊劌劍劑効勁動務勛勝勞勢勣勦勱勲勳勵勸勻匭匯匱區協卹卻厙厠厭厲厴參叢吳吶呂咷咼員哶唄唚問啞啟啣喚喪喫喬單喲嗆嗇嗎嗚嗩嗶嘆嘊嘍嘓嘔嘖嘗嘜嘩嘮嘯嘰嘵嘷嘸嘽噁噓噝噠噥噦噯噲噴噸噹嚀嚇嚌嚐嚕嚙嚥嚦嚨嚲嚳嚴嚶囀囁囂囅囈囉囌囑囪圇國圍園圓圖團埡執堅堊堖堝堯報場塊塋塏塒塗塚塢塤塲塵塹墊墜墮墳墻墾壇壓壘壙壚壞壟壠壢壩壯壺壼壽夠夢夾奐奧奩奪奮奼妝姍姦姪娛婁婦婭媧媯媼媽嫋嫗嫵嫻嬀嬈嬋嬌嬙嬝嬡嬤嬪嬰嬸孃孌孫學孿宂宮寢實寧審寫寬寳寵寶將專尋對導尒尷屆屍屓屜屢層屨屬屭岡峩峯峴島峽崍崐崑崗崙崢崬嵐嵗嶁嶃嶄嶇嶔嶗嶠嶧嶴嶸嶺嶼嶽巋巒巔巖巹帥師帬帳帶幀幃幇幗幘幚幟幣幫幬幹幾庫廁廂廄廈廕廚廝廟廠廡廢廣廩廬廳廹弒弔弳張強彆彈彊彌彎彙彞彠彥彫彿後徑從徠復徹怳恆恥悅悳悵悶悽惡惥惪惱惲惷惻愙愛愜愨愴愷愾慂慄態慍慘慚慟慣慪慫慮慳慶慼慾憂憊憐憑憒憚憤憫憮憲憶懇應懌懍懟懣懨懲懶懷懸懺懼懾戀戇戔戞戧戩戰戲戶戼抝拋挾捨捫捲掃掄掙掛採揀揚換揮揹損搖搗搯搵搶摑摜摟摯摳摶摻撈撏撐撓撝撟撣撥撧撫撲撳撻撾撿擁擄擇擊擋擔據擠擬擯擰擱擲擴擷擺擻擼擾攄攆攏攔攖攙攛攜攝攢攣攤攪攬敍敗敘敱敵數斂斃斕斬斷旂旹昰時晉晝暈暉暘暢暫暱曄曆曇曉曖曠曡曨曬書會朧朶東柵栢桮桿梔梘條梟棄棖棗棟棧棲椀椏楄楊楓楨業楳極榪榮榿構槍槓槤槧槨槮槳樁樂樅樓標樞樣樸樹樺橈橋機橢橫檁檉檔檜檟檢檣檮檯檳檸檻櫂櫃櫓櫚櫛櫝櫞櫟櫥櫧櫨櫪櫫櫬櫱櫳櫸櫺櫻欄權欏欒欖欞欽歎歐歟歡歲歷歸歿殀殘殞殤殫殭殮殯殲殺殼殽毀毆毿氂氈氌氣氫氬氳氷決沒沖況洶浹涇涖涼淒淚淥淨淪淵淶淺渙減渦測渾湊湞湧湯湼溈準溝溫滄滅滌滎滙滬滯滲滷滸滻滾滿漁漚漢漣漬漲漵漸漿潁潑潔潛潤潯潰潷潿澀澆澇澗澠澤澮澱濁濃濕濘濛濟濤濫濬濰濱濺濼濾瀅瀆瀉瀋瀏瀕瀘瀝瀟瀠瀦瀧瀨瀰瀲瀾灃灄灑灕灘灝灣灤灩災為烏烴無煉煒煗煙煢煥煩煬熒熗熱熲熾燁燈燉燒燙燜營燦燬燭燴燼燾爍爐爛爭爲爺爾牆牎牕牘牽犖犢犧狀狹狽猙猶猻獁獄獅獎獘獨獪獫獮獰獲獵獷獸獺獻獼玀玨珮現琊琹琺琿瑋瑣瑤瑩瑪瑯璉璣璦環璵璽瓊瓏瓔瓚甌甕產甦甯畝畢畫畬異當疇疉疊痙痺痾瘂瘋瘍瘓瘞瘡瘧瘮瘲瘻療癆癇癉癒癘癟癡癢癤癥癧癩癬癭癮癰癱癲發皐皚皰皸皺盃盌盜盞盡監盤盧盪眥眾睏睜睞睺瞇瞘瞜瞞瞼矇矓矚矯砲硃硤硨硯碩碭碸確碼磑磚磣磧磯磽礄礆礎礙礡礦礪礫礬礱祿禍禎禑禕禡禦禪禮禰禱禿秌稅稈稜稟稭種稱稾穀穌積穎穠穡穢穩穫窓窩窪窮窯窰窶窺窻竄竅竇竊竝竪競筆筍筧筩筯箇箋箏節範築篋篔篤篩篳簀簍簞簡簣簫簷簹簽簾籃籌籐籖籙籛籜籟籠籤籩籪籬籮籲粃粵粺糝糞糧糭糰糲糴糶糾紀紂約紅紆紇紈紉紋納紐紓純紕紗紘紙級紛紜紝紡紥紮細紱紲紳紵紹紺紼紿絀終組絆絎結絕絛絝絞絡絢給絨統絲絳絶絹綁綃綆綈綉綏綑經綜綠綢綣綫綬維綰綱網綴綵綸綹綺綻綽綾綿緄緇緊緋緒緔緗緘緙線緜緝緞締緡緣緥緦編緩緬緯緱緲練緶緹緻縈縉縊縋縐縑縗縛縝縞縟縣縧縫縭縮縱縲縴縵縶縷縹總績繃繅繆繒織繕繚繞繡繢繩繪繫繭繮繯繰繳繹繼繽繾纈纉纊續纍纏纓纔纖纘纜缽罈罌罰罵罷羅羆羈羋羗羥羨義翄習翬翹翺耑耬聖聞聯聰聲聳聵聶職聹聽聾肅肈脅脈脛脫脹腎腖腡腦腫腳腸膃膆膕膚膠膩膽膾膿臉臍臏臕臘臚臟臠臢臥臨臺與興舉舊舖舘艙艤艦艫艱艶艷芻苧茲荊荳莊莖莢莧菕華萇萊萬萵萹葉葒葦葯葷蒓蒔蒞蒼蓀蓋蓮蓯蓴蓽蔔蔞蔣蔥蔦蔭蔴蕁蕆蕎蕒蕓蕕蕘蕢蕩蕪蕭蕷薈薊薌薑薔薙薦薩薺藍藎藝藥藪藶藹藺蘄蘆蘇蘊蘋蘚蘞蘢蘭蘺蘿處虛虜號虧虯虵蛺蛻蜆蝕蝟蝦蝨蝯蝱蝸螄螞螢螮螻螿蟄蟈蟎蟣蟬蟯蟲蟶蟻蠅蠆蠍蠐蠑蠔蠟蠣蠨蠭蠱蠶蠻衂衆衊術衕衚衛衝袞裊裌裏補裝裡製複褌褘褭褲褳褸褻襃襇襖襝襠襤襪襯襲襴覇覈見規覓視覘覡覥覦親覬覯覲覷覺覽覿觀觴觶觸訂訃計訊訌討訐訓訕訖託記訛訝訟訢訣訥訪設許訴訶診註証詁詆詎詐詒詔評詖詘詛詞詠詡詢詣試詧詩詫詬詭詮詰話該詳詵詼詿誄誅誆誇誌認誑誒誕誖誘誚語誠誡誣誤誥誦誨說誰課誶誹誼誾調諂諄談諉請諍諏諑諒論諗諛諜諞諠諡諢諤諦諧諫諭諮諱諳諶諷諸諺諼諾謀謁謂謄謅謊謌謎謐謔謖謗謙謚講謝謠謨謫謬謳謹謾譁證譌譎譏譔譖識譙譚譜譟譫譭譯議譴護譽譾讀變讋讌讎讒讓讕讖讚讜讞谿豈豎豐豬貍貎貓貙貝貞負財貢貧貨販貪貫責貯貰貲貳貴貶買貸貺費貼貽貿賀賁賂賃賄賅資賈賊賑賒賓賕賙賚賜賞賠賡賢賣賤賦賧質賫賬賭賴賺賻購賽賾贄贅贇贈贊贍贏贐贓贔贖贗贛趕趙趨趲跡踐踰踴蹌蹕蹟蹣蹤蹺躂躉躊躋躍躑躒躓躕躚躛躡躥躦躪躳軀車軋軌軍軒軔軛軟軫軲軸軹軺軻軼軾較輅輇載輊輒輓輔輕輙輛輜輝輞輟輥輦輩輪輯輳輸輻輾輿轀轂轄轅轆轉轍轎轔轟轡轢轤辠辢辤辦辭辮辯農迴逕這連週進遉遊運過達違遙遜遞遠適遲遶遷選遺遼邁還邇邊邏邐郟郵鄆鄉鄒鄔鄖鄧鄭鄰鄲鄴鄶鄺酈醃醜醞醫醬醱釀釁釃釅釋釐釓釔釕釗釘釙針釣釤釦釧釩釵釷釹釺鈀鈁鈄鈈鈉鈍鈎鈐鈑鈔鈕鈞鈣鈥鈦鈧鈮鈰鈳鈴鈷鈸鈹鈺鈽鈾鈿鉀鉅鉈鉉鉋鉍鉑鉕鉗鉚鉛鉞鉢鉤鉦鉬鉭鉸鉺鉻鉿銀銃銅銑銓銖銘銚銛銜銠銣銥銦銨銩銪銫銬銲銳銷銹銻銼銾鋁鋃鋅鋇鋌鋏鋒鋝鋟鋤鋥鋦鋨鋩鋪鋮鋯鋰鋱鋸鋻鋼錁錄錆錇錈錏錐錒錕錘錙錚錛錟錠錡錢錦錨錫錮錯錳錶錸鍀鍁鍆鍇鍊鍋鍍鍔鍘鍛鍠鍤鍥鍩鍬鍰鍵鍶鍺鍼鍾鎂鎄鎇鎊鎔鎖鎘鎛鎡鎢鎣鎦鎧鎩鎪鎬鎮鎰鎳鎵鎿鏃鏇鏈鏌鏐鏑鏗鏘鏜鏝鏞鏟鏡鏢鏤鏨鏰鏵鏷鏹鐃鐐鐒鐓鐔鐘鐙鐝鐠鐦鐧鐨鐫鐮鐲鐳鐵鐸鐺鐿鑄鑊鑌鑑鑒鑔鑞鑠鑣鑥鑭鑰鑲鑷鑼鑽鑾鑿钁長門閂閃閆閈閉開閌閎閏閑間閔閘閡閣閥閧閨閩閫閬閭閱閶閹閻閼閽閾閿闃闆闇闈闊闋闌闍闐闓闔闕闖關闞闡闢闥陘陝陞陣陰陳陸陽隂隄隉隊階隕際隨險隱隴隸隻雋雖雙雛雜雞離難雲電霑霧霽靂靄靆靈靉靚靜靦靨鞌鞏鞦韁韃韆韉韋韌韓韙韜韞韮韻響頁頂頃項順頇須頊頌頎頏預頑頒頓頗領頜頡頤頦頭頰頴頷頸頹頻顆顇題額顎顏顒顓顔願顙顛類顢顥顧顫顬顯顰顱顳顴風颭颮颯颱颳颶颺颻颼飄飆飈飛飜飢飤飩飪飫飭飯飱飲飴飼飽飾餃餄餅餉養餌餎餑餒餓餔餕餘餛餜餞餡館餬餱餳餵餺餻餼餽餾餿饁饃饅饈饉饊饋饌饍饑饒饗饜饞饢馬馭馮馱馳馴駁駐駑駒駔駕駘駙駛駝駟駡駢駭駮駰駱駸駿騁騅騍騎騏騖騙騫騭騮騰騶騷騸騾驀驁驂驃驄驅驊驍驏驕驗驚驛驟驢驤驥驪骯髈髏髐髒體髕髖髮鬆鬍鬚鬢鬥鬧鬨鬩鬪鬭鬮鬱魎魘魚魯魴魷鮁鮃鮌鮍鮎鮐鮑鮒鮚鮞鮪鮫鮭鮮鯀鯁鯇鯉鯊鯔鯖鯗鯛鯡鯢鯤鯧鯨鯪鯫鯰鯷鯽鯿鰂鰆鰈鰉鰍鰐鰒鰓鰟鰣鰥鰨鰩鰭鰮鰱鰲鰳鰷鰹鰻鰾鱅鱈鱉鱒鱓鱔鱖鱗鱘鱝鱟鱣鱧鱭鱷鱸鱺鳥鳧鳩鳳鳴鳶鴆鴇鴉鴒鴕鴛鴝鴞鴟鴣鴦鴨鴯鴰鴴鴻鴿鵂鵑鵒鵓鵜鵝鵞鵠鵡鵪鵬鵯鵰鵲鵾鶇鶉鶓鶖鶘鶚鶩鶯鶲鶴鶺鶻鶼鷀鷂鷈鷉鷓鷖鷗鷙鷥鷦鷯鷲鷳鷴鷸鷹鷺鷽鷿鸊鸕鸚鸛鸝鸞鹵鹹鹺鹼鹽麗麥麩麪麵麼黃黌點黨黲黴黷黽黿鼉鼕鼴齊齋齎齏齒齔齙齜齟齠齡齣齦齪齬齲齶齷龍龎龐龔龕龜";
static final String SIMP = "㑩㝉㧑䁖䌽䗖䜣䜩䝙䲠䴘䴙与专业丛东丝丢两严丧个丬临为丽举义乌乐乔习乡书买乱争亏亘亚产亩亲亵亸亿仅从仑仓仪们价众优会伛伞伟传伤伥伦伧伪伫体佥侠侣侥侦侧侨侩侪侬俣俦俨俩俪俭债倾偬偻偾偿傥傧储傩儿兑兖党兰关兴兹养兽冁内冈册写军农冯冲决况冻净凉减凑凛凤凫凭凯击凿刍刘则刚创删别刭刹刽刿剀剂剐剑剥剧劝办务劢动励劲劳势勋匀匦匮区医华协单卖卢卤卧卫却卺厅历厉压厌厍厐厕厢厣厦厨厩县参叆叇双发变叙叠号叹叽吓吕吗吣吨听启吴呐呒呓呕呖呗员呙呛呜咏咙咛咝响哑哒哓哔哕哗哙哜哝哟唛唠唢唤啀啧啬啭啮啯啰啴啸喷喽喾嗫嗳嘘嘤嘱噜嚣团园囱围囵国图圆圣圹场坏块坚坛坜坝坞坟坠垄垅垆垒垦垩垫垭垲垴埘埙埚堑堕墙壮声壳壶壸处备复够头夹夺奁奂奋奖奥妆妇妈妩妪妫姗姹娄娅娆娇娈娱娲娴婴婵婶媪嫒嫔嫱嬷孙学孪宝实宠审宪宫宽宾寝对寻导寿将尔尘尝尧尴尽层屃屉届属屡屦屿岁岂岖岗岘岙岚岛岭岽岿峄峡峤峥峦崂崃崭嵘嵚嵝巅巩币帅师帏帐帜带帧帮帱帻帼幂并幺广庆庐庑库应庙庞废廪开异弃弑张弥弪弯弹强归当录彝彦彻径徕忆忏忧忾怀态怂怃怄怅怆怜总怼怿恋恳恶恸恹恺恻恼恽悦悫悬悭悯惊惧惨惩惫惬惭惮惯愠愤愦愿慑懑懒懔戆戋戏戗战戬户扑执扩扪扫扬扰抚抛抟抠抡抢护报担拟拢拣拥拦拧拨择挂挚挛挝挞挟挠挡挢挣挤挥挦捞损捡换捣掳掴掷掸掺掼揽揾揿搀搁搂搅搒携摄摅摆摇摈摊撄撑撵撷撸撺擞攒敌敛数敳斋斓斩断无旧时旷旸昙昼昽显晋晓晔晕晖暂暧术机杀杂权条来杨杩极构枞枢枣枥枧枨枪枫枭柠柽栀栅标栈栉栊栋栌栎栏树栖样栾桠桡桢档桤桥桦桧桨桩梦梼梿检棂椁椟椠椭椮楼榄榇榈榉槚槛槟槠横樯樱橥橱橹橼檩欢欤欧歼殁殇残殒殓殚殡殴毁毂毕毙毡毵氇气氢氩氲汇汉汤汹沟没沣沤沥沦沧沩沪泞泪泷泸泺泻泼泽泾洁洒浃浅浆浇浈浊测浍济浏浐浑浒浓浔涛涝涞涟涠涡涣涤润涧涨涩渊渌渍渎渐渑渔渖渗温湾湿溃溅溆滗滚滞滟滠满滢滤滥滦滨滩潆潇潋潍潜潴澜濑濒灏灭灯灵灾灿炀炉炖炜炝点炼炽烁烂烃烛烟烦烧烨烩烫烬热焕焖焘爱爷牍牦牵牺犊状犷犸犹狈狝狞独狭狮狯狰狱狲猃猎猕猡猪猫猬献獭玑玙玛玮环现玺珏珐珑珲琏琐琼瑶瑷璎瓒瓯电画畅畲畴疖疗疟疠疡疬疭疮疯疱疴痈痉痖痨痪痫痹瘅瘆瘗瘘瘪瘫瘾瘿癞癣癫皑皱皲盏盐监盖盗盘眍眬着睁睐睑瞒瞩矫矶矾矿砀码砖砗砚砜砺砻砾础硕硖硗硙硚硷碍碛碜碱礴礼祃祎祢祦祯祷祸禀禄禅秃秆种积称秽秾税稣稳穑穷窃窍窑窜窝窥窦窭竖竞笃笋笔笕笺笼笾筚筛筜筝筹筼签简箓箦箧箨箩箪箫篑篓篮篯篱簖籁籴类粜粝粤粪粮糁糇紧絷纠纡红纣纤纥约级纨纩纪纫纬纭纮纯纰纱纲纳纴纵纶纷纸纹纺纻纽纾线绀绁绂练组绅细织终绉绊绋绌绍绎经绐绑绒结绔绕绗绘给绚绛络绝绞统绠绡绢绣绥绦继绨绩绪绫续绮绯绰绱绲绳维绵绶绷绸绺绻综绽绾绿缀缁缂缃缄缅缆缇缈缉缋缌缎缏缑缒缓缔缕编缗缘缙缚缛缜缝缞缟缠缡缢缣缤缥缦缧缨缩缪缫缬缭缮缯缰缱缲缳缴缵罂网罗罚罢罴羁羟羡翘翚耧耸耻聂聋职聍联聩聪肃肠肤肮肾肿胀胁胆胜胧胨胪胫胶脉脍脏脐脑脓脔脚脱脶脸腘腭腻腼腽腾膑臜舆舣舰舱舻艰艳艺节芈芗芜芦芲苁苇苈苋苌苍苎苏茎茏茑茔茕茧荆荐荚荛荜荞荟荠荡荣荤荥荦荧荨荩荪荫荬荭药莅莱莲莳莴获莸莹莺莼萝萤营萦萧萨葱蒇蒉蒋蒌蓝蓟蓠蓣蓥蓦蔷蔹蔺蔼蕲蕴薮藓蘖虏虑虚虫虬虮虽虾虿蚀蚁蚂蚕蚝蚬蛊蛎蛏蛮蛰蛱蛲蛳蛴蜕蜗蜡蝇蝈蝉蝼蝾螀螨蟏衅衔补衬衮袄袅袆袜袭装裆裈裢裣裤裥褛褴襕见观规觅视觇览觉觊觋觌觍觎觏觐觑觞触觯訚詟誉誊讠计订讣认讥讦讧讨让讪讫讬训议讯记讲讳讴讵讶讷许讹论讼讽设访诀证诂诃评诅识诈诉诊诋诌词诎诏诐译诒诓诔试诖诗诘诙诚诛诜话诞诟诠诡询诣诤该详诧诨诩诫诬语诮误诰诱诲诳说诵诶请诸诹诺读诼诽课诿谀谁谂调谄谅谆谇谈谊谋谌谍谎谏谐谑谒谓谔谕谖谗谘谙谚谛谜谝谟谠谡谢谣谤谥谦谧谨谩谪谫谬谭谮谯谰谱谲谳谴谵谶贝贞负贡财责贤败账货质贩贪贫贬购贮贯贰贱贲贳贴贵贶贷贸费贺贻贼贽贾贿赀赁赂赃资赅赆赇赈赉赊赋赌赍赎赏赐赑赒赓赔赕赖赘赙赚赛赜赝赞赟赠赡赢赣赵赶趋趱趸跃跄跞践跶跷跸跹跻踌踪踬踯蹑蹒蹰蹿躏躗躜躯车轧轨轩轫转轭轮软轰轱轲轳轴轵轶轸轹轺轻轼载轾轿辁辂较辄辅辆辇辈辉辊辋辍辎辏辐辑辒输辔辕辖辗辘辙辚辞辩辫边辽达迁过迈运还这进远违连迟迩迳选逊递逦逻遗遥邓邝邬邮邹邺邻郏郐郑郓郦郧郸酝酦酱酽酾酿释鉴銮錾钆钇针钉钊钋钌钍钎钏钐钒钓钔钕钗钙钚钛钜钝钞钟钠钡钢钣钤钥钦钧钨钩钪钫钬钭钮钯钰钱钲钳钴钵钶钷钸钹钺钻钼钽钾钿铀铁铂铃铄铅铆铇铈铉铊铋铌铍铎铐铑铒铓铔铕铖铗铙铛铜铝铟铠铡铢铣铤铥铦铧铨铩铪铫铬铭铮铯铰铱铲铳铵银铷铸铹铺铼铽链铿销锁锂锃锄锅锆锇锈锉锊锋锌锎锏锐锑锒锓锔锕锖锗锘错锚锛锜锝锞锟锡锢锣锤锥锦锨锩锫锬锭键锯锰锱锲锴锵锶锷锸锹锺锻锼锽锾锿镀镁镂镃镄镅镆镇镈镉镊镌镍镎镏镐镑镒镓镔镕镖镗镘镚镛镜镝镞镟镠镡镢镣镤镥镦镧镨镪镫镬镭镯镰镱镲镳镴镶长门闩闪闫闬闭问闯闰闱闲闳间闵闶闷闸闹闺闻闼闽闾闿阀阁阂阃阄阅阆阇阈阉阊阋阌阍阎阏阐阑阒阔阕阖阗阙阚队阳阴阵阶际陆陇陈陉陕陧陨险随隐隶隽难雏雠雳雾霁霭靓静靥鞑鞯韦韧韩韪韫韬韵页顶顷顸项顺须顼顽顾顿颀颁颂颃预颅领颇颈颉颊颌颍颎颏颐频颓颔颕颖颗题颙颚颛颜额颞颟颠颡颢颤颥颦颧风飏飐飑飒飓飕飖飘飙飚飞飨餍饥饧饨饩饪饫饬饭饮饯饰饱饲饴饵饶饷饸饹饺饼饽饿馀馁馂馃馄馅馆馈馊馋馌馍馎馏馐馑馒馓馔馕马驭驮驯驰驱驳驴驵驶驷驸驹驺驻驼驽驾驿骀骁骂骃骄骅骆骇骈骊骋验骎骏骐骑骒骓骖骗骘骚骛骜骝骞骟骠骡骢骣骤骥骧髅髇髋髌鬓魇魉鱼鱿鲁鲂鲃鲆鲇鲈鲋鲍鲎鲏鲐鲑鲒鲔鲕鲗鲚鲛鲜鲞鲟鲠鲡鲢鲣鲤鲥鲦鲧鲨鲩鲫鲭鲮鲰鲱鲲鲳鲵鲶鲷鲸鲻鲼鲽鳀鳁鳃鳄鳅鳆鳇鳊鳌鳍鳎鳏鳐鳑鳓鳔鳕鳖鳗鳙鳜鳝鳞鳟鳢鳣鸟鸠鸡鸢鸣鸥鸦鸨鸩鸪鸫鸬鸭鸮鸯鸰鸱鸲鸳鸴鸵鸶鸷鸸鸹鸺鸻鸽鸾鸿鹁鹂鹃鹄鹅鹆鹇鹈鹉鹊鹋鹌鹍鹎鹏鹑鹕鹗鹘鹙鹚鹜鹞鹟鹡鹣鹤鹥鹦鹧鹩鹪鹫鹬鹭鹰鹳鹾麦麸黄黉黩黪黾鼋鼍鼹齐齑齿龀龃龄龅龆龇龈龉龊龋龌龙龚龛龟";
static final Pattern PATTERN_CHINESE = Pattern.compile("[\\u4E00-\\u9FA5]");
public static boolean isCjk(String s) {
return PATTERN_CHINESE.matcher(s).find();
}
public static boolean isSimplified(String s) {
for (char c : s.toCharArray() ) {
if (SIMP.indexOf(c) != -1) {
return true;
}
}
return false;
}
public static boolean isTraditional(String s) {
for (char c : s.toCharArray() ) {
if (TRAD.indexOf(c) != -1) {
return true;
}
}
return false;
}
public static void main(String[] args) {
for (String s : new String[]{
"汉字",
"漢字",
"他的兒子在學校",
"繁簡轉換器 343",
"繁简转换器 46"
}) {
System.out.print(s + ": ");
if (isTraditional(s)) {
System.out.println("Traditional");
} else if (isSimplified(s)) {
System.out.println("Simplified");
} else if (isCjk(s)) {
System.out.println("Generic Hanzi");
} else {
System.out.println("unknown??");
}
}
}
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment