Artoria2e5/gccs-non-verifiable-remap.md

## gccs-non-verifiable-remap.md

      
This gist maps the "not verifiable" GCCS characters documented in
HKSCS-2004 Annex IV
to Unihan or U-source records.
The mappings to U-source characters should be mostly correct, because their
Adobe-CNS1 sources fall within the range of Adobe-CNS1-1, a supplement created
for GCCS support.
This voluntary enrichment activity is followed by cake rewards. The generated
mapping is released into the public domain. For a more authoritative description
of the characters, written in IDS, please refer to
TR45, which was found accidentally by
googling the IDS sequence for 9FE6 on this page.
You can find a font that supports these code points at
GlyphWiki.


| GCCS | EUDC PUA | Unicode 9.0 |
|------|----------|-------------|
| 9EAC | U+ED2B | U+8C9B |
| 9EC4 | U+ED43 | UTC-00877 |
| 9EF4 | U+ED73 | UTC-00879 |
| 9F4E | U+ED8C | UTC-00880 |
| 9FAD | U+EDC9 | UTC-00882 |
| 9FB1 | U+EDCD | U+2B473 |
| 9FC0 | U+EDDC | UTC-00883 |
| 9FC8 | U+EDE4 | UTC-00884 |
| 9FDA | U+EDF6 | UTC-00886 |
| 9FE6 | U+EE02 | UTC-00887 |
| 9FEA | U+EE06 | UTC-00888 |
| 9FEF | U+EE0B | UTC-00889 |
| A054 | U+EE2F | UTC-00890 |
| A057 | U+EE32 | U+2AE67 |
| A05A | U+EE35 | UTC-00891 |
| A062 | U+EE3D | UTC-00892 |
| A072 | U+EE4D | UTC-00893 |
| A0A5 | U+EE5E | UTC-00894 |
| A0AD | U+EE66 | UTC-00895 |
| A0AF | U+EE68 | UTC-00896 |
| A0D3 | U+EE8C | UTC-00897 |
| A0E1 | U+EE9A | UTC-00898 |
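
As a rough illustration of how the table might be consumed, the sketch below builds a translation map from EUDC PUA code points to encoded Unicode characters. Only the rows whose target is an actual code point can be remapped this way; rows pointing at a U-source ID (`UTC-xxxxx`) are skipped. The names `ROWS`, `build_pua_remap`, and `remap_text` are hypothetical and not part of this gist, and only a few rows are copied in for brevity.

```python
# A few rows copied from the table: GCCS code, EUDC PUA, Unicode 9.0 target.
ROWS = """\
9EAC U+ED2B U+8C9B
9EC4 U+ED43 UTC-00877
9FB1 U+EDCD U+2B473
A057 U+EE32 U+2AE67
"""


def build_pua_remap(rows: str) -> dict:
    """Return {PUA code point: replacement character} for encoded targets only."""
    remap = {}
    for line in rows.splitlines():
        _gccs, pua, target = line.split()
        if not target.startswith("U+"):
            continue  # U-source IDs (UTC-xxxxx) have no code point to emit
        remap[int(pua[2:], 16)] = chr(int(target[2:], 16))
    return remap


PUA_REMAP = build_pua_remap(ROWS)


def remap_text(text: str) -> str:
    """Replace known PUA characters; everything else passes through unchanged."""
    return text.translate(PUA_REMAP)
```

Characters without an encoded target, such as U+ED43, are left untouched, so a caller can still detect them afterwards.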


### Useful GlyphWiki Links

- Prefix
- Font generation
- idsedit

### Big5-EUDC to PUA

See http://kanji-database.sourceforge.net/charcode/big5.html.
```python
def big5_eudc_pua(byteseq: str):
    """Map a Big5 EUDC code given as four hex digits (e.g. "9EAC") to its PUA code point.

    Returns None when the code lies outside the EUDC ranges.
    """
    H = int(byteseq[0:2], 16)  # lead byte
    L = int(byteseq[2:4], 16)  # trail byte
    if L < 0x40 or 0x7e < L < 0xa1 or L == 0xff:
        raise ValueError(byteseq)  # Not valid Big5

    # Each lead byte covers 157 trail bytes: 0x40-0x7E (63), then 0xA1-0xFE (94).
    _eudc_row = lambda L: (L - 0x40) if (L < 0x80) else (L - 0x62)
    if 0x81 <= H <= 0x8D:
        return 0xeeb8 + (157 * (H - 0x81)) + _eudc_row(L)
    elif 0x8E <= H <= 0xA0:
        return 0xe311 + (157 * (H - 0x8e)) + _eudc_row(L)
    elif (H >= 0xC7 or (H == 0xC6 and L >= 0xA1)) and H <= 0xC8:
        return 0xf672 + (157 * (H - 0xc6)) + _eudc_row(L)
    elif 0xFA <= H <= 0xFE:
        return 0xe000 + (157 * (H - 0xfa)) + _eudc_row(L)
    else:
        return None  # Not in an EUDC range
```
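
For going the other way, here is a minimal sketch of the inverse mapping, derived from the same base offsets and the same 157-codes-per-lead-byte layout as the function above. It is not part of the original gist: `pua_to_big5_eudc`, `ROW`, and `_RANGES` are hypothetical names, and the range table simply restates the constants already used above.

```python
ROW = 157  # trail bytes per lead byte: 0x40-0x7E (63) plus 0xA1-0xFE (94)

# (base PUA code point, first lead byte, number of lead bytes, first valid offset)
_RANGES = [
    (0xE000, 0xFA, 5, 0),    # FA40-FEFE
    (0xE311, 0x8E, 19, 0),   # 8E40-A0FE
    (0xEEB8, 0x81, 13, 0),   # 8140-8DFE
    (0xF672, 0xC6, 3, 63),   # C6A1-C8FE: the C6 row starts at trail byte 0xA1
]


def pua_to_big5_eudc(cp: int):
    """Return the Big5 EUDC code as four hex digits, or None if cp is not mapped."""
    for base, first_lead, n_leads, min_off in _RANGES:
        off = cp - base
        if min_off <= off < ROW * n_leads:
            row, col = divmod(off, ROW)
            # Columns 0-62 are trail bytes 0x40-0x7E; 63-156 are 0xA1-0xFE.
            trail = col + 0x40 if col < 63 else col + 0x62
            return f"{first_lead + row:02X}{trail:02X}"
    return None


print(pua_to_big5_eudc(0xED2B))  # first row of the table
```

Checking a few table rows in both directions is a cheap way to catch an off-by-one in either function.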