@zmughal
Last active March 25, 2024 22:35
D.1 Latin Character Set and Encodings
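CHAR NAME STD MAC WIN PDF
(Character codes are octal. STD = StandardEncoding, MAC = MacRomanEncoding, WIN = WinAnsiEncoding, PDF = PDFDocEncoding. A dash marks a character that is absent from that encoding.)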
A A 101 101 101 101
Æ AE 341 256 306 306
Á Aacute — 347 301 301
 Acircumflex — 345 302 302
Ä Adieresis — 200 304 304
À Agrave — 313 300 300
Å Aring — 201 305 305
à Atilde — 314 303 303
B B 102 102 102 102
C C 103 103 103 103
Ç Ccedilla — 202 307 307
D D 104 104 104 104
E E 105 105 105 105
É Eacute — 203 311 311
Ê Ecircumflex — 346 312 312
Ë Edieresis — 350 313 313
È Egrave — 351 310 310
Ð Eth — — 320 320
€ Euro¹ — — 200 240
F F 106 106 106 106
G G 107 107 107 107
H H 110 110 110 110
I I 111 111 111 111
Í Iacute — 352 315 315
Î Icircumflex — 353 316 316
Ï Idieresis — 354 317 317
Ì Igrave — 355 314 314
J J 112 112 112 112
K K 113 113 113 113
L L 114 114 114 114
Ł Lslash 350 — — 225
M M 115 115 115 115
N N 116 116 116 116
Ñ Ntilde — 204 321 321
O O 117 117 117 117
ΠOE 352 316 214 226
Ó Oacute — 356 323 323
Ô Ocircumflex — 357 324 324
Ö Odieresis — 205 326 326
Ò Ograve — 361 322 322
Ø Oslash 351 257 330 330
Õ Otilde — 315 325 325
P P 120 120 120 120
Q Q 121 121 121 121
R R 122 122 122 122
S S 123 123 123 123
Š Scaron — — 212 227
T T 124 124 124 124
Þ Thorn — — 336 336
U U 125 125 125 125
Ú Uacute — 362 332 332
Û Ucircumflex — 363 333 333
Ü Udieresis — 206 334 334
Ù Ugrave — 364 331 331
V V 126 126 126 126
W W 127 127 127 127
X X 130 130 130 130
Y Y 131 131 131 131
Ý Yacute — — 335 335
Ÿ Ydieresis — 331 237 230
Z Z 132 132 132 132
Ž Zcaron² — — 216 231
a a 141 141 141 141
á aacute — 207 341 341
â acircumflex — 211 342 342
´ acute 302 253 264 264
ä adieresis — 212 344 344
æ ae 361 276 346 346
à agrave — 210 340 340
& ampersand 046 046 046 046
å aring — 214 345 345
^ asciicircum 136 136 136 136
~ asciitilde 176 176 176 176
* asterisk 052 052 052 052
@ at 100 100 100 100
ã atilde — 213 343 343
b b 142 142 142 142
\ backslash 134 134 134 134
| bar 174 174 174 174
{ braceleft 173 173 173 173
} braceright 175 175 175 175
[ bracketleft 133 133 133 133
] bracketright 135 135 135 135
˘ breve 306 371 — 030
¦ brokenbar — — 246 246
• bullet³ 267 245 225 200
c c 143 143 143 143
ˇ caron 317 377 — 031
ç ccedilla — 215 347 347
¸ cedilla 313 374 270 270
¢ cent 242 242 242 242
ˆ circumflex 303 366 210 032
: colon 072 072 072 072
, comma 054 054 054 054
© copyright — 251 251 251
¤ currency¹ 250 333 244 244
d d 144 144 144 144
† dagger 262 240 206 201
‡ daggerdbl 263 340 207 202
° degree — 241 260 260
¨ dieresis 310 254 250 250
÷ divide — 326 367 367
$ dollar 044 044 044 044
˙ dotaccent 307 372 — 033
ı dotlessi 365 365 — 232
e e 145 145 145 145
é eacute — 216 351 351
ê ecircumflex — 220 352 352
ë edieresis — 221 353 353
è egrave — 217 350 350
8 eight 070 070 070 070
… ellipsis 274 311 205 203
— emdash 320 321 227 204
– endash 261 320 226 205
= equal 075 075 075 075
ð eth — — 360 360
! exclam 041 041 041 041
¡ exclamdown 241 301 241 241
f f 146 146 146 146
ﬁ fi 256 336 — 223
5 five 065 065 065 065
ﬂ fl 257 337 — 224
ƒ florin 246 304 203 206
4 four 064 064 064 064
⁄ fraction 244 332 — 207
g g 147 147 147 147
ß germandbls 373 247 337 337
` grave 301 140 140 140
> greater 076 076 076 076
« guillemotleft⁴ 253 307 253 253
» guillemotright⁴ 273 310 273 273
‹ guilsinglleft 254 334 213 210
› guilsinglright 255 335 233 211
h h 150 150 150 150
˝ hungarumlaut 315 375 — 034
- hyphen⁵ 055 055 055 055
i i 151 151 151 151
í iacute — 222 355 355
î icircumflex — 224 356 356
ï idieresis — 225 357 357
ì igrave — 223 354 354
j j 152 152 152 152
k k 153 153 153 153
l l 154 154 154 154
< less 074 074 074 074
¬ logicalnot — 302 254 254
ł lslash 370 — — 233
m m 155 155 155 155
¯ macron 305 370 257 257
− minus — — — 212
μ mu — 265 265 265
× multiply — — 327 327
n n 156 156 156 156
9 nine 071 071 071 071
ñ ntilde — 226 361 361
# numbersign 043 043 043 043
o o 157 157 157 157
ó oacute — 227 363 363
ô ocircumflex — 231 364 364
ö odieresis — 232 366 366
œ oe 372 317 234 234
˛ ogonek 316 376 — 035
ò ograve — 230 362 362
1 one 061 061 061 061
½ onehalf — — 275 275
¼ onequarter — — 274 274
¹ onesuperior — — 271 271
ª ordfeminine 343 273 252 252
º ordmasculine 353 274 272 272
ø oslash 371 277 370 370
õ otilde — 233 365 365
p p 160 160 160 160
¶ paragraph 266 246 266 266
( parenleft 050 050 050 050
) parenright 051 051 051 051
% percent 045 045 045 045
. period 056 056 056 056
· periodcentered 264 341 267 267
‰ perthousand 275 344 211 213
+ plus 053 053 053 053
± plusminus — 261 261 261
q q 161 161 161 161
? question 077 077 077 077
¿ questiondown 277 300 277 277
" quotedbl 042 042 042 042
„ quotedblbase 271 343 204 214
“ quotedblleft 252 322 223 215
” quotedblright 272 323 224 216
‘ quoteleft 140 324 221 217
’ quoteright 047 325 222 220
‚ quotesinglbase 270 342 202 221
' quotesingle 251 047 047 047
r r 162 162 162 162
® registered — 250 256 256
˚ ring 312 373 — 036
s s 163 163 163 163
š scaron — — 232 235
§ section 247 244 247 247
; semicolon 073 073 073 073
7 seven 067 067 067 067
6 six 066 066 066 066
/ slash 057 057 057 057
space⁶ 040 040 040 040
£ sterling 243 243 243 243
t t 164 164 164 164
þ thorn — — 376 376
3 three 063 063 063 063
¾ threequarters — — 276 276
³ threesuperior — — 263 263
˜ tilde 304 367 230 037
™ trademark — 252 231 222
2 two 062 062 062 062
² twosuperior — — 262 262
u u 165 165 165 165
ú uacute — 234 372 372
û ucircumflex — 236 373 373
ü udieresis — 237 374 374
ù ugrave — 235 371 371
_ underscore 137 137 137 137
v v 166 166 166 166
w w 167 167 167 167
x x 170 170 170 170
y y 171 171 171 171
ý yacute — — 375 375
ÿ ydieresis — 330 377 377
¥ yen 245 264 245 245
z z 172 172 172 172
ž zcaron² — — 236 236
0 zero 060 060 060 060
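Because every code in the table is printed in octal, a lookup keyed by byte value needs a base-8 conversion first. The following is a minimal Python sketch of that step, assuming a hand-copied excerpt of four rows; the names `ROWS`, `COLUMNS`, and `build_decoder` are hypothetical and not part of the PDF reference.

```python
# Hypothetical excerpt of the table: (glyph name, STD, MAC, WIN, PDF).
# Codes are kept as the octal strings printed in the table;
# None stands for a dash (character absent from that encoding).
ROWS = [
    ("A",      "101", "101", "101", "101"),
    ("AE",     "341", "256", "306", "306"),
    ("Lslash", "350", None,  None,  "225"),
    ("emdash", "320", "321", "227", "204"),
]
COLUMNS = {"STD": 1, "MAC": 2, "WIN": 3, "PDF": 4}


def build_decoder(column):
    """Map integer byte value -> glyph name for one encoding column."""
    idx = COLUMNS[column]
    return {int(row[idx], 8): row[0] for row in ROWS if row[idx] is not None}


win = build_decoder("WIN")
std = build_decoder("STD")
print(win[0o306])  # glyph name at octal 306 in the WIN column
print(std[0o350])  # glyph name at octal 350 in the STD column
```

Keeping the codes as strings until `int(code, 8)` avoids accidentally reading a value such as 101 as decimal.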
1. In PDF 1.3, the euro character was added to the Adobe standard Latin character set. It
is encoded as 200 in WinAnsiEncoding and 240 in PDFDocEncoding, assigning codes
that were previously unused. Apple changed the Mac OS Latin-text encoding for code
333 from the currency character to the euro character. However, this incompatible
change has not been reflected in PDF’s MacRomanEncoding, which continues to map
code 333 to currency. If the euro character is desired, an encoding dictionary can be
used to specify this single difference from MacRomanEncoding.
2. In PDF 1.3, the existing Zcaron and zcaron characters were added to WinAnsiEncoding
as the previously unused codes 216 and 236.
3. In WinAnsiEncoding, all unused codes greater than 40 map to the bullet character.
However, only code 225 is specifically assigned to the bullet character; other codes are
subject to future reassignment.
4. The character names guillemotleft and guillemotright are misspelled. The correct
spelling for this punctuation character is guillemet. However, the misspelled names are the
ones actually used in the fonts and encodings containing these characters.
5. The hyphen character is also encoded as 255 in WinAnsiEncoding. The meaning of this
duplicate code is “soft hyphen,” but it is typographically the same as hyphen.
6. The space character is also encoded as 312 in MacRomanEncoding and as 240 in
WinAnsiEncoding. The meaning of this duplicate code is “nonbreaking space,” but it is
typographically the same as space.
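Note 1 mentions that an encoding dictionary can express the single euro difference from MacRomanEncoding. As a hedged sketch of what that could look like, the Python helper below formats such a dictionary as PDF syntax. The function name `encoding_with_differences` is hypothetical. The octal code from the table is converted to decimal here because integers in PDF syntax are written in decimal.

```python
def encoding_with_differences(base_encoding, differences):
    """Return PDF syntax for an encoding dictionary.

    `differences` maps an octal code string (as printed in the table)
    to the glyph name that should replace the base encoding's entry.
    """
    parts = []
    # Emit entries in ascending code order, each as "<decimal code> /<name>".
    for octal_code, glyph in sorted(differences.items(), key=lambda kv: int(kv[0], 8)):
        parts.append(f"{int(octal_code, 8)} /{glyph}")
    return (
        f"<< /Type /Encoding /BaseEncoding /{base_encoding} "
        f"/Differences [ {' '.join(parts)} ] >>"
    )


# Remap code 333 (octal) from currency to Euro, as described in note 1.
euro_mac = encoding_with_differences("MacRomanEncoding", {"333": "Euro"})
print(euro_mac)
```

The resulting dictionary would be referenced from a font's Encoding entry in place of the plain name MacRomanEncoding.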
zmughal commented Mar 10, 2013

From the PDF reference manual.
