IntegrityTest#test_"images on disk have no duplicates" [/Users/javan/Code/gemoji/test/integrity_test.rb:26]: These images share the same checksum: /emoji/unicode/25fc.png, /emoji/unicode/2b1b.png.
--
◼️ "BLACK MEDIUM SQUARE" Unicode: U+25FC U+FE0F, UTF-8: E2 97 BC EF B8 8F
⬛️ "BLACK LARGE SQUARE" Unicode: U+2B1B U+FE0F, UTF-8: E2 AC 9B EF B8 8F
(best viewed in Safari)
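The UTF-8 byte sequences listed above can be double-checked in plain Ruby. This is a minimal sketch; `utf8_hex` is a hypothetical helper name, not part of gemoji or ttfunk.

```ruby
# Hypothetical helper: encode the given code points as UTF-8 and
# return the resulting bytes as space-separated uppercase hex.
def utf8_hex(*codepoints)
  codepoints.pack("U*").bytes.map { |b| format("%02X", b) }.join(" ")
end

puts utf8_hex(0x25FC, 0xFE0F) # BLACK MEDIUM SQUARE + variation selector
puts utf8_hex(0x2B1B, 0xFE0F) # BLACK LARGE SQUARE + variation selector
```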
--
Find and confirm the glyph IDs for both code points:
>> ttf.cmap.unicode.first.code_map.select { |k, v| k == "25fc".to_i(16) }
=> {9724=>82}
>> ttf.postscript.glyph_for(82)
=> "u25FC"
>> ttf.cmap.unicode.first.code_map.select { |k, v| k == "2b1b".to_i(16) }
=> {11035=>163}
>> ttf.postscript.glyph_for(163)
=> "u2B1B"
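The same lookup can be driven from the image paths named in the test failure. This is a rough sketch only: `codepoint_for` is a hypothetical helper, and `CODE_MAP` is a plain hash standing in for `ttf.cmap.unicode.first.code_map`, filled with just the two entries seen above.

```ruby
# Hypothetical helper: derive the code point from an image filename
# such as "/emoji/unicode/25fc.png".
def codepoint_for(path)
  File.basename(path, ".png").to_i(16)
end

# Stand-in for the font's code map, limited to the two entries of interest.
CODE_MAP = { 9724 => 82, 11035 => 163 }.freeze

%w[/emoji/unicode/25fc.png /emoji/unicode/2b1b.png].each do |path|
  codepoint = codepoint_for(path)
  puts format("U+%04X => glyph %d", codepoint, CODE_MAP.fetch(codepoint))
end
```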
Added some logging to ttfunk to inspect the offset and byte length of each glyph's PNG data:
>> ttf.sbix.bitmap_data_for(82, 4)
"Offset and byte length for glyph_id 82: 7855049, 1658"
>> ttf.sbix.bitmap_data_for(163, 4)
"Offset and byte length for glyph_id 163: 8178441, 1658"
--
Glyphs 82 and 163 are the only two with identical PNG data:
require "digest"
# Map each PNG's MD5 digest to the glyph IDs that share it
md5s = {}
ttf.maximum_profile.num_glyphs.times do |glyph_id|
  if bitmap = ttf.sbix.bitmap_data_for(glyph_id, 4)
    digest = Digest::MD5.hexdigest(bitmap.data.read)
    md5s[digest] ||= []
    md5s[digest].push(glyph_id)
  end
end
>> md5s.keys.uniq.size
=> 845
>> md5s.select { |k, v| v.length > 1 }
=> {"34b981a2dd163f1cb8a453189edc446c"=>[82, 163]}
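The grouping step does not depend on ttfunk, so it can be sketched against any set of named byte strings. `duplicate_groups` is a hypothetical helper, and the sample data below is made up for illustration; it is not real PNG data.

```ruby
require "digest"

# Hypothetical helper: group named blobs by MD5 digest and keep only
# the digests shared by more than one name.
def duplicate_groups(blobs)
  blobs.group_by { |_name, data| Digest::MD5.hexdigest(data) }
       .transform_values { |pairs| pairs.map(&:first) }
       .select { |_digest, names| names.length > 1 }
end

# Placeholder bytes: the first and last entries are deliberately identical.
sample = {
  "25fc.png" => "same bytes",
  "2b50.png" => "other bytes",
  "2b1b.png" => "same bytes"
}

duplicate_groups(sample).each do |digest, names|
  puts "#{digest}: #{names.join(', ')}"
end
```

Pointing the same helper at `File.binread` output for each image on disk would be one way to reproduce the check in the failing test.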