Last active August 29, 2015 13:58
The Twitter emoji scanner


This was used to find all possible URLs of Twitter's emoji images.
The resulting list feeds BetterTweetDeck's emoji replacement script.

Do it at home

First, export all characters from /System/Library/Input Methods/ (column uchr of table unihan_dict) as CSV (SQLite could also have been queried directly from Node.js). This file contains all Unicode characters (54072).

This CSV is then parsed with Node.js, which outputs the corresponding URL for each character.
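
As a rough sketch of this step, assuming a modern Node.js where `String.prototype.codePointAt` is available (the original script predates it and decodes surrogate pairs by hand), a character can be turned into the hyphen-joined hex name used for the image file. The helper name `toEmojiFileName` is hypothetical, and the base URL is left out because it is not given here.

```javascript
// Hypothetical helper: map a character (or short sequence) to a hex file name.
function toEmojiFileName(text) {
  // Array.from iterates by code point, so surrogate pairs stay together.
  return Array.from(text, function (ch) {
    return ch.codePointAt(0).toString(16);
  }).join('-') + '.png';
}

console.log(toEmojiFileName('\u{1F600}'));          // 1f600.png
console.log(toEmojiFileName('\u{1F1EB}\u{1F1F7}')); // 1f1eb-1f1f7.png
```

The Twitter-specific replacements in the script further down would still have to be applied to the result.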

To find all the emojis from there, test every URL (example, another, with parallelism) and remove those that return 404. Be gentle: use HEAD requests.
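
One possible way to run that check from Node.js is sketched below. It is not the approach behind the linked examples: it assumes Node 10.9 or later (for `https.request(url, options, callback)`) and async/await support. The names `headStatus` and `filterExisting`, and the `limit` parameter, are made up for illustration. The checker is injectable so the filtering logic can be tried without touching the network.

```javascript
const https = require('https');

// Resolve to the HTTP status code of a HEAD request (no body is downloaded).
function headStatus(url) {
  return new Promise(function (resolve, reject) {
    const req = https.request(url, { method: 'HEAD' }, function (res) {
      res.resume(); // discard any data so the socket is released
      resolve(res.statusCode);
    });
    req.on('error', reject);
    req.end();
  });
}

// Keep only URLs that do not answer 404, with at most `limit` checks in flight.
// `check` defaults to headStatus (hypothetical injection point for offline runs).
async function filterExisting(urls, limit, check) {
  check = check || headStatus;
  const kept = new Array(urls.length);
  let next = 0;

  async function worker() {
    while (next < urls.length) {
      const i = next++;
      const status = await check(urls[i]);
      kept[i] = status !== 404;
    }
  }

  const workers = [];
  for (let w = 0; w < Math.min(limit, urls.length); w++) {
    workers.push(worker());
  }
  await Promise.all(workers);
  return urls.filter(function (_, i) { return kept[i]; });
}

// Usage (candidateUrls is a hypothetical array of generated URLs):
// filterExisting(candidateUrls, 4).then(function (found) { console.log(found); });
```

A small `limit` keeps the load on the remote server low, in the spirit of "be gentle".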

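var fs = require('fs');
var csv = require('csv'); // assumes the legacy csv 0.x chaining API (csv().from / .to)

// Convert a string to an array of Unicode code points,
// combining each UTF-16 surrogate pair into a single value.
String.prototype.toCodePoints = function() {
    var chars = [];
    for (var i = 0; i < this.length; i++) {
        var c1 = this.charCodeAt(i);
        if (c1 >= 0xD800 && c1 < 0xDC00 && i + 1 < this.length) {
            var c2 = this.charCodeAt(i + 1);
            if (c2 >= 0xDC00 && c2 < 0xE000) {
                chars.push(0x10000 + ((c1 - 0xD800) << 10) + (c2 - 0xDC00));
                i++; // skip the low surrogate that was just consumed
                continue;
            }
        }
        chars.push(c1);
    }
    return chars;
};

csv()
.from.path(__dirname+'/uchr.txt', { delimiter: ',', escape: '"' })
.to.stream(fs.createWriteStream(__dirname+'/sample.out')) // assumed output call
.transform( function(row) {
    var img = new String(row)
        .toCodePoints()
        .map(function(str) {
            return str.toString(16);
        })
        .join('-'); // separator assumed from the '-fe0f' pattern below
    img = img.replace(/^-+|-+$/gm, '/');
    // Twitter specific replacements, no idea why they do that
    img = img.replace(/-fe0f/g,'').replace(/([\d])fe0f/g,"3$1").replace(/#fe0f/g,"23");
    return '' + img + '.png\n';
})
.on('close', function(count){
    // when writing to a file, use the 'close' event
    // the 'end' event may fire before the file has been written
    console.log('Number of lines: '+count);
})
.on('error', function(error){
    console.log(error.message); // assumed handler body
});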
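
To see what the surrogate-pair arithmetic in `toCodePoints` computes, here is a small worked check. It assumes a runtime with `String.prototype.codePointAt` for comparison, and the function name `decodeSurrogatePair` is made up for this example.

```javascript
// Decode one UTF-16 surrogate pair by hand, using the same formula as the script.
function decodeSurrogatePair(high, low) {
  return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
}

var s = '\uD83D\uDE00'; // U+1F600 stored as two UTF-16 code units
var manual = decodeSurrogatePair(s.charCodeAt(0), s.charCodeAt(1));

// 0x10000 + (0x3D << 10) + 0x200 = 0x10000 + 0xF400 + 0x200 = 0x1F600
console.log(manual.toString(16));         // 1f600
console.log(manual === s.codePointAt(0)); // true
```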