@siqin
Created December 4, 2012 07:57
Remove Emoji in NSString
// Xcode 4.2.1
@implementation NSString (EmojiExtension)

- (NSString *)removeEmoji {
    NSMutableString *temp = [NSMutableString string];
    [self enumerateSubstringsInRange:NSMakeRange(0, [self length])
                             options:NSStringEnumerationByComposedCharacterSequences
                          usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
        const unichar hs = [substring characterAtIndex:0];
        // The length check avoids an out-of-range read on an unpaired high surrogate.
        if (0xd800 <= hs && hs <= 0xdbff && [substring length] > 1) {
            // Surrogate pair: combine the high and low code units into one code point.
            const unichar ls = [substring characterAtIndex:1];
            const int uc = ((hs - 0xd800) * 0x400) + (ls - 0xdc00) + 0x10000;
            // Drop U+1D000-1F77F.
            [temp appendString:(0x1d000 <= uc && uc <= 0x1f77f) ? @"" : substring];
        } else {
            // Non-surrogate character: drop U+2100-26FF.
            [temp appendString:(0x2100 <= hs && hs <= 0x26ff) ? @"" : substring];
        }
    }];
    return temp;
}

@end
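
The surrogate-pair arithmetic and the two range checks in `removeEmoji` can be tried in isolation. The following is a minimal sketch in plain C, which also compiles unchanged inside an Objective-C `.m` file; the helper names `decode_surrogate_pair` and `gist_would_remove` are hypothetical and not part of the gist.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: combine a UTF-16 high/low surrogate pair into a
   code point, using the same arithmetic as removeEmoji. */
static uint32_t decode_surrogate_pair(uint16_t hs, uint16_t ls) {
    return ((uint32_t)(hs - 0xD800) * 0x400) + (uint32_t)(ls - 0xDC00) + 0x10000;
}

/* Hypothetical helper: true if one of removeEmoji's two ranges
   (U+1D000-1F77F or U+2100-26FF) covers this code point. */
static bool gist_would_remove(uint32_t cp) {
    if (cp >= 0x10000) {
        return 0x1D000 <= cp && cp <= 0x1F77F;
    }
    return 0x2100 <= cp && cp <= 0x26FF;
}
```

For example, 😀 is stored as the code units 0xD83D 0xDE00, which decode to U+1F600 and fall inside the first range. A single-unit character such as 中 (U+4E2D) falls outside both ranges.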
@ninjitaru

I have no clue what this does with Chinese or Japanese text, but it works for all German letters.

I came across this gist, and I happen to have strings with Chinese text plus emoji. This code removes all the Chinese characters, because their strlen is 3 :)

@smorr

smorr commented Nov 16, 2022

A much simpler way is to use a string transform. This removes all emoji code points and preserves non-Latin characters, accents, etc.

E.g.
[@"🤯!!! ক❤️testé᏷🧡💚💛せぬ❤️‍🔥👩🏿‍🦰" stringByApplyingTransform: @"[:emoji:] remove" reverse:NO]

returns
!!! ক️testé᏷せぬ️‍‍

@smorr

smorr commented Nov 18, 2022

Just to follow up: apparently the [:emoji:] property used in the ICU transform includes digits, some punctuation, and other things not generally thought to be emoji.

I am finding that this method, in an NSString category, works better:

- (NSString *)stringByRemovingEmoji {
    static NSRegularExpression * regex = nil;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        // remove all emoji less those that are digits, punctuation, letters, latin 1 supplement or letter like symbols
        // or BIDI Non-Spacing Mark
        NSError * error = nil;
        regex = [NSRegularExpression regularExpressionWithPattern:@"([[:emoji:]--[:digit:]--[:punctuation:]--[:letter:]--[:block=Latin-1_sup:]--[:block=letter-like-symbols:]]|\\uFE0F)" options: 0 error:&error];
        // Check the return value rather than the error pointer, per Cocoa convention.
        if (regex == nil) {
            NSLog(@"Error forming regex: %@", error);
        }
    });
    
    return [regex stringByReplacingMatchesInString:self options:0 range:NSMakeRange(0, self.length) withTemplate:@""];
}
