Remove all traces of emoji from a text file.
#!/usr/bin/env python
"""
Remove emoji from a text file and print it to stdout.

Usage
-----
    python remove-emoji.py input.txt > output.txt
"""
import re
import sys


# https://stackoverflow.com/a/49146722/330558
def remove_emoji(string):
    emoji_pattern = re.compile("["
                               u"\U0001F600-\U0001F64F"  # emoticons
                               u"\U0001F300-\U0001F5FF"  # symbols & pictographs
                               u"\U0001F680-\U0001F6FF"  # transport & map symbols
                               u"\U0001F1E0-\U0001F1FF"  # flags (iOS)
                               u"\U00002702-\U000027B0"
                               u"\U000024C2-\U0001F251"  # very broad: also spans CJK, kana and Hangul text
                               "]+", flags=re.UNICODE)
    return emoji_pattern.sub(r'', string)


if __name__ == '__main__':
    # Assumes Python 3 and a UTF-8 encoded input file.
    with open(sys.argv[1], encoding='utf-8') as f:
        text = f.read()
    text = remove_emoji(text)
    print(text)

@Demonstrandum Demonstrandum commented Feb 25, 2020

do you hate fun?


@engle-xu engle-xu commented Mar 6, 2020

very useful!! thank you


@saaranshM saaranshM commented May 23, 2020

Here is the updated one:

def remove_emoji(string):
    emoji_pattern = re.compile("["
                               u"\U0001F600-\U0001F64F"  # emoticons
                               u"\U0001F300-\U0001F5FF"  # symbols & pictographs
                               u"\U0001F680-\U0001F6FF"  # transport & map symbols
                               u"\U0001F1E0-\U0001F1FF"  # flags (iOS)
                               u"\U00002500-\U00002BEF"  # box drawing through misc symbols and arrows (not Chinese characters)
                               u"\U00002702-\U000027B0"
                               u"\U000024C2-\U0001F251"  # very broad: also spans CJK, kana and Hangul text
                               u"\U0001f926-\U0001f937"
                               u"\U00010000-\U0010ffff"
                               u"\u2640-\u2642"
                               u"\u2600-\u2B55"
                               u"\u200d"
                               u"\u23cf"
                               u"\u23e9"
                               u"\u231a"
                               u"\ufe0f"  # variation selector-16, not a dingbat
                               u"\u3030"
                               "]+", flags=re.UNICODE)
    return emoji_pattern.sub(r'', string)
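
One caveat with character classes this wide: the span \U000024C2-\U0001F251 runs through the CJK, kana and Hangul blocks, so ordinary text in those scripts is stripped along with the emoji. The following is a minimal sketch that makes the effect visible and compares it with a narrower class. The names BROAD and NARROW_EMOJI and the choice of narrower ranges are assumptions for illustration only, not a complete definition of what counts as an emoji.

```python
import re

# The character class from the original script, including the very broad
# \U000024C2-\U0001F251 span.
BROAD = re.compile(
    "["
    "\U0001F600-\U0001F64F"
    "\U0001F300-\U0001F5FF"
    "\U0001F680-\U0001F6FF"
    "\U0001F1E0-\U0001F1FF"
    "\U00002702-\U000027B0"
    "\U000024C2-\U0001F251"
    "]+"
)

# Hypothetical narrower class (an assumption for this sketch): the same
# ranges minus the broad span. It will miss some emoji.
NARROW_EMOJI = re.compile(
    "["
    "\U0001F600-\U0001F64F"  # emoticons
    "\U0001F300-\U0001F5FF"  # symbols & pictographs
    "\U0001F680-\U0001F6FF"  # transport & map symbols
    "\U0001F1E0-\U0001F1FF"  # regional indicators (flags)
    "\U00002702-\U000027B0"  # part of the Dingbats block
    "]+"
)

# "\u4f60\u597d" is the Chinese greeting "ni hao"; "\U0001F600" is a grinning face.
sample = "\u4f60\u597d \U0001F600 world"

broad_result = BROAD.sub("", sample)
narrow_result = NARROW_EMOJI.sub("", sample)

print(repr(broad_result))   # Chinese text and emoji are both removed
print(repr(narrow_result))  # only the emoji is removed
```

If the input may contain East Asian text, trimming the broad span (or listing the needed symbols individually) is probably the safer trade-off, at the cost of missing some symbols.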

@VpkPrasanna VpkPrasanna commented Jun 2, 2020

Thanks a lot for providing the function. It helped me a lot and saved a lot of time.
Thanks


@aderounmu aderounmu commented Jun 19, 2020

Thanks a lot, this helps very much.


@nmschorr nmschorr commented Jun 30, 2020

Here is the updated one:

def remove_emoji(string):
    emoji_pattern = re.compile("["
                               u"\U0001F600-\U0001F64F"  # emoticons
                               u"\U0001F300-\U0001F5FF"  # symbols & pictographs
                               u"\U0001F680-\U0001F6FF"  # transport & map symbols
                               u"\U0001F1E0-\U0001F1FF"  # flags (iOS)
                               u"\U00002500-\U00002BEF"  # box drawing through misc symbols and arrows (not Chinese characters)
                               u"\U00002702-\U000027B0"
                               u"\U000024C2-\U0001F251"  # very broad: also spans CJK, kana and Hangul text
                               u"\U0001f926-\U0001f937"
                               u"\U00010000-\U0010ffff"
                               u"\u2640-\u2642"
                               u"\u2600-\u2B55"
                               u"\u200d"
                               u"\u23cf"
                               u"\u23e9"
                               u"\u231a"
                               u"\ufe0f"  # variation selector-16, not a dingbat
                               u"\u3030"
                               "]+", flags=re.UNICODE)
    return emoji_pattern.sub(r'', string)

@ezequias ezequias commented Jul 11, 2020

Dear @nmschorr

Some emojis your code didn't catch (could you :

#⃣
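
The keycap above is a sequence rather than a single character: an ordinary "#" followed by U+20E3 (COMBINING ENCLOSING KEYCAP), optionally with U+FE0F in between. None of the ranges listed earlier contain U+20E3, and a character class cannot remove "#" without also removing every plain "#". A minimal sketch of one way to handle this, matching the whole sequence instead; the names KEYCAP and remove_keycaps are hypothetical:

```python
import re

# Keycap sequence: a base character (#, * or a digit), an optional
# variation selector-16 (U+FE0F), then U+20E3 COMBINING ENCLOSING KEYCAP.
KEYCAP = re.compile("[#*0-9]\ufe0f?\u20e3")


def remove_keycaps(text):
    """Remove keycap emoji sequences, leaving plain '#', '*' and digits alone."""
    return KEYCAP.sub("", text)


cleaned = remove_keycaps("press #\u20e3 or 1\ufe0f\u20e3, not #1")
print(repr(cleaned))
```

The plain "#1" at the end of the sample survives because it is not followed by U+20E3.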


@wsebastiangroves wsebastiangroves commented Aug 26, 2020

Here is the updated one:

def remove_emoji(string):
    emoji_pattern = re.compile("["
                               u"\U0001F600-\U0001F64F"  # emoticons
                               u"\U0001F300-\U0001F5FF"  # symbols & pictographs
                               u"\U0001F680-\U0001F6FF"  # transport & map symbols
                               u"\U0001F1E0-\U0001F1FF"  # flags (iOS)
                               u"\U00002500-\U00002BEF"  # box drawing through misc symbols and arrows (not Chinese characters)
                               u"\U00002702-\U000027B0"
                               u"\U000024C2-\U0001F251"  # very broad: also spans CJK, kana and Hangul text
                               u"\U0001f926-\U0001f937"
                               u"\U00010000-\U0010ffff"
                               u"\u2640-\u2642"
                               u"\u2600-\u2B55"
                               u"\u200d"
                               u"\u23cf"
                               u"\u23e9"
                               u"\u231a"
                               u"\ufe0f"  # variation selector-16, not a dingbat
                               u"\u3030"
                               "]+", flags=re.UNICODE)
    return emoji_pattern.sub(r'', string)

Thank you!!


@AmbiTyga AmbiTyga commented Sep 9, 2020

Use this link to get the Unicode code point of every emoji
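
As a complement to a reference table, Python's standard unicodedata module can report the code point and name of each character in a sample string, which may help when deciding which ranges a pattern needs. A minimal sketch; the helper name describe is hypothetical:

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [
        ("U+{:04X}".format(ord(ch)), unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# Inspect the keycap sequence: '#' followed by U+20E3.
for code_point, name in describe("#\u20e3"):
    print(code_point, name)
```

Pasting a string that slipped through the filter into describe shows which code points the pattern is still missing.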
