Find Instagram hashtags with a regular expression in Python
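import re

txt = """
hola @gorka!!!!!! #ñam???? #normal #subtêrráneo @nandoquintana #puravida!!! #espagne🇪🇸
#凤凰卫视
"""
# Characters that end a hashtag. The backslash is doubled because "\]" is an
# invalid escape in Python; "!" is included so "#puravida!!!" yields "#puravida".
not_in_hashtags = "!\"$%&'()*+,-./:;<=>?[\\]^`{|}~\n#@ "
# re.escape makes every character safe inside the negated character class.
hashtags = re.findall(f'#[^{re.escape(not_in_hashtags)}]+', txt)
print(hashtags)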

zkwp5 commented Jul 15, 2022

Verify Github on Galaxy. gid:mmqGht93YKVt5ytDukzQDG

zkwp5 commented Jul 15, 2022

gid:mmqGht93YKVt5ytDukzQDG

fl0aten commented Jul 15, 2022

Thanks! Nice Regex!

In my tests, some hashtags still had a trailing space, which turned out to be a non-breaking space (U+00A0).

Adding "\u00A0" to the excluded characters should fix it.

not_in_hashtags = "\"$%&'()*+,-./:;<=>?[\]^`{|}~\n#@ \u00A0"

(I am working in a different programming language, so I can't promise that it works the same way in Python.)
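
For anyone who wants to check the non-breaking space suggestion in Python, here is a minimal sketch. It assumes the same exclusion string as in the comment above, with the backslash doubled to avoid Python's invalid-escape warning. The helper name `find_hashtags` and the sample text are made up for this illustration and are not part of the gist.

```python
import re

# Assumption: the exclusion set from the comment above, including the
# non-breaking space (U+00A0). The backslash is written as "\\".
NOT_IN_HASHTAGS = "\"$%&'()*+,-./:;<=>?[\\]^`{|}~\n#@ \u00A0"


# Hypothetical helper, introduced only for this example.
def find_hashtags(text):
    """Return every '#' followed by one or more characters not in NOT_IN_HASHTAGS."""
    pattern = f"#[^{re.escape(NOT_IN_HASHTAGS)}]+"
    return re.findall(pattern, text)


# Sample text with non-breaking spaces directly after two of the hashtags.
sample = "hola #ñam\u00A0#normal\u00A0fin #puravida"
print(find_hashtags(sample))  # ['#ñam', '#normal', '#puravida']
```

In a regular (non-raw) Python string literal, "\u00A0" is the non-breaking space character itself, so it can sit in the exclusion string like any other character. Without it, the first match in the sample would keep the trailing non-breaking space.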
