@khanzadimahdi
Created January 26, 2020 06:16
Regex to match UTF-8 hashtags (similar to Instagram and Facebook)
/**
* @see https://regexr.com/4suqt
* @see https://regex101.com/r/4SAxik/1
* @see https://www.regexpal.com/?fam=113956
**/
// The pattern needs delimiters, and the "u" modifier so \p{L} and \p{N} match UTF-8 text.
$regex = "/(?:#)([\p{L}\p{N}_](?:(?:[\p{L}\p{N}_]|(?:\.(?!\.))){0,28}(?:[\p{L}\p{N}_]))?)/u";
$text = "here is a #hashtag.tail and #another_hashtag #هشتگ #__hehe #123 this is a test.";
preg_match_all($regex, $text, $matches);
var_dump($matches);
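To see what the pattern accepts and rejects, here is a minimal sketch that runs it against a few edge cases. It assumes the pattern is wrapped in `/` delimiters with the `u` modifier. `firstTag` is a hypothetical helper used only for this illustration, not part of the gist.

```php
<?php
// Same pattern as above, wrapped in "/" delimiters with the "u" (UTF-8) modifier.
$regex = '/(?:#)([\p{L}\p{N}_](?:(?:[\p{L}\p{N}_]|(?:\.(?!\.))){0,28}(?:[\p{L}\p{N}_]))?)/u';

// Hypothetical helper: returns the first captured tag, or null when nothing matches.
function firstTag(string $regex, string $text): ?string
{
    return preg_match($regex, $text, $m) === 1 ? $m[1] : null;
}

var_dump(firstTag($regex, 'end of sentence #tag.'));  // "tag": a trailing dot is not captured
var_dump(firstTag($regex, '#a..b'));                   // "a": two consecutive dots end the tag
var_dump(firstTag($regex, '#' . str_repeat('x', 40))); // 30 x's: 1 + up to 28 + 1 characters
var_dump(firstTag($regex, '# alone'));                 // NULL: "#" must be followed by a letter, digit or "_"
```

A tag must start and end with a letter, digit or underscore, may contain single dots in between, and is at most 30 characters long.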
@khanzadimahdi (Author)

You can split the regex into parts like this:

$alphabets = '\p{L}\p{N}_';
$sign = '#';
$regex = "/(?:$sign)([$alphabets](?:(?:[$alphabets]|(?:\.(?!\.))){0,28}(?:[$alphabets]))?)/u";

The same pattern can be used for mentions too: change $sign to @.
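As a rough sketch of how the split version might be reused for both hashtags and mentions, the function below builds the pattern from a sign passed as an argument. The name `extractTags`, its signature and the sample text are assumptions made for illustration. The `preg_quote()` call is an added precaution in case the sign is a regex metacharacter or the delimiter.

```php
<?php
// Hypothetical helper (not from the gist): builds the pattern for a given sign
// and returns every captured name without the leading sign.
function extractTags(string $text, string $sign = '#'): array
{
    $alphabets = '\p{L}\p{N}_';
    // Escape the sign so it is always treated as a literal character.
    $quotedSign = preg_quote($sign, '/');
    $regex = "/(?:$quotedSign)([$alphabets](?:(?:[$alphabets]|(?:\.(?!\.))){0,28}(?:[$alphabets]))?)/u";

    preg_match_all($regex, $text, $matches);

    return $matches[1];
}

$text = 'thanks @john.doe and @مهدی for the #hashtag';
var_dump(extractTags($text, '#')); // ["hashtag"]
var_dump(extractTags($text, '@')); // ["john.doe", "مهدی"]
```

Note that the pattern does not check what precedes the sign, so with `@` it would also capture `example.com` from an email address such as `user@example.com`.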
