@rapee
Created July 1, 2016 09:37
Regex for multiple languages (en/th/jp)
/**
 * Make a URL slug from a string.
 * Supported scripts: English (en), Thai (th), Japanese (ja).
 * @see https://gist.github.com/mathewbyrne/1280286
 * @see http://so-zou.jp/software/tech/programming/tech/regular-expression/meta-character/variable-width-encoding.htm
 * @see https://python3.wannaphong.com/2015/12/regular-expression-ภาษาไทย-และภาษาอื่น-python.html
 * @param {*} str Value to slugify; it is converted with toString() first.
 * @return {String} Slug
 */
const whitespace_like_regex = /[\s_]+/g;
// en: a-z, A-Z, 0-9, -
// th: Ko Kai - Sara Uu, Sara Ee - Thanthakhat, Thai Zero - Thai Nine
// (Excludes Phinthu, Thai Baht, Nikhahit, Yamakkan, Fongman, Angkhankhu, Khomut)
// ja: Kanji (CJK Unified Ideographs block, U+4E00-U+9FFF), Hiragana, Katakana
const nonword_regex = /[^a-z0-9\-ก-\u0E39เ-\u0E4C๐-๙\u4E00-\u9FFFぁ-んァ-ヶ]+/g;
module.exports = function slugify(str) {
  return str.toString().toLowerCase()
    .replace(whitespace_like_regex, '-') // Replace whitespace and underscores with -
    .replace(nonword_regex, '')          // Remove all non-word chars
    .replace(/\-\-+/g, '-')              // Replace multiple - with single -
    .replace(/^-+/, '')                  // Trim - from start of text
    .replace(/-+$/, '');                 // Trim - from end of text
};
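
To see how the replacement steps combine, here is a small usage sketch. It carries a local copy of the slug logic so it runs on its own; the camelCase names and the sample inputs are illustrative assumptions, and the kanji range is written here as the CJK Unified Ideographs block (`\u4E00-\u9FFF`), which is also an assumption of this sketch.

```javascript
// Usage sketch (assumption: a local copy of the slug steps, not a require() of the gist file).
const whitespaceLike = /[\s_]+/g;
// en letters/digits/hyphen, Thai letters and digits, kanji block, hiragana, katakana
const nonWord = /[^a-z0-9\-ก-\u0E39เ-\u0E4C๐-๙\u4E00-\u9FFFぁ-んァ-ヶ]+/g;

function slugify(str) {
  return String(str).toLowerCase()
    .replace(whitespaceLike, '-') // whitespace and underscores become hyphens
    .replace(nonWord, '')         // drop everything outside the allowed ranges
    .replace(/--+/g, '-')         // collapse runs of hyphens
    .replace(/^-+/, '')           // trim leading hyphens
    .replace(/-+$/, '');          // trim trailing hyphens
}

// Hypothetical sample inputs, one per supported script plus a mixed case.
const results = {
  english: slugify('  Hello, World!  '),      // 'hello-world'
  snake: slugify('snake_case__name'),         // 'snake-case-name'
  thai: slugify('สวัสดี ครับ'),                  // 'สวัสดี-ครับ'
  japanese: slugify('こんにちは 東京'),         // 'こんにちは-東京'
  mixed: slugify('Node.js & ภาษาไทย 101'),    // 'nodejs-ภาษาไทย-101'
};

console.log(results);
```

Punctuation such as `.` and `&` is removed only after whitespace has been turned into hyphens, which is why the mixed example needs the hyphen-collapsing step to avoid a double hyphen.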