@msalahat
Last active March 22, 2017 23:18
Match a user's search term against a list of Arabic words, treating the alef/hamza variants ( أ، ا ، آ ، إ ) as equivalent (UTF-8)
let match_arabic = (user_input, word) => {
    let user_input_regx = "";
    for (let d = 0; d < user_input.length; d++) {
        // Look for the alef variants: أ, ا, آ and إ
        let hamz_letters = ["أ", "ا", "آ", "إ"].join("|");
        const hamz_regx = new RegExp(hamz_letters);
        if (hamz_regx.test(user_input.charAt(d))) {
            // Non-capturing group: inside [...] the "|" would also match as a literal character
            user_input_regx += "(?:" + hamz_letters + ")";
        } else {
            user_input_regx += user_input.charAt(d);
        }
    }
    user_input_regx = new RegExp(user_input_regx);
    if (user_input_regx.test(word)) {
        return true;
    }
    return false;
}
let user_input_possiblites = [
    "الاردن",
    "الأردن",
    "الآردن"
];
let word = 'الأردن';
user_input_possiblites.forEach((item) => {
    console.log(item, match_arabic(item, word) ? "Matched" : "Not Matched");
});

KhaledElAnsari commented Mar 22, 2017

First of all, this is great! Thanks for sharing, man 😄 I made a few small edits for better performance:

1- Declare the hamz_letters array and hamz_regx outside the for loop. Declaring them inside the loop creates and discards a new array and a new RegExp object on every iteration.

2- RegExp#test returns a boolean, so it is simpler and a little faster to return its result directly instead of branching on it (see MDN). The same goes for building user_input_regx inside the loop, which can use a conditional expression.

let match_arabic = (user_input, word) => {
    let user_input_regx = "";

    // Look for the alef variants: أ, ا, آ and إ
    let hamz_letters = ["أ", "ا", "آ", "إ"].join("|");
    const hamz_regx = new RegExp(hamz_letters);

    for (let d = 0, len = user_input.length; d < len; d++) {
        // Non-capturing group: inside [...] the "|" would also match as a literal character
        user_input_regx += ( hamz_regx.test(user_input.charAt(d)) ? "(?:" + hamz_letters + ")" : user_input.charAt(d) );
    }

    user_input_regx = new RegExp(user_input_regx);

    return user_input_regx.test(word);
}


let user_input_possiblites = [
    "الاردن",
    "الأردن",
    "الآردن"
];

let word = 'الأردن';
user_input_possiblites.forEach((item) => {
    console.log(item, match_arabic(item, word) ? "Matched" : "Not Matched");
})
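
One further hardening that may be worth considering: the snippets above copy every non-alef character of the user input into the pattern as-is, so input such as "(" makes `new RegExp` throw and input such as "." matches any character. The sketch below is only an illustration of one way to handle that, not part of the original gist. The names `ALEF_FORMS`, `ALEF_CLASS`, `escapeRegExp` and `matchArabicSafe` are hypothetical, and it assumes the input uses the precomposed alef code points rather than combining marks.

```javascript
// Hypothetical names throughout; this is a sketch, not the gist author's code.
// The four alef forms: hamza above, bare alef, madda above, hamza below.
const ALEF_FORMS = "أاآإ";
// A character class needs no "|" separators between its members.
const ALEF_CLASS = "[" + ALEF_FORMS + "]";

// Escape characters that carry a special meaning inside a regular expression.
const escapeRegExp = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

const matchArabicSafe = (userInput, word) => {
    let pattern = "";
    for (const ch of userInput) {
        // Any alef form in the input matches any alef form in the word;
        // every other character is matched literally.
        pattern += ALEF_FORMS.includes(ch) ? ALEF_CLASS : escapeRegExp(ch);
    }
    return new RegExp(pattern).test(word);
};

console.log(matchArabicSafe("الاردن", "الأردن")); // true
console.log(matchArabicSafe("a.c", "abc"));       // false: "." is matched literally
console.log(matchArabicSafe("(", "الأردن"));      // false, and no SyntaxError is thrown
```

Like the original, this matches the input anywhere inside the word. Wrapping the pattern in `^` and `$` would be the change to make if only whole-word matches are wanted.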

@msalahat
Author

Great edits indeed :)
Thanks for the input (y)
