Remove accents and symbols that are not compatible with the basic Latin alphabet.
This works by converting the text to decomposed Unicode form (NFD), so that
accents become separate combining characters. A regex then selects the
characters we want to keep, and the matched runs are joined back into a string.
Some characters are not handled by this approach, such as 'ø', because Unicode
treats it as a distinct letter rather than an 'o' with a slash accent, so it
has no decomposition.
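As a small illustration of the decomposition step described above, the sketch below compares a precomposed 'é' with its NFD form and shows that 'ø' is left untouched. The variable names are only for illustration and are not part of the original function.

```javascript
// 'é' as a single precomposed code point (U+00E9).
const composed = '\u00E9';
// NFD splits it into 'e' (U+0065) plus a combining acute accent (U+0301).
const decomposed = composed.normalize('NFD');

console.log(composed.length);            // 1
console.log(decomposed.length);          // 2
console.log(decomposed === 'e\u0301');   // true

// 'ø' (U+00F8) has no canonical decomposition, so NFD returns it unchanged.
const slashed = '\u00F8'.normalize('NFD');
console.log(slashed === '\u00F8');       // true
```

Because the accent becomes its own character after normalization, a filter that keeps only ASCII drops the accent and leaves the base letter behind.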
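function asciiFriendlyText (text) {
  // Decompose accented letters (NFD), then keep only U+0009 to U+0014 and
  // printable ASCII (U+0020 to U+007E).
  const matches = text.normalize("NFD").match(/[\u0009-\u0014\u0020-\u007E]+/g);
  // match() returns null when nothing matches, so fall back to an empty list.
  return (matches || []).join('');
}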
// French
console.log(asciiFriendlyText('Je suis un élève'));
// Vietnamese (note: 'Đ' has no decomposition and is dropped)
console.log(asciiFriendlyText('Đà Nẵng, Quảng Nam, Quảng Ngãi, Bình Định, Phú Yên, Nha Trang'));
// Unsupported, since they would require different logic:
console.log(asciiFriendlyText('Æ, Ø, ß'));
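One possible form of the "different logic" for letters without a decomposition is to substitute them from an explicit lookup table before normalizing. This is only a sketch under assumptions: the names `FALLBACKS` and `asciiFriendlyTextWithFallbacks` are hypothetical, and the replacements chosen here (for example 'ß' to 'ss') are illustrative and may need adjusting per language.

```javascript
// Hypothetical replacement table for letters that NFD cannot decompose.
const FALLBACKS = new Map([
  ['Æ', 'AE'], ['æ', 'ae'],
  ['Ø', 'O'], ['ø', 'o'],
  ['Đ', 'D'], ['đ', 'd'],
  ['ß', 'ss']
]);

function asciiFriendlyTextWithFallbacks (text) {
  // Substitute mapped letters first; all other characters pass through.
  const replaced = Array.from(text, ch => FALLBACKS.get(ch) || ch).join('');
  // Then apply the same decompose-and-filter step as the original function.
  const matches = replaced.normalize('NFD').match(/[\u0009-\u0014\u0020-\u007E]+/g);
  return (matches || []).join('');
}

console.log(asciiFriendlyTextWithFallbacks('Æ, Ø, ß'));   // AE, O, ss
console.log(asciiFriendlyTextWithFallbacks('Đà Nẵng'));   // Da Nang
```

Keeping the table separate from the filter means new letters can be added without touching the regex.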
