@anxp
Last active March 26, 2019 16:04
Function returns only the allowed characters from an input string. Allowed characters are passed as an array of ranges. Each range is a string with start and end Unicode code points, written the way PCRE functions such as preg_replace expect them (for example '\x{0020}-\x{007E}').
<?php
/**
 * Filters disallowed characters out of a string and returns the cleaned string.
 * Allowed characters are specified as an array of Unicode code point ranges.
 * To change the allowed ranges, see this list of Unicode blocks: https://unicode-table.com/en/blocks/
 * @param string $text input string to filter
 * @param bool $stripTags whether to remove HTML tags
 * @param array $allowed_utf8_ranges ranges of allowed characters, e.g. '\x{0020}-\x{007E}'
 * @return string|null cleaned string, or null if preg_replace() fails
 * @throws Exception if $text is an array or $allowed_utf8_ranges is not an array
 */
function sanitizeText($text, $stripTags = TRUE, $allowed_utf8_ranges = ['\x{0000}-\x{007E}', '\x{00A1}-\x{017F}',]) {
    if (!is_array($allowed_utf8_ranges)) {
        throw new Exception('The third argument of sanitizeText must be an array.');
    }
    if (is_array($text)) {
        throw new Exception('The first argument of sanitizeText must be a string, not an array.');
    }
    if ($stripTags) {
        // Remove all HTML tags from the string.
        // FILTER_SANITIZE_STRING is deprecated as of PHP 8.1, so strip_tags() is used here instead.
        $text = strip_tags($text);
    }
    // Build a negated character class from the allowed ranges.
    // Example of the resulting pattern: '/[^\x{0020}-\x{007E}\x{00A1}-\x{017F}]+/u'
    // The ranges are concatenated directly: inside [...] the characters '(', ')' and '|'
    // are literals, not grouping or alternation, so wrapping ranges in them would
    // silently add those three characters to the allowed set.
    $dynamicPattern = '/[^' . implode('', $allowed_utf8_ranges) . ']+/u';
    // preg_replace() returns null on failure (for example, invalid UTF-8 input with the /u modifier).
    $text = preg_replace($dynamicPattern, '', $text);
    return $text;
}
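
To illustrate how range strings behave inside a negated character class, here is a minimal, self-contained sketch. The helper name buildDisallowedPattern and the sample inputs are assumptions made for this example and are not part of the gist. It only shows the pattern-building idea, not the whole function.

```php
<?php
// Hypothetical helper (not part of the gist): turns a list of range strings
// into a pattern that matches every run of characters OUTSIDE those ranges.
function buildDisallowedPattern(array $ranges): string
{
    // Inside [...] ranges are simply written one after another.
    return '/[^' . implode('', $ranges) . ']+/u';
}

// Printable ASCII plus Latin-1 Supplement and Latin Extended-A letters.
$pattern = buildDisallowedPattern(['\x{0020}-\x{007E}', '\x{00A1}-\x{017F}']);

// Polish letters fall inside the second range; Cyrillic letters do not.
$clean = preg_replace($pattern, '', 'Łódź: привет!');

var_dump($clean); // string "Łódź: !"
```

Because every range is plain text spliced into the pattern, the ranges should come from trusted code, not from user input.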