@indented-automation
Last active January 18, 2019 12:48
function ConvertTo-NormalizedString {
    <#
    .SYNOPSIS
        Attempts to replace diacritics within the input string.
    .DESCRIPTION
        Uses String.Normalize to decompose the string, then drops the combining marks in an attempt to replace diacritic characters.
    #>

    [CmdletBinding()]
    param (
        # The string to convert.
        [Parameter(ValueFromPipeline)]
        [AllowEmptyString()]
        [String]$String,

        # Allows removal of characters categorised as OtherSymbol.
        [Switch]$RemoveOtherSymbol,

        # Remove characters which are not in the ASCII character range after normalization.
        [Switch]$RemoveNonAscii
    )

    process {
        # Decompose (FormD) so that each diacritic becomes a separate combining mark, then filter.
        $normalizedString = $String.Normalize('FormD').ToCharArray() | Where-Object {
            $category = [System.Globalization.CharUnicodeInfo]::GetUnicodeCategory($_)

            $category -ne 'NonSpacingMark' -and
            (-not $RemoveOtherSymbol -or $category -ne 'OtherSymbol') -and
            (-not $RemoveNonAscii -or [Int]$_ -le 0x7f)
        }

        # Recompose what remains (FormC). -join also copes with an empty or single-character result.
        (-join $normalizedString).Normalize('FormC')
    }
}
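
The function relies on canonical decomposition, so it may help to see what `Normalize('FormD')` produces before the filter runs. The following is a minimal, standalone sketch rather than part of the gist: the sample string and the variable names (`$sample`, `$decomposed`, `$stripped`) are illustrative assumptions. It also shows why `-RemoveNonAscii` exists: a character such as U+00F8 has no canonical decomposition, so removing `NonSpacingMark` characters alone leaves it untouched.

```powershell
# Hypothetical sample, built from code points so the example does not
# depend on how this file is encoded: "caf" + U+00E9 (e acute) + U+00F8 (o with stroke).
$sample = 'caf' + [char]0x00E9 + [char]0x00F8

# FormD splits U+00E9 into 'e' followed by U+0301 (combining acute accent).
# U+00F8 has no canonical decomposition and is left as a single character.
$decomposed = $sample.Normalize([System.Text.NormalizationForm]::FormD)

# List each character with its Unicode category.
foreach ($char in $decomposed.ToCharArray()) {
    [PSCustomObject]@{
        CodePoint = 'U+{0:X4}' -f [int]$char
        Category  = [System.Globalization.CharUnicodeInfo]::GetUnicodeCategory($char)
    }
}

# Dropping NonSpacingMark characters removes U+0301 but keeps U+00F8.
$stripped = -join ($decomposed.ToCharArray() | Where-Object {
    [System.Globalization.CharUnicodeInfo]::GetUnicodeCategory($_) -ne 'NonSpacingMark'
})
$stripped
```

Here `$sample` has five characters, `$decomposed` has six, and `$stripped` has five again, with a plain `e` in place of the accented one. Stripping U+00F8 as well would need the additional ASCII-range check that `-RemoveNonAscii` applies.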