@JohnRoos
Last active September 3, 2019 09:17
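function Get-FileEncoding {
    [CmdletBinding()]
    param (
        [Parameter(Mandatory)]
        [ValidateScript({Test-Path -Path $_ -PathType Leaf})]
        [string]$Path
    )
    $fullpath = (Resolve-Path -Path $Path).Path
    # Open read-only so that read-only files can still be inspected.
    $file = [System.IO.FileStream]::new($fullpath, [System.IO.FileMode]::Open, [System.IO.FileAccess]::Read)
    try {
        $buffer = [byte[]]::new(100)
        $read = $file.Read($buffer, 0, $buffer.Length)
    }
    finally {
        $file.Dispose()
    }
    # Keep only the bytes actually read, so short files are not padded with zeros
    # (a 2-byte FF-FE file would otherwise look like the UTF-32 signature FF-FE-00-00).
    $bytes = [byte[]]::new($read)
    [System.Array]::Copy($buffer, $bytes, $read)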
    <#
    Signatures and descriptions from here:
    https://en.wikipedia.org/wiki/List_of_file_signatures
    https://www.garykessler.net/library/file_sigs.html
    #>
    $encoding = @{
        'FF-FE' = 'Byte order mark for 16-bit Unicode Transformation Format/2-octet Universal Character Set (UTF-16/UCS-2), little-endian files.'
        'FE-FF' = 'Byte order mark for 16-bit Unicode Transformation Format/2-octet Universal Character Set (UTF-16/UCS-2), big-endian files.'
        'FF-FE-00-00' = 'Byte order mark for 32-bit Unicode Transformation Format/4-octet Universal Character Set (UTF-32/UCS-4), little-endian files.'
        '00-00-FE-FF' = 'Byte order mark for 32-bit Unicode Transformation Format/4-octet Universal Character Set (UTF-32/UCS-4), big-endian files.'
        'EF-BB-BF' = 'Byte order mark for UTF-8 encoded files. (PowerShell friendly)'
        '7B-5C-72-74-66-31' = 'Rich Text Format'
    }
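    $hex = [System.BitConverter]::ToString($bytes)
    Microsoft.PowerShell.Utility\Write-Verbose "First $($bytes.Length) bytes in hex: $hex"
    # Test longer signatures first: FF-FE (UTF-16 LE) is a prefix of FF-FE-00-00 (UTF-32 LE).
    $signatures = $encoding.Keys | Microsoft.PowerShell.Utility\Sort-Object -Property Length -Descending
    foreach ($signature in $signatures) {
        if ($hex.StartsWith($signature, [System.StringComparison]::Ordinal)) {
            Microsoft.PowerShell.Utility\Write-Output "Found match: $signature, $($encoding.$signature)"
            break
        }
    }
}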
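
# A minimal sketch of the signature check used above, assuming a temporary file
# is acceptable for demonstration. Show-BomSignatureExample is a hypothetical
# helper name and is not part of the original gist.
function Show-BomSignatureExample {
    $tmp = [System.IO.Path]::GetTempFileName()
    try {
        # Write a UTF-8 byte order mark (EF BB BF) followed by the ASCII text 'hi'.
        [System.IO.File]::WriteAllBytes($tmp, [byte[]](0xEF, 0xBB, 0xBF, 0x68, 0x69))
        $hex = [System.BitConverter]::ToString([System.IO.File]::ReadAllBytes($tmp))
        # BitConverter renders bytes as dash-separated hex, so a file signature
        # is simply a string prefix of the result.
        [pscustomobject]@{
            Hex       = $hex
            IsUtf8Bom = $hex.StartsWith('EF-BB-BF', [System.StringComparison]::Ordinal)
        }
    }
    finally {
        Microsoft.PowerShell.Management\Remove-Item -LiteralPath $tmp
    }
}
# Usage of the sketch: Show-BomSignatureExample
# Usage of the function above (example.txt is a placeholder path):
#   Get-FileEncoding -Path .\example.txt -Verbose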