Skip to content

Instantly share code, notes, and snippets.

@mklement0
Last active August 16, 2024 05:32
Show Gist options
  • Save mklement0/8689b9b5123a9ba11df7214f82a673be to your computer and use it in GitHub Desktop.
Save mklement0/8689b9b5123a9ba11df7214f82a673be to your computer and use it in GitHub Desktop.
PowerShell function that emulates Out-File for creating UTF-8-encoded files *without a BOM* (byte-order mark).
<#
Prerequisites: PowerShell version 3 or above.
License: MIT
Author: Michael Klement <mklement0@gmail.com>
DOWNLOAD and DEFINITION OF THE FUNCTION:
irm https://gist.github.com/mklement0/8689b9b5123a9ba11df7214f82a673be/raw/Out-FileUtf8NoBom.ps1 | iex
The above directly defines the function below in your session and offers guidance for making it available in future
sessions too. To silence the guidance information, append 4>$null
CAVEAT: If you run this command *from a script*, you'll get a spurious warning about dot-sourcing, which you can ignore
and suppress by appending 3>$null. However, it's best to avoid calling this command from scripts, because later versions
of this Gist aren't guaranteed to be backward-compatible; howevever, you can modify the command to lock in a
*specific revision* of this Gist, which is guaranteed not to change: see the instructions at
https://gist.github.com/mklement0/880624fd665073bb439dfff5d71da886?permalink_comment_id=4296669#gistcomment-4296669
DOWNLOAD ONLY:
irm https://gist.github.com/mklement0/8689b9b5123a9ba11df7214f82a673be/raw > Out-FileUtf8NoBom.ps1
Downloads to the specified file, which you then need to dot-source to make the function available in the current session:
. ./Out-FileUtf8NoBom.ps1
To learn what the function does:
* see the next comment block
* or, once downloaded and defined, invoke the function with -? or pass its name to Get-Help.
#>
function Out-FileUtf8NoBom {
<#
.SYNOPSIS
Outputs to a UTF-8-encoded file *without a BOM* (byte-order mark).
.DESCRIPTION
Mimics the most important aspects of Out-File:
* Input objects are sent to Out-String first.
* -Append allows you to append to an existing file, -NoClobber prevents
overwriting of an existing file.
* -Width allows you to specify the line width for the text representations
of input objects that aren't strings.
However, it is not a complete implementation of all Out-File parameters:
* Only a literal output path is supported, and only as a parameter.
* -Force is not supported.
* Conversely, an extra -UseLF switch is supported for using LF-only newlines.
.NOTES
The raison d'être for this advanced function is that Windows PowerShell
lacks the ability to write UTF-8 files without a BOM: using -Encoding UTF8
invariably prepends a BOM.
Copyright (c) 2017, 2022 Michael Klement <mklement0@gmail.com> (http://same2u.net),
released under the [MIT license](https://spdx.org/licenses/MIT#licenseText).
#>
[CmdletBinding(PositionalBinding=$false)]
param(
[Parameter(Mandatory, Position = 0)] [string] $LiteralPath,
[switch] $Append,
[switch] $NoClobber,
[AllowNull()] [int] $Width,
[switch] $UseLF,
[Parameter(ValueFromPipeline)] $InputObject
)
begin {
# Convert the input path to a full one, since .NET's working dir. usually
# differs from PowerShell's.
$dir = Split-Path -LiteralPath $LiteralPath
if ($dir) { $dir = Convert-Path -ErrorAction Stop -LiteralPath $dir } else { $dir = $pwd.ProviderPath }
$LiteralPath = [IO.Path]::Combine($dir, [IO.Path]::GetFileName($LiteralPath))
# If -NoClobber was specified, throw an exception if the target file already
# exists.
if ($NoClobber -and (Test-Path $LiteralPath)) {
Throw [IO.IOException] "The file '$LiteralPath' already exists."
}
# Create a StreamWriter object.
# Note that we take advantage of the fact that the StreamWriter class by default:
# - uses UTF-8 encoding
# - without a BOM.
$sw = New-Object System.IO.StreamWriter $LiteralPath, $Append
$htOutStringArgs = @{}
if ($Width) { $htOutStringArgs += @{ Width = $Width } }
try {
# Create the script block with the command to use in the steppable pipeline.
$scriptCmd = {
& Microsoft.PowerShell.Utility\Out-String -Stream @htOutStringArgs |
. { process { if ($UseLF) { $sw.Write(($_ + "`n")) } else { $sw.WriteLine($_) } } }
}
$steppablePipeline = $scriptCmd.GetSteppablePipeline($myInvocation.CommandOrigin)
$steppablePipeline.Begin($PSCmdlet)
}
catch { throw }
}
process
{
$steppablePipeline.Process($_)
}
end {
$steppablePipeline.End()
$sw.Dispose()
}
}
# --------------------------------
# GENERIC INSTALLATION HELPER CODE
# --------------------------------
# Provides guidance for making the function persistently available when
# this script is either directly invoked from the originating Gist or
# dot-sourced after download.
# IMPORTANT:
# * DO NOT USE `exit` in the code below, because it would exit
# the calling shell when Invoke-Expression is used to directly
# execute this script's content from GitHub.
# * Because the typical invocation is DOT-SOURCED (via Invoke-Expression),
# do not define variables or alter the session state via Set-StrictMode, ...
# *except in child scopes*, via & { ... }
if ($MyInvocation.Line -eq '') {
# Most likely, this code is being executed via Invoke-Expression directly
# from gist.github.com
# To simulate for testing with a local script, use the following:
# Note: Be sure to use a path and to use "/" as the separator.
# iex (Get-Content -Raw ./script.ps1)
# Derive the function name from the invocation command, via the enclosing
# script name presumed to be contained in the URL.
# NOTE: Unfortunately, when invoked via Invoke-Expression, $MyInvocation.MyCommand.ScriptBlock
# with the actual script content is NOT available, so we cannot extract
# the function name this way.
& {
param($invocationCmdLine)
# Try to extract the function name from the URL.
$funcName = $invocationCmdLine -replace '^.+/(.+?)(?:\.ps1).*$', '$1'
if ($funcName -eq $invocationCmdLine) {
# Function name could not be extracted, just provide a generic message.
# Note: Hypothetically, we could try to extract the Gist ID from the URL
# and use the REST API to determine the first filename.
Write-Verbose -Verbose "Function is now defined in this session."
}
else {
# Indicate that the function is now defined and also show how to
# add it to the $PROFILE or convert it to a script file.
Write-Verbose -Verbose @"
Function `"$funcName`" is now defined in this session.
* If you want to add this function to your `$PROFILE, run the following:
"``nfunction $funcName {``n`${function:$funcName}``n}" | Add-Content `$PROFILE
* If you want to convert this function into a script file that you can invoke
directly, run:
"`${function:$funcName}" | Set-Content $funcName.ps1 -Encoding $('utf8' + ('', 'bom')[[bool] (Get-Variable -ErrorAction Ignore IsCoreCLR -ValueOnly)])
"@
}
} $MyInvocation.MyCommand.Definition # Pass the original invocation command line to the script block.
}
else {
# Invocation presumably as a local file after manual download,
# either dot-sourced (as it should be) or mistakenly directly.
& {
param($originalInvocation)
# Parse this file to reliably extract the name of the embedded function,
# irrespective of the name of the script file.
$ast = $originalInvocation.MyCommand.ScriptBlock.Ast
$funcName = $ast.Find( { $args[0] -is [System.Management.Automation.Language.FunctionDefinitionAst] }, $false).Name
if ($originalInvocation.InvocationName -eq '.') {
# Being dot-sourced as a file.
# Provide a hint that the function is now loaded and provide
# guidance for how to add it to the $PROFILE.
Write-Verbose -Verbose @"
Function `"$funcName`" is now defined in this session.
If you want to add this function to your `$PROFILE, run the following:
"``nfunction $funcName {``n`${function:$funcName}``n}" | Add-Content `$PROFILE
"@
}
else {
# Mistakenly directly invoked.
# Issue a warning that the function definition didn't take effect and
# provide guidance for reinvocation and adding to the $PROFILE.
Write-Warning @"
This script contains a definition for function "$funcName", but this definition
only takes effect if you dot-source this script.
To define this function for the current session, run:
. "$($originalInvocation.MyCommand.Path)"
"@
}
} $MyInvocation # Pass the original invocation info to the helper script block.
}
@mklement0
Copy link
Author

Note: Thanks to a tip by @IarwainBen-adar, revision 6 now correctly handles UNC paths and paths based on PowerShell-only drives.

@hunandy14
Copy link

hunandy14 commented Apr 24, 2021

Hi, appreciate your article to help me figure out the problem of UTF-8 format.


My $PROFILE file was originally in UTF-8 format, which contains Chinese text.

When I follow the instructions below, the $PROFILE file will turn to ISO 8859-2 format, it will cause damage for the Chinese text and become a garbled message.

* If you want to add this function to your $PROFILE, run the following:
   "`nfunction Out-FileUtf8NoBom {`n${function:Out-FileUtf8NoBom}`n}" >> $PROFILE

Maybe it would be better to amend like the following below? (I think this problem only happen to people who doesn’t use English.)

"`nfunction Out-FileUtf8NoBom {`n${function:Out-FileUtf8NoBom}`n}" | Out-FileUtf8NoBom -Append $PROFILE

@mklement0
Copy link
Author

Thanks for pointing out the problem, @hunandy14 - and sorry that your $PROFILE got corrupted.

Yes, in Windows PowerShell >> (Out-File -Append) blindly appends UTF-16LE encoding, which can lead to file corruption here.
(In PowerShell (Core) 7+, everything now defaults to BOM-less UTF-8, so this function isn't necessary to begin with.)

The simpler solution is to use Add-Content, which actually tries to match the encoding of the preexisting content (and in Windows PowerShell that content may actually be ANSI-encoded).

I've updated the code accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment