Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
PowerShell function that emulates Out-File for creating UTF-8-encoded files *without a BOM* (byte-order mark).
<#
Prerequisites: PowerShell version 3 or above.
License: MIT
Author: Michael Klement <mklement0@gmail.com>
DOWNLOAD and DEFINITION OF THE FUNCTION:
irm https://gist.github.com/mklement0/8689b9b5123a9ba11df7214f82a673be/raw/Out-FileUtf8NoBom.ps1 | iex
The above directly defines the function below in your session and offers guidance for making it available in future
sessions too.
DOWNLOAD ONLY:
irm https://gist.github.com/mklement0/8689b9b5123a9ba11df7214f82a673be/raw > Out-FileUtf8NoBom.ps1
Downloads to the specified file, which you then need to dot-source to make the function available in the current session:
. ./Out-FileUtf8NoBom.ps1
To learn what the function does:
* see the next comment block
* or, once downloaded and defined, invoke the function with -? or pass its name to Get-Help.
#>
function Out-FileUtf8NoBom {
<#
.SYNOPSIS
Outputs to a UTF-8-encoded file *without a BOM* (byte-order mark).
.DESCRIPTION
Mimics the most important aspects of Out-File:
* Input objects are sent to Out-String first.
* -Append allows you to append to an existing file, -NoClobber prevents
overwriting of an existing file.
* -Width allows you to specify the line width for the text representations
of input objects that aren't strings.
However, it is not a complete implementation of all Out-File parameters:
* Only a literal output path is supported, and only as a parameter.
* -Force is not supported.
* Conversely, an extra -UseLF switch is supported for using LF-only newlines.
Caveat: *All* pipeline input is buffered before writing output starts,
but the string representations are generated and written to the target
file one by one.
.NOTES
The raison d'être for this advanced function is that Windows PowerShell
lacks the ability to write UTF-8 files without a BOM: using -Encoding UTF8
invariably prepends a BOM.
Copyright (c) 2017, 2020 Michael Klement <mklement0@gmail.com> (http://same2u.net),
released under the [MIT license](https://spdx.org/licenses/MIT#licenseText).
#>
[CmdletBinding()]
param(
[Parameter(Mandatory, Position=0)] [string] $LiteralPath,
[switch] $Append,
[switch] $NoClobber,
[AllowNull()] [int] $Width,
[switch] $UseLF,
[Parameter(ValueFromPipeline)] $InputObject
)
#requires -version 3
# Convert the input path to a full one, since .NET's working dir. usually
# differs from PowerShell's.
$dir = Split-Path -LiteralPath $LiteralPath
if ($dir) { $dir = Convert-Path -ErrorAction Stop -LiteralPath $dir } else { $dir = $pwd.ProviderPath}
$LiteralPath = [IO.Path]::Combine($dir, [IO.Path]::GetFileName($LiteralPath))
# If -NoClobber was specified, throw an exception if the target file already
# exists.
if ($NoClobber -and (Test-Path $LiteralPath)) {
Throw [IO.IOException] "The file '$LiteralPath' already exists."
}
# Create a StreamWriter object.
# Note that we take advantage of the fact that the StreamWriter class by default:
# - uses UTF-8 encoding
# - without a BOM.
$sw = New-Object System.IO.StreamWriter $LiteralPath, $Append
$htOutStringArgs = @{}
if ($Width) {
$htOutStringArgs += @{ Width = $Width }
}
# Note: By not using begin / process / end blocks, we're effectively running
# in the end block, which means that all pipeline input has already
# been collected in automatic variable $Input.
# We must use this approach, because using | Out-String individually
# in each iteration of a process block would format each input object
# with an indvidual header.
try {
$Input | Out-String -Stream @htOutStringArgs | % {
if ($UseLf) {
$sw.Write($_ + "`n")
}
else {
$sw.WriteLine($_)
}
}
} finally {
$sw.Dispose()
}
}
# --------------------------------
# GENERIC INSTALLATION HELPER CODE
# --------------------------------
# Provides guidance for making the function persistently available when
# this script is either directly invoked from the originating Gist or
# dot-sourced after download.
# IMPORTANT:
# * DO NOT USE `exit` in the code below, because it would exit
# the calling shell when Invoke-Expression is used to directly
# execute this script's content from GitHub.
# * Because the typical invocation is DOT-SOURCED (via Invoke-Expression),
# do not define variables or alter the session state via Set-StrictMode, ...
# *except in child scopes*, via & { ... }
if ($MyInvocation.Line -eq '') {
# Most likely, this code is being executed via Invoke-Expression directly
# from gist.github.com
# To simulate for testing with a local script, use the following:
# Note: Be sure to use a path and to use "/" as the separator.
# iex (Get-Content -Raw ./script.ps1)
# Derive the function name from the invocation command, via the enclosing
# script name presumed to be contained in the URL.
# NOTE: Unfortunately, when invoked via Invoke-Expression, $MyInvocation.MyCommand.ScriptBlock
# with the actual script content is NOT available, so we cannot extract
# the function name this way.
& {
param($invocationCmdLine)
# Try to extract the function name from the URL.
$funcName = $invocationCmdLine -replace '^.+/(.+?)(?:\.ps1).*$', '$1'
if ($funcName -eq $invocationCmdLine) {
# Function name could not be extracted, just provide a generic message.
# Note: Hypothetically, we could try to extract the Gist ID from the URL
# and use the REST API to determine the first filename.
Write-Verbose -Verbose "Function is now defined in this session."
}
else {
# Indicate that the function is now defined and also show how to
# add it to the $PROFILE or convert it to a script file.
Write-Verbose -Verbose @"
Function `"$funcName`" is now defined in this session.
* If you want to add this function to your `$PROFILE, run the following:
"``nfunction $funcName {``n`${function:$funcName}``n}" | Add-Content `$PROFILE
* If you want to convert this function into a script file that you can invoke
directly, run:
"`${function:$funcName}" | Set-Content $funcName.ps1 -Encoding $('utf8' + ('', 'bom')[[bool] (Get-Variable -ErrorAction Ignore IsCoreCLR -ValueOnly)])
"@
}
} $MyInvocation.MyCommand.Definition # Pass the original invocation command line to the script block.
}
else {
# Invocation presumably as a local file after manual download,
# either dot-sourced (as it should be) or mistakenly directly.
& {
param($originalInvocation)
# Parse this file to reliably extract the name of the embedded function,
# irrespective of the name of the script file.
$ast = $originalInvocation.MyCommand.ScriptBlock.Ast
$funcName = $ast.Find( { $args[0] -is [System.Management.Automation.Language.FunctionDefinitionAst] }, $false).Name
if ($originalInvocation.InvocationName -eq '.') {
# Being dot-sourced as a file.
# Provide a hint that the function is now loaded and provide
# guidance for how to add it to the $PROFILE.
Write-Verbose -Verbose @"
Function `"$funcName`" is now defined in this session.
If you want to add this function to your `$PROFILE, run the following:
"``nfunction $funcName {``n`${function:$funcName}``n}" | Add-Content `$PROFILE
"@
}
else {
# Mistakenly directly invoked.
# Issue a warning that the function definition didn't effect and
# provide guidance for reinvocation and adding to the $PROFILE.
Write-Warning @"
This script contains a definition for function "$funcName", but this definition
only takes effect if you dot-source this script.
To define this function for the current session, run:
. "$($originalInvocation.MyCommand.Path)"
"@
}
} $MyInvocation # Pass the original invocation info to the helper script block.
}
@mklement0
Copy link
Author

mklement0 commented Apr 24, 2021

Thanks for pointing out the problem, @hunandy14 - and sorry that your $PROFILE got corrupted.

Yes, in Windows PowerShell >> (Out-File -Append) blindly appends UTF-16LE encoding, which can lead to file corruption here.
(In PowerShell (Core) 7+, everything now defaults to BOM-less UTF-8, so this function isn't necessary to begin with.)

The simpler solution is to use Add-Content, which actually tries to match the encoding of the preexisting content (and in Windows PowerShell that content may actually be ANSI-encoded).

I've updated the code accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment