Sanitize a string for use as a filename
/**
 * Extracted from node-sanitize (
 *
 * Replaces characters in strings that are illegal/unsafe for filenames.
 * Unsafe characters are either removed or replaced by a substitute set
 * in the optional `options` object.
 *
 * Illegal Characters on Various Operating Systems
 *   / ? < > \ : * | "
 *
 * Unicode Control codes
 *   C0 (0x00-0x1f) & C1 (0x80-0x9f)
 *
 * Reserved filenames on Unix-based systems (".", "..")
 * Reserved filenames in Windows ("CON", "PRN", "AUX", "NUL", "COM1",
 * "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
 * "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", and
 * "LPT9") case-insensitively and with or without filename extensions.
 *
 * Capped at 255 bytes in length (UTF-8).
 *
 * @param {String} input Original filename
 * @param {Object} options {replacement: String}
 * @return {String} Sanitized filename
 */
var truncate = require("truncate-utf8-bytes");

var illegalRe = /[\/\?<>\\:\*\|"]/g;
var controlRe = /[\x00-\x1f\x80-\x9f]/g;
var reservedRe = /^\.+$/;
var windowsReservedRe = /^(con|prn|aux|nul|com[0-9]|lpt[0-9])(\..*)?$/i;

function sanitize(input, replacement) {
  var sanitized = input
    .replace(illegalRe, replacement)
    .replace(controlRe, replacement)
    .replace(reservedRe, replacement)
    .replace(windowsReservedRe, replacement);
  // Cap the result at 255 UTF-8 bytes.
  return truncate(sanitized, 255);
}

module.exports = function (input, options) {
  var replacement = (options && options.replacement) || '';
  var output = sanitize(input, replacement);
  if (replacement === '') {
    return output;
  }
  // Second pass with an empty replacement, in case the replacement
  // string itself produced an unsafe or reserved name.
  return sanitize(output, '');
};
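
To see what the four patterns do on their own, here is a minimal sketch that applies the same regular expressions in the same order, leaving out the byte truncation step. The helper name `stripUnsafe` is hypothetical and not part of the gist.

```javascript
// Same patterns as in the gist above; stripUnsafe is a hypothetical helper name.
const illegalRe = /[\/\?<>\\:\*\|"]/g;
const controlRe = /[\x00-\x1f\x80-\x9f]/g;
const reservedRe = /^\.+$/;
const windowsReservedRe = /^(con|prn|aux|nul|com[0-9]|lpt[0-9])(\..*)?$/i;

// Applies the four replacements in the same order as sanitize(), minus truncation.
function stripUnsafe(input, replacement = "") {
  return input
    .replace(illegalRe, replacement)
    .replace(controlRe, replacement)
    .replace(reservedRe, replacement)
    .replace(windowsReservedRe, replacement);
}

console.log(stripUnsafe("a/b?c.txt"));      // "abc.txt"
console.log(stripUnsafe("a/b?c.txt", "_")); // "a_b_c.txt"
console.log(stripUnsafe("CON.txt"));        // ""
console.log(stripUnsafe("..", "."));        // "." (still a reserved name)
console.log(stripUnsafe(".", ""));          // ""
```

The last two calls suggest why the exported function sanitizes a second time with an empty replacement: a replacement such as "." can itself turn the result into a reserved name.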

Alynva commented Jun 27, 2020

I think that truncate(sanitized, 255) can be replaced with sanitized.split("").splice(0, 255).join(""), so we don't need the truncate-utf8-bytes lib...

Copy link

Also, it seems that truncate would cut the file name extension?
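
One possible way to address the extension concern is to reserve room for the extension and truncate only the base name. The following is a rough sketch under that assumption, not something the gist does; `truncateBytes` and `truncateKeepingExtension` are hypothetical names, and the "last dot" rule for finding the extension is a simplification.

```javascript
const encoder = new TextEncoder();

// Longest prefix of str whose UTF-8 encoding fits in maxBytes,
// cut on a code point boundary.
function truncateBytes(str, maxBytes) {
  let bytes = 0;
  let out = "";
  for (const ch of str) { // for...of iterates by code point, not UTF-16 unit
    const size = encoder.encode(ch).length;
    if (bytes + size > maxBytes) break;
    bytes += size;
    out += ch;
  }
  return out;
}

// Hypothetical helper: cap a filename at maxBytes while keeping its extension.
function truncateKeepingExtension(name, maxBytes = 255) {
  const dot = name.lastIndexOf(".");
  // No extension, or a dotfile such as ".bashrc": truncate the whole name.
  if (dot <= 0) return truncateBytes(name, maxBytes);
  const base = name.slice(0, dot);
  const ext = name.slice(dot); // includes the leading dot
  const extBytes = encoder.encode(ext).length;
  // If the extension alone does not fit, fall back to plain truncation.
  if (extBytes >= maxBytes) return truncateBytes(name, maxBytes);
  return truncateBytes(base, maxBytes - extBytes) + ext;
}

console.log(truncateKeepingExtension("a".repeat(300) + ".txt").length); // 255
console.log(truncateKeepingExtension("☃☃☃.txt", 8));                    // "☃.txt"
```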

Copy link

@Alynva using sanitized.split("").splice(0, 255).join("") is not a good idea: complex symbols like emojis are made up of more than one UTF-16 code unit, so if you split a string containing a single emoji, it will return an array of 2 elements, and the cut can land in the middle of the emoji.
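
A small sketch of the difference between UTF-16 code units, code points, and UTF-8 bytes, which is what makes the split/splice approach unreliable. The variable names are only for illustration.

```javascript
const fileName = "a\u{1F600}"; // "a" followed by one emoji (U+1F600)

console.log(fileName.length);                           // 3 UTF-16 code units
console.log(fileName.split("").length);                 // 3, the emoji becomes two surrogates
console.log(Array.from(fileName).length);               // 2 code points
console.log(new TextEncoder().encode(fileName).length); // 5 UTF-8 bytes (1 + 4)

// Cutting by code units can leave half of a surrogate pair behind.
const cut = fileName.split("").splice(0, 2).join("");
console.log(cut.charCodeAt(1).toString(16));            // "d83d", a lone high surrogate

// 255 code units can also be far more than 255 bytes.
console.log(new TextEncoder().encode("☃".repeat(255)).length); // 765
```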

Techn1x commented May 5, 2023

With modern JavaScript, we can use TextEncoder and TextDecoder to do the truncation for us, counting bytes rather than characters, which matters for complex characters that take more than 1 byte (e.g. a☃ is 2 characters but 1 byte + 3 bytes = 4 bytes)

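const truncate = (sanitized: string, length: number): string => {
  // Encode to UTF-8 so that `length` counts bytes, not UTF-16 code units.
  const uint8Array = new TextEncoder().encode(sanitized)
  const truncated = uint8Array.slice(0, length)
  // If the cut lands inside a multi-byte character, decode() emits U+FFFD for the partial sequence.
  return new TextDecoder().decode(truncated)
}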

Extra points: new Blob([sanitized]).size will also give you the byte size (though it is less helpful for truncation)
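
One caveat with slicing the encoded bytes directly: if the cut falls inside a multi-byte character, TextDecoder (in its default, non-fatal mode) substitutes U+FFFD for the partial sequence, and U+FFFD itself takes 3 bytes in UTF-8, so the result can end up longer than the limit. A minimal sketch of one way around this, assuming it is acceptable to drop the partially cut character; `truncateUtf8` is a hypothetical name.

```javascript
// Hypothetical helper: truncate to at most maxBytes of UTF-8
// without splitting a character.
function truncateUtf8(str, maxBytes) {
  const bytes = new TextEncoder().encode(str);
  if (bytes.length <= maxBytes) return str;
  let end = Math.max(0, maxBytes);
  // UTF-8 continuation bytes have the form 10xxxxxx. If the first dropped
  // byte is one, the cut is inside a character, so back up to its lead byte.
  while (end > 0 && (bytes[end] & 0xc0) === 0x80) end--;
  return new TextDecoder().decode(bytes.subarray(0, end));
}

// Plain byte slicing, for comparison.
const naive = new TextDecoder().decode(new TextEncoder().encode("a☃").slice(0, 2));

console.log(naive === "a\uFFFD");                    // true
console.log(new TextEncoder().encode(naive).length); // 4, more than the 2 requested
console.log(truncateUtf8("a☃", 2));                  // "a"
console.log(truncateUtf8("a☃", 4));                  // "a☃"
```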
