@eddieh
Last active December 10, 2019 21:14
gist.el choked on a non-ASCII multibyte character
error in process sentinel: Multibyte text in HTTP request: PATCH /gists/1c784cd92b2de48a214d16734dd59807 HTTP/1.1
MIME-Version: 1.0
Connection: keep-alive
Extension: Security/Digest Security/SSL
Host: api.github.com
Accept-encoding: gzip
Accept: */*
User-Agent: URL/Emacs Emacs/26.3 (OpenStep; x86_64-apple-darwin14.5.0)
Authorization: token super-secrect-token-of-many-ascii-characters
Content-Type: application/json
Content-length: 5859

Locate an unknown non-ASCII Multi-byte Character in a File

Problem

Some program or API is failing because of a rogue multi-byte character in a file. So how do you locate the unknown character without examining every character in the file?

Answer (annoying, but correct)

Every search points to grep's -P option, but that is a GNU grep feature, and the BSD grep that macOS ships with does not support it. Get over it and install goddamn GNU grep. [fn::There definitely has to be a better way (something with non-GNU stuff).]

$ brew install grep

When brew is done, it will inform you that every command (binary) it installed has the prefix “g” for GNU.

All commands have been installed with the prefix "g".

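Thus, to use GNU grep on macOS, type ggrep. To find all occurrences of multi-byte characters in badfile.txt, run the following. Here -P enables Perl-compatible regular expressions, -n prints the line number of each match, and the pattern matches any byte outside the ASCII range 0x00-0x7F.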

ggrep --color='auto' -P -n '[^\x00-\x7F]' badfile.txt