Jeff Day JeffRDay

JeffRDay / Note.md
Last active May 10, 2024 09:17
Find Non-UTF8 Encoded Characters using Visual Studio Code (VS Code)

I ran into a problem with Python where a file I wanted to read in and parse contained unexpected non-UTF-8 encoded characters. I am certain there are many ways to solve this problem, but I am capturing my quick and dirty approach below for posterity.

  1. Open the file, then open Find (Ctrl+F, or Cmd+F on macOS).
  2. In the Find box, turn on regular-expression mode (the `.*` toggle) and copy/paste the regex below:
^([\x00-\x7F]|[\xC2-\xDF][\x80-\xBF]|\xE0[\xA0-\xBF][\x80-\xBF]|[\xE1-\xEC][\x80-\xBF]{2}|\xED[\x80-\x9F][\x80-\xBF]|[\xEE-\xEF][\x80-\xBF]{2}|\xF0[\x90-\xBF][\x80-\xBF]{2}|[\xF1-\xF3][\x80-\xBF]{3}|\xF4[\x80-\x8F][\x80-\xBF]{2})*$
  3. VS Code will highlight every line made up entirely of valid UTF-8 sequences. So, you just have to skim the file looking for lines without highlighting; those are the ones containing non-UTF-8 characters.
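
Since the file was headed for Python anyway, the same check can be sketched at the byte level, which avoids depending on how the editor decodes the file. This is a minimal sketch, not part of the original approach: the function name `find_invalid_utf8_lines` and the sample bytes are made up for illustration, and it relies on Python's strict UTF-8 decoder instead of the regex above.

```python
def find_invalid_utf8_lines(data: bytes) -> list[tuple[int, int]]:
    """Return (line_number, byte_offset_in_line) for each line that is not valid UTF-8.

    Hypothetical helper: line numbers are 1-based, and the offset points at the
    first byte on that line that the strict UTF-8 decoder rejects.
    """
    bad = []
    # Split on b"\n" so line numbers line up with what an editor shows.
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")  # strict decoding raises on malformed bytes
        except UnicodeDecodeError as err:
            bad.append((number, err.start))
    return bad


# Made-up sample: line 2 holds a valid two-byte "é", line 3 a lone Latin-1 0xE9.
sample = b"plain ascii\ncaf\xc3\xa9 is valid\ncaf\xe9 is Latin-1\n"
print(find_invalid_utf8_lines(sample))
```

To run it against a real file, read the bytes with `open(path, "rb").read()` (where `path` is whatever file you are checking) and pass them in; each reported pair tells you which line to jump to in VS Code.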
bosmacs / latex.template
Created June 28, 2011 19:39 — forked from michaelt/latex.template
Simple Pandoc latex.template with comments
%!TEX TS-program = xelatex
\documentclass[12pt]{scrartcl}
% The declaration of the document class:
% The second line here, i.e.
% \documentclass[12pt]{scrartcl}
% is a standard LaTeX document class declaration:
% we say what kind of document we are making in curly brackets,
% and specify any options in square brackets.