

@neoGeneva
Created September 22, 2011 13:34
Detect Encoding in C#
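// Requires: using System; using System.IO; using System.Linq; using System.Text;
public static Encoding GetFileEncoding(string path)
{
    if (path == null)
        throw new ArgumentNullException(nameof(path));

    // Every encoding that defines a byte order mark (BOM), longest BOM first, so that
    // UTF-32 LE (FF FE 00 00) is tested before UTF-16 LE (FF FE), which is a prefix of it.
    var encodings = Encoding.GetEncodings()
        .Select(e => e.GetEncoding())
        .Select(e => new { Encoding = e, Preamble = e.GetPreamble() })
        .Where(e => e.Preamble.Length > 0)
        .OrderByDescending(e => e.Preamble.Length)
        .ToArray();

    var maxPreambleLength = encodings.Max(e => e.Preamble.Length);
    byte[] buffer = new byte[maxPreambleLength];
    int bytesRead = 0;

    using (var stream = File.OpenRead(path))
    {
        // Read may return fewer bytes than requested; loop until the buffer is full or the file ends.
        int n;
        while (bytesRead < buffer.Length
            && (n = stream.Read(buffer, bytesRead, buffer.Length - bytesRead)) > 0)
            bytesRead += n;
    }

    // Only compare BOMs that fit in the bytes actually read, so zero padding in the
    // buffer cannot produce a false match on a very short file.
    return encodings
        .Where(enc => enc.Preamble.Length <= bytesRead
            && enc.Preamble.SequenceEqual(buffer.Take(enc.Preamble.Length)))
        .Select(enc => enc.Encoding)
        .FirstOrDefault() ?? Encoding.Default;
}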
@neoGeneva

Works for files that start with a byte order mark (BOM): UTF-8, UTF-16, UTF-16BE, UTF-32 and UTF-32BE. Files without a BOM fall back to Encoding.Default.
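
For reference, the five BOMs involved can also be checked against a fixed table, without going through Encoding.GetEncodings(). This is only a minimal sketch, not part of the gist: the class name BomSniffer and the method DetectFromBytes are made up for illustration, and it assumes C# 7 or later for the tuple syntax. The table is ordered longest BOM first because the UTF-16 LE mark (FF FE) is a prefix of the UTF-32 LE mark (FF FE 00 00).

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class BomSniffer
{
    // Known BOMs, longest first so UTF-32 LE is tested before UTF-16 LE.
    private static readonly (byte[] Bom, Encoding Encoding)[] Candidates =
    {
        (new byte[] { 0xFF, 0xFE, 0x00, 0x00 }, new UTF32Encoding(false, true)),  // UTF-32 LE
        (new byte[] { 0x00, 0x00, 0xFE, 0xFF }, new UTF32Encoding(true, true)),   // UTF-32 BE
        (new byte[] { 0xEF, 0xBB, 0xBF },       new UTF8Encoding(true)),          // UTF-8
        (new byte[] { 0xFF, 0xFE },             new UnicodeEncoding(false, true)), // UTF-16 LE
        (new byte[] { 0xFE, 0xFF },             new UnicodeEncoding(true, true)),  // UTF-16 BE
    };

    // Returns the encoding whose BOM begins `head`, or null when no BOM matches.
    public static Encoding DetectFromBytes(byte[] head)
    {
        if (head == null)
            throw new ArgumentNullException(nameof(head));

        foreach (var (bom, encoding) in Candidates)
        {
            if (StartsWith(head, bom))
                return encoding;
        }
        return null;
    }

    // True when `data` is at least as long as `prefix` and begins with it.
    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data.Length < prefix.Length)
            return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i])
                return false;
        }
        return true;
    }
}
```

Passing the first four bytes of a file to BomSniffer.DetectFromBytes is enough, since no BOM in the table is longer than that. A null result means the file has no recognised BOM, and the caller has to pick a fallback itself.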
