@ChuckSavage
Created August 29, 2017 18:29
C# Is file an image and get its type
// Includes a mini-program for checking and fixing files that have no extension
// Only checks for the most common types
// If you create a better version, please upload it here.
using System;
using System.Collections.Generic;
using System.IO;

namespace AppendJPG
{
    class Program
    {
        static void Main(string[] args)
        {
            foreach (string file in Directory.EnumerateFiles(Environment.CurrentDirectory))
            {
                //var x = Path.GetExtension(file);
                if (Path.GetExtension(file).Length == 0)
                {
                    //var n = Path.GetFileNameWithoutExtension(file);
                    IsImageExtension.ImageType type;
                    if (file.IsImage(out type))
                    {
                        var ext = type.ToString().ToLower();
                        //Console.WriteLine(ext);
                        File.Move(file, file + "." + ext);
                    }
                    //var t = type;
                }
            }
        }
    }

    public static class IsImageExtension
    {
        static List<string> jpg;
        static List<string> bmp;
        static List<string> gif;
        static List<string> png;

        public enum ImageType
        {
            JPG,
            BMP,
            GIF,
            PNG,
            NONE
        }

        const string JPG = "FF";
        const string BMP = "42";
        const string GIF = "47";
        const string PNG = "89";

        static IsImageExtension()
        {
            jpg = new List<string> { "FF", "D8" };
            bmp = new List<string> { "42", "4D" };
            gif = new List<string> { "47", "49", "46" };
            png = new List<string> { "89", "50", "4E", "47", "0D", "0A", "1A", "0A" };
        }

        public static bool IsImage(this string file, out ImageType type)
        {
            type = ImageType.NONE;
            if (string.IsNullOrWhiteSpace(file)) return false;
            if (!File.Exists(file)) return false;
            using (var stream = File.OpenRead(file))
                return stream.IsImage(out type);
        }

        public static bool IsImage(this Stream stream, out ImageType type)
        {
            type = ImageType.NONE;
            stream.Seek(0, SeekOrigin.Begin);
            string bit = stream.ReadByte().ToString("X2");
            switch (bit)
            {
                case JPG:
                    if (stream.IsImage(jpg))
                    {
                        type = ImageType.JPG;
                        return true;
                    }
                    break;
                case BMP:
                    if (stream.IsImage(bmp))
                    {
                        type = ImageType.BMP;
                        return true;
                    }
                    break;
                case GIF:
                    if (stream.IsImage(gif))
                    {
                        type = ImageType.GIF;
                        return true;
                    }
                    break;
                case PNG:
                    if (stream.IsImage(png))
                    {
                        type = ImageType.PNG;
                        return true;
                    }
                    break;
                default:
                    break;
            }
            return false;
        }

        public static bool IsImage(this Stream stream, List<string> comparer)
        {
            stream.Seek(0, SeekOrigin.Begin);
            foreach (string c in comparer)
            {
                string bit = stream.ReadByte().ToString("X2");
                if (0 != string.Compare(bit, c))
                    return false;
            }
            return true;
        }
    }
}

@studen28

studen28 commented Jun 9, 2019

Thank you so much! Works correctly.

@EvilVir

EvilVir commented Apr 26, 2020

Cleaner and faster version:

public enum FileType
{
	Unknown,
	Jpeg,
	Bmp,
	Gif,
	Png,
	Pdf
}

public static class FilesHelper
{
	private static readonly Dictionary<FileType, byte[]> KNOWN_FILE_HEADERS = new Dictionary<FileType, byte[]>()
	{
		{ FileType.Jpeg, new byte[]{ 0xFF, 0xD8 }}, // JPEG
		{ FileType.Bmp, new byte[]{ 0x42, 0x4D }}, // BMP
		{ FileType.Gif, new byte[]{ 0x47, 0x49, 0x46 }}, // GIF
		{ FileType.Png, new byte[]{ 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A }}, // PNG
		{ FileType.Pdf, new byte[]{ 0x25, 0x50, 0x44, 0x46 }} // PDF
	};

	public static FileType GetKnownFileType(ReadOnlySpan<byte> data)
	{
		foreach (var check in KNOWN_FILE_HEADERS)
		{
			if (data.Length >= check.Value.Length)
			{
				var slice = data.Slice(0, check.Value.Length);
				if (slice.SequenceEqual(check.Value))
				{
					return check.Key;
				}
			}
		}

		return FileType.Unknown;
	}
}

@ChuckSavage
Author

ChuckSavage commented Apr 26, 2020

Thanks EvilVir. To make that work, the NuGet package System.Memory needs to be installed so that ReadOnlySpan is recognized. It also seems you need to target .NET 4.7.1. Plus, there is no integration with the current program: I haven't figured out how to read a stream and call GetKnownFileType() with the ReadOnlySpan.

@EvilVir

EvilVir commented Apr 27, 2020

This is for .NET Core 3.1 but you can just replace ReadOnlySpan<byte> with byte[] and it will work the same under classic .NET.

As for an example, it's quite easy:

var data = File.ReadAllBytes(@"C:\Path\To\Your\File.jpg");
var result = GetKnownFileType(data);
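
One caveat on the byte[] swap for classic .NET: a plain byte[] has no Slice method, so the body of the method needs a small prefix comparison as well as the changed parameter type. The following is a minimal sketch of that variant, not the original author's code; the names FilesHelperClassic and StartsWith are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public enum FileType { Unknown, Jpeg, Bmp, Gif, Png, Pdf }

// Hypothetical byte[]-only variant; needs neither System.Memory nor System.Linq.
public static class FilesHelperClassic
{
    // Same signatures as in the comment above.
    private static readonly Dictionary<FileType, byte[]> KnownFileHeaders = new Dictionary<FileType, byte[]>
    {
        { FileType.Jpeg, new byte[] { 0xFF, 0xD8 } },
        { FileType.Bmp, new byte[] { 0x42, 0x4D } },
        { FileType.Gif, new byte[] { 0x47, 0x49, 0x46 } },
        { FileType.Png, new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A } },
        { FileType.Pdf, new byte[] { 0x25, 0x50, 0x44, 0x46 } }
    };

    public static FileType GetKnownFileType(byte[] data)
    {
        if (data == null) return FileType.Unknown;

        foreach (var check in KnownFileHeaders)
        {
            if (StartsWith(data, check.Value))
            {
                return check.Key;
            }
        }

        return FileType.Unknown;
    }

    // True when data begins with every byte of header, in order.
    private static bool StartsWith(byte[] data, byte[] header)
    {
        if (data.Length < header.Length) return false;

        for (int i = 0; i < header.Length; i++)
        {
            if (data[i] != header[i]) return false;
        }

        return true;
    }
}
```

It is called the same way as the two-line example above, for instance FilesHelperClassic.GetKnownFileType(File.ReadAllBytes(path)).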

With minimal changes you can make it work with streams, which should be better for memory, but you need to have seekable streams (FileStream is seekable). It would be like this:

        public static FileType GetKnownFileType(Stream data)
        {
            foreach (var check in KNOWN_FILE_HEADERS)
            {
                data.Seek(0, SeekOrigin.Begin);
                
                var slice = new byte[check.Value.Length];
                data.Read(slice, 0, check.Value.Length);
                if (slice.SequenceEqual(check.Value))
                {
                    return check.Key;
                }
            }

            data.Seek(0, SeekOrigin.Begin);
            return FileType.Unknown;
        }

And usage:

using var stream = File.OpenRead(@"C:\Path\To\Your\File.jpg");
var result = GetKnownFileType(stream);
// No explicit Close() is needed: the using declaration disposes the stream at the end of the scope.
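
A possible refinement of the stream approach: Stream.Read is allowed to return fewer bytes than requested, and its return value is ignored in the method above. The sketch below reads the header once, looping until enough bytes have arrived or the stream ends, so the result can be handed to the span or byte[] version without seeking at all. HeaderReader, ReadHeader and MaxHeaderLength are hypothetical names, and the value 8 assumes PNG remains the longest signature in the table.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the code posted in this thread.
public static class HeaderReader
{
    // Assumption: the longest signature checked (PNG) is 8 bytes long.
    public const int MaxHeaderLength = 8;

    // Reads up to maxLength bytes starting at the stream's current position.
    // Returns a shorter array when the stream ends early.
    public static byte[] ReadHeader(Stream stream, int maxLength = MaxHeaderLength)
    {
        var buffer = new byte[maxLength];
        int total = 0;

        while (total < maxLength)
        {
            int read = stream.Read(buffer, total, maxLength - total);
            if (read == 0) break; // end of stream
            total += read;
        }

        if (total < maxLength)
        {
            Array.Resize(ref buffer, total);
        }

        return buffer;
    }
}
```

With this in place the call would look like GetKnownFileType(HeaderReader.ReadHeader(stream)). The stream is advanced by the bytes read, so a caller that needs the content afterwards still has to rewind a seekable stream.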

@manoochehrkateb

Can you please show how to use the version EvilVir posted above?

@3hxx

3hxx commented Sep 7, 2022

This is the perfect solution for anyone looking to verify that a file is of a certain file type. @EvilVir do you have a link to find the byte arrays for other file types?

@EvilVir

EvilVir commented Sep 7, 2022

@Freshmintyy they are widely available on the Internet. One list is on Wikipedia: https://en.m.wikipedia.org/wiki/List_of_file_signatures

Other one: https://www.garykessler.net/library/file_sigs.html

Search for "magic numbers", "file headers", etc. There are also NuGet packages for .NET that ship these lists along with functions to check them. Be aware that there are thousands and thousands of file types; checking them all without any index would be very slow and resource hungry. Therefore either use an index, split the work for parallel processing, etc., or, in cases such as this, just pick the couple of signatures you really need and check only against those.
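
One way to read the "use an index" advice is to group signatures by their first byte, so that a lookup only compares against the few signatures that could still match. This is a rough sketch under that assumption, reusing the five signatures from this thread; IndexedSignatures, BuildIndex and Detect are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public enum FileType { Unknown, Jpeg, Bmp, Gif, Png, Pdf }

// Hypothetical index keyed by the first signature byte.
public static class IndexedSignatures
{
    // First byte of a signature -> all signatures that start with that byte.
    private static readonly Dictionary<byte, List<KeyValuePair<FileType, byte[]>>> ByFirstByte = BuildIndex();

    private static Dictionary<byte, List<KeyValuePair<FileType, byte[]>>> BuildIndex()
    {
        var signatures = new Dictionary<FileType, byte[]>
        {
            { FileType.Jpeg, new byte[] { 0xFF, 0xD8 } },
            { FileType.Bmp, new byte[] { 0x42, 0x4D } },
            { FileType.Gif, new byte[] { 0x47, 0x49, 0x46 } },
            { FileType.Png, new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A } },
            { FileType.Pdf, new byte[] { 0x25, 0x50, 0x44, 0x46 } }
        };

        var index = new Dictionary<byte, List<KeyValuePair<FileType, byte[]>>>();
        foreach (var signature in signatures)
        {
            byte first = signature.Value[0];
            if (!index.TryGetValue(first, out var bucket))
            {
                bucket = new List<KeyValuePair<FileType, byte[]>>();
                index[first] = bucket;
            }
            bucket.Add(signature);
        }
        return index;
    }

    public static FileType Detect(byte[] data)
    {
        if (data == null || data.Length == 0) return FileType.Unknown;

        // Only signatures sharing the first byte are compared in full.
        if (!ByFirstByte.TryGetValue(data[0], out var candidates)) return FileType.Unknown;

        foreach (var candidate in candidates)
        {
            if (StartsWith(data, candidate.Value)) return candidate.Key;
        }
        return FileType.Unknown;
    }

    private static bool StartsWith(byte[] data, byte[] header)
    {
        if (data.Length < header.Length) return false;
        for (int i = 0; i < header.Length; i++)
        {
            if (data[i] != header[i]) return false;
        }
        return true;
    }
}
```

With five signatures the gain is negligible; the structure only starts to pay off when the table grows to hundreds or thousands of entries.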

@jasarsoft

jasarsoft commented Jun 23, 2023

Hi @EvilVir
We have a small error here.
When the appropriate code is found in the header, it is necessary to return the stream to the beginning before the return statement.

                if (slice.SequenceEqual(check.Value))
                {
                    data.Seek(0, SeekOrigin.Begin);
                    return check.Key;
                }
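
A variation on this fix, offered as a sketch and not as the thread's agreed solution: wrapping the inspection in try/finally guarantees the stream is repositioned on every exit path, including exceptions. Note that this version restores the position the caller started from, which is not necessarily the beginning. StreamPositionScope and Peek are hypothetical names.

```csharp
using System;
using System.IO;

// Hypothetical helper for inspecting a seekable stream without disturbing the caller.
public static class StreamPositionScope
{
    // Rewinds the stream, runs inspect, then restores the original position,
    // whether inspect returns normally or throws.
    public static T Peek<T>(Stream stream, Func<Stream, T> inspect)
    {
        if (!stream.CanSeek)
        {
            throw new NotSupportedException("Stream must be seekable.");
        }

        long original = stream.Position;
        try
        {
            stream.Seek(0, SeekOrigin.Begin);
            return inspect(stream);
        }
        finally
        {
            stream.Position = original;
        }
    }
}
```

A call would then look like StreamPositionScope.Peek(stream, s => GetKnownFileType(s)), leaving GetKnownFileType free of its own rewind logic.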
