@bradland
Created June 12, 2020 19:14
# Excerpt from ptools gem
# Returns whether or not +file+ is a binary non-image file, i.e. executable,
# shared object, etc. Note that this is NOT guaranteed to be 100% accurate.
# It performs a "best guess" based on a simple test of the first
# +File::Stat#blksize+ bytes of the file, or 4096 bytes, whichever is smaller.
#
# By default it checks whether more than 30 percent of the sampled characters
# are non-text characters, i.e. fall outside the range " ".."~". If so, the
# method returns true. To use a different threshold, pass it as a fraction
# (e.g. 0.50) in the second argument.
#
# Example:
#
# File.binary?('somefile.exe') # => true
# File.binary?('somefile.txt') # => false
#--
# Based on code originally provided by Ryan Davis (which, in turn, is
# based on Perl's -B switch).
#
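def self.binary?(file, percentage = 0.30)
  # Images and files that pass the BOM check are never reported as binary.
  return false if image?(file)
  return false if check_bom?(file)

  # Sample at most 4096 bytes from the start of the file.
  bytes = File.stat(file).blksize
  bytes = 4096 if bytes > 4096
  s = (File.read(file, bytes) || "")

  # Split the sample into single characters and compare the share that
  # falls outside " ".."~" against the threshold.
  s = s.encode('US-ASCII', :undef => :replace).split(//)
  ((s.size - s.grep(" ".."~").size) / s.size.to_f) > percentage
end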
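
The ratio test at the heart of the method can be tried on its own, without the `image?` and `check_bom?` helpers that the excerpt relies on. The following is a minimal sketch, not part of ptools: `non_text_ratio` and `binary_sample?` are hypothetical names, and they take an in-memory string rather than a file path so the arithmetic is easy to follow.

```ruby
# Hypothetical helper (not part of ptools): fraction of characters in
# +sample+ that fall outside the printable ASCII range " ".."~".
def non_text_ratio(sample)
  return 0.0 if sample.empty? # avoid dividing by zero on empty input

  # Same conversion step as the excerpt, applied to a binary copy of the string.
  chars = sample.b.encode('US-ASCII', undef: :replace).chars
  printable = chars.grep(' '..'~').size
  (chars.size - printable) / chars.size.to_f
end

# Hypothetical wrapper: applies the same "greater than percentage" comparison
# that the excerpt uses, with the same default threshold of 0.30.
def binary_sample?(sample, percentage = 0.30)
  non_text_ratio(sample) > percentage
end

puts non_text_ratio("hello world")     # all characters are printable
puts binary_sample?("\x00\x01\x02abc") # half of the characters are control bytes
```

Note that a newline or tab also lies outside `" ".."~"`, so it counts toward the non-text share in this sketch just like any other control character.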