@THeK3nger
Last active May 6, 2020 13:31
Find old binary blobs in a Git repository. Usage example: `ruby find-old-binaries.rb master 0.5 true`
#!/usr/bin/env ruby -w
head, threshold, only_orphans = ARGV
# ARGV values are strings and the string "false" is truthy, so compare explicitly.
only_orphans = (only_orphans == 'true')
head ||= 'HEAD'
Megabyte = 1000 ** 2
threshold = (threshold || 0.1).to_f * Megabyte
# Usage:
#   ruby find-old-binaries.rb <branch> <size limit in MB> <show only orphans?>
#
# If <show only orphans?> is "true", the script prints only the blobs whose
# path DOES NOT EXIST IN THE CURRENTLY CHECKED-OUT WORKING TREE.
#
# Useful for finding the binary blobs that can be removed from the Git history
# to reduce repository size (if you do not care about history).
big_files = {}
IO.popen("git rev-list #{head}", 'r') do |rev_list|
  rev_list.each_line do |commit|
    commit.chomp!
    # -z: NUL-terminated entries, -r: recurse into subtrees, -l: include object size
    `git ls-tree -zrl #{commit}`.split("\0").each do |object|
      _mode, _type, sha, size, path = object.split(/\s+/, 5)
      size = size.to_i
      # Later commits in the walk overwrite earlier ones, so the last commit
      # visited that contains the blob is the one reported.
      big_files[sha] = [path, size, commit] if size >= threshold
    end
  end
end
big_files.each do |sha, (path, size, commit)|
  where = `git show -s #{commit} --format='%h: %cr'`.chomp
  # A blob counts as an orphan when its path is missing from the working tree.
  orphan = !File.exist?(path)
  if only_orphans
    puts "%4.1fM\t%s\t(%s)" % [size.to_f / Megabyte, path, where] if orphan
  else
    puts "%4.1fM\t%s\t(%s) %s" % [size.to_f / Megabyte, path, where, orphan]
  end
end
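
The parsing step of the script can be tried without a repository. The following is a minimal sketch, not part of the original gist: `large_blobs`, `MEGABYTE`, and the sample records are hypothetical names and made-up data, laid out the way `git ls-tree -zrl` prints entries (`<mode> <type> <sha> <size>`, a tab, then the path, with entries separated by NUL bytes).

```ruby
# Hypothetical helper for illustration only; it is not part of the script above.
MEGABYTE = 1000**2

# Parse NUL-separated `git ls-tree -zrl` output and keep blobs of at least
# threshold_bytes. Returns an array of hashes with :sha, :size and :path.
def large_blobs(ls_tree_output, threshold_bytes)
  ls_tree_output.split("\0").map do |entry|
    # Limit the split to 5 fields so paths containing spaces stay intact.
    _mode, type, sha, size, path = entry.split(/\s+/, 5)
    next unless type == 'blob' # e.g. submodule entries report "-" as size
    size = size.to_i
    { sha: sha, size: size, path: path } if size >= threshold_bytes
  end.compact
end

# Made-up sample data; the hashes are placeholders, not real objects.
sample = [
  "100644 blob aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa  300000\tassets/big image.png",
  "100644 blob bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb     120\tREADME.md"
].join("\0") + "\0"

large_blobs(sample, 0.1 * MEGABYTE).each do |blob|
  puts format("%4.1fM\t%s", blob[:size].to_f / MEGABYTE, blob[:path])
end
```

Keeping the parsing in a function that takes a string makes the size filter easy to check against canned input before running it over a full `git rev-list` walk.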