@tlberglund
Created January 29, 2011 22:00
A Groovy script that lists every loose (unpacked) blob in a Git repository by hash and size, largest first. Run it from the root of the working tree.
#!/usr/bin/env groovy

// Loose objects live in .git/objects/<2 hex chars>/<remaining hex chars>.
// Collect the two-character fan-out directories, skipping 'info' and 'pack'.
def objectDirs = []
new File('.git/objects').eachDir { dir ->
    if (dir.name ==~ /[0-9a-f]{2}/) {
        objectDirs << dir
    }
}

// Rebuild each full hash from its directory name plus its file name.
// Using File.name avoids splitting paths on '/', which fails on Windows.
def hashes = []
objectDirs.each { dir ->
    dir.eachFile { file ->
        hashes << (dir.name + file.name)
    }
}

// Ask git for the size and type of every object (two processes per object).
def objects = [:]
hashes.each { hash ->
    objects[hash] = [ size: "git cat-file -s ${hash}".execute().text.trim(),
                      type: "git cat-file -t ${hash}".execute().text.trim() ]
}

// Keep only blobs and sort them by size, largest first.
def blobs = objects.findAll { key, value ->
    value.type == 'blob'
}.collect { key, value ->
    new GitBlob(hash: key, size: value.size.toInteger(), type: value.type)
}.sort { a, b -> b.size <=> a.size }

// Print one blob per line as "<hash><TAB><size in bytes>".
blobs.each { blob ->
    println "${blob.hash}\t${blob.size}"
}

class GitBlob {
    String hash
    int size
    String type

    String toString() {
        "${hash} (${size})"
    }
}
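
The script above only sees loose objects, so it prints nothing for blobs that have already been packed, and it starts two `git` processes per object. A minimal sketch of one alternative follows, assuming a Git version that supports `git cat-file --batch-check --batch-all-objects`. The helper names `parseBatchCheck` and `listBlobs` are hypothetical and not part of the original gist. It relies on the default `--batch-check` output format, one `<hash> <type> <size>` line per object.

```groovy
#!/usr/bin/env groovy
// Sketch only: parseBatchCheck and listBlobs are hypothetical helper names.

// Parse "<hash> <type> <size>" lines and return the blobs as maps
// ([hash: ..., bytes: ...]) sorted by size, largest first.
List<Map> parseBatchCheck(String text) {
    def rows = text.readLines().findAll { it.trim() }.collect { it.trim().split(/\s+/) }
    def blobRows = rows.findAll { it.size() == 3 && it[1] == 'blob' }
    def blobs = blobRows.collect { [hash: it[0], bytes: it[2].toLong()] }
    blobs.sort { a, b -> b.bytes <=> a.bytes }
}

// Run git once in the current directory and parse everything it reports,
// covering both loose and packed objects.
List<Map> listBlobs() {
    def proc = ['git', 'cat-file', '--batch-check', '--batch-all-objects'].execute()
    def out = new StringBuilder()
    def err = new StringBuilder()
    // Drain both streams so a large repository cannot block the child process.
    proc.waitForProcessOutput(out, err)
    if (proc.exitValue() != 0) {
        throw new IllegalStateException("git cat-file failed: ${err}")
    }
    parseBatchCheck(out.toString())
}
```

To get the same tab-separated output as the original script, call `listBlobs().each { println "${it.hash}\t${it.bytes}" }` from the root of a repository. Keeping the parsing in its own function makes it possible to check it against a sample string without running Git.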