@camertron
Created November 14, 2021 22:22
Lists top directories by file size
#! /usr/bin/env ruby
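require 'shellwords'

# Memoizes du output per path so repeated lookups don't re-scan the disk
$cache = {}

# Returns [size_in_blocks, path, directory?] for every entry du reports one
# level below path; du lists path itself last
def file_sizes_in(path)
  $cache[path] ||= begin
    raw = `du -d 1 2>/dev/null #{Shellwords.shellescape(path)}`

    raw.split("\n").map do |line|
      # split on the first run of whitespace only, so spaces inside the path survive
      size, sub_path = line.split(" ", 2)
      [size.to_i, sub_path, File.directory?(sub_path)]
    end
  end
end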
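
# Formats a du block count as a human-readable size
def filesize(size)
  # assumes 512-byte blocks (the BSD/macOS du default); GNU du defaults to 1024-byte blocks
  size *= 512
  units = %w[B KiB MiB GiB TiB PiB EiB ZiB]
  return '0.0 B' if size == 0

  exp = (Math.log(size) / Math.log(1024)).to_i
  # move up a unit when the value would otherwise be printed as 1024.0
  exp += 1 if size.to_f / 1024 ** exp >= 1024 - 0.05
  exp = units.size - 1 if exp > units.size - 1

  '%.1f %s' % [size.to_f / 1024 ** exp, units[exp]]
end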
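
# Descends from current_path, following the rank-th largest subdirectory at
# each level until a directory has no subdirectories left
def do_it(current_path, rank)
  puts "Starting at #{current_path}"

  loop do
    STDOUT.write("Calculating size of files and folders in #{current_path}...")
    STDOUT.flush

    # du lists the directory itself last; split it off without mutating the cached array
    *size_info, current_info = file_sizes_in(current_path)

    unless current_info
      puts "\r\e[KNo size information for #{current_path}"
      break
    end

    size_info = size_info.sort_by { |entry| entry[0] }
    puts "\r\e[K#{current_path} contains #{filesize(current_info[0])}"

    biggest = size_info.select { |entry| entry[2] }[-rank]
    break unless biggest

    puts "Largest subdirectory is #{biggest[1]} at #{filesize(biggest[0])}"
    current_path = biggest[1]
  end
end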
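
current_path = ARGV[0] || ENV["HOME"]
do_it(current_path, 1)

# drop the entry for current_path itself, then pick the second-largest subdirectory
root = File.expand_path(current_path)
subdirs = file_sizes_in(current_path).reject { |entry| File.expand_path(entry[1]) == root }
second_largest = subdirs.sort_by { |entry| entry[0] }[-2]

if second_largest
  puts "\nTraversing second-largest directory in #{current_path}"
  do_it(second_largest[1], 1)
end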
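
The script relies on `du -d 1` printing one `<blocks><TAB><path>` line per subdirectory, with the scanned directory itself on the last line. A minimal sketch of that parsing and ranking step, run against canned text so it works without touching the disk. The sample output and the helper names `parse_du` and `rank_subdirs` are hypothetical and not part of the gist:

```ruby
# Hypothetical sample of `du -d 1` output: "<blocks>\t<path>", with the
# scanned directory itself listed last.
SAMPLE_DU_OUTPUT = "8\t/tmp/demo/a\n2048\t/tmp/demo/my  files\n64\t/tmp/demo/b\n2120\t/tmp/demo\n"

# Parse each line into [blocks, path]; the limit of 2 keeps spaces inside paths intact.
def parse_du(raw)
  raw.each_line(chomp: true).map do |line|
    size, path = line.split("\t", 2)
    [size.to_i, path]
  end
end

# Split off the trailing entry (the directory itself) and sort the rest, largest first.
def rank_subdirs(raw)
  *subdirs, total = parse_du(raw)
  [total, subdirs.sort_by { |size, _path| -size }]
end

total, ranked = rank_subdirs(SAMPLE_DU_OUTPUT)
puts "#{total[1]}: #{total[0]} blocks"
ranked.each { |size, path| puts "  #{size}\t#{path}" }
```

Here `ranked[0]` is the largest subdirectory and `ranked[1]` the second largest, which are the two entries the script descends into.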