#linux #bash #cli #filesystem
See this post on GitHub for context.
We can use the du command to determine disk usage. We need to combine it with the find command to check only file sizes and exclude directory sizes.
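To illustrate why find is needed, here is a minimal sketch using a scratch directory. The directory layout and the file name big.bin are made up for this example. du on its own prints one cumulative total per directory, while find with -type f hands du only regular files.

```shell
# Hypothetical scratch directory, used only for this illustration.
demo=$(mktemp -d)
mkdir "$demo/sub"
head -c 2097152 /dev/zero > "$demo/sub/big.bin"   # a 2 MiB file of zeros

# du alone: one line per directory, each a cumulative total of its contents.
du -h "$demo"

# find + du: one line per regular file, with no directory totals.
find "$demo" -type f -exec du -h {} +
```

The first command prints two lines (one for sub and one for the scratch directory itself), the second prints a single line for big.bin. Remove the scratch directory afterwards with rm -rf "$demo".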
List directories, current level only (depth 1), in human-readable format, then sort in reverse order using human-numeric sorting. Human-numeric sorting takes the unit suffixes that du prints (K, M, G, etc.) into account, so 2M sorts above 900K:
du -h -d 1 | sort -hr
or
du --human-readable --max-depth=1 | sort --human-numeric-sort --reverse
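To see what human-numeric sorting changes, here is a small sketch with made-up sizes (illustrative values, not real du output), comparing it against a plain numeric sort:

```shell
# Made-up sizes in the style of `du -h` output (illustrative values only).
sizes=$(printf '%s\n' 200K 1.5G 3M 512)

# Plain reverse numeric sort looks only at the leading number and ignores the suffix.
printf '%s\n' "$sizes" | sort -nr
# → 512, 200K, 3M, 1.5G

# Human-numeric sort ranks by suffix first, then by the number.
printf '%s\n' "$sizes" | sort -hr
# → 1.5G, 3M, 200K, 512
```

This is why the pipelines here pair du -h with sort -h rather than sort -n.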
Same as above, but include files as well (--all):
du -ahd 1 | sort -hr
or
du --all --human-readable --max-depth=1 | sort --human-numeric-sort --reverse
Top 20 biggest files and directories, at any depth:
du -ah | sort -hr | head -n 20
or
du --all --human-readable | sort --human-numeric-sort --reverse | head --lines=20
Find only files (-type f) and execute du on them (-exec du -ah {} +). Use human-readable sizes (K, M, G suffixes), human-sort them, and return just the top 50 results:
find . -type f -exec du -ah {} + | sort -hr | head -n 50
Exclude the .git directory:
find . -not -path "./.git/*" -type f -exec du -ah {} + | sort -hr
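One thing worth noting: -not -path only filters the results, so find still walks through everything inside .git. A possible alternative, sketched here as an option rather than as this post's method, is -prune, which stops find from descending into the directory at all. The function name list_files_pruned is hypothetical and exists only to make the sketch reusable.

```shell
# list_files_pruned is a hypothetical helper name used for this sketch.
# -path ./.git -prune : do not descend into ./.git
# -o                  : otherwise, apply the remaining tests and action
list_files_pruned() {
  find . -path ./.git -prune -o -type f -exec du -ah {} + | sort -hr
}

# Usage: run it from the directory you want to inspect.
# list_files_pruned | head -n 50
```

On large repositories this may be noticeably quicker, because the contents of .git are never visited.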
We can pipe the results to bat to make them easier to browse (bat is non-standard and needs to be installed; if you can’t install it, try less or nano instead).
find . -not -path "./.git/*" -type f -exec du -ah {} + | sort -hr | bat
We could also use the fd command, a more modern alternative to the classic find utility that is easier to use and often faster.
fd . --hidden --type=file --exclude='.git' -x du --all --human-readable | sort -hr | bat
or
fd . -H -tf -E '.git' -x du -ah | sort -hr | bat
We can also do the same in Nushell. It’s a touch more verbose, but is very readable and easy to understand at a glance. In this case we’re:
- Using Nushell’s own ls command to find everything in the current directory and below (using the glob pattern **/*),
- Excluding everything with .git or node_modules in the name using a regex pattern,
- Excluding all directories (so we’re returning files only),
- Sorting intelligently by size in reverse order (largest first),
- Selecting only the size and the name columns,
- Converting to tsv format,
- Copying the result to the clipboard, ready for easy pasting into a spreadsheet.
ls -a **/* | where name !~ '\.git\\|node_modules' and type != dir | sort-by size -r | select size name | to tsv | clip
The command is longer than the typical Linux command, but it’s much easier to read at a glance and remember how to write, especially as Nushell comes with excellent command completion support.