@yashk
Last active December 14, 2022 16:03
hdu.sh - list the directories under an HDFS path, sorted by disk space in descending order; a good tool for figuring out what is occupying space on HDFS
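# Usage: hdu.sh <hdfs-path>; prints the entries under the path with human-readable sizes, largest first
hdfs dfs -du "$1" | awk '{print $1,$2,$3}' | sort -nr | xargs -n3 sh -c 'printf "%s %s %s\n" "$(numfmt --to=iec "$1")" "$(numfmt --to=iec "$2")" "$3"' _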
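The sort-and-format step can be tried without a cluster. The following is only a sketch under assumptions, not the gist author's code: `format_du` is a hypothetical helper name, the sample lines are made up to mimic three-column `hdfs dfs -du` output, and GNU `numfmt` is assumed to be installed.

```shell
# Hypothetical helper: reads "size used path" lines on stdin, sorts them
# numerically by the first column (largest first), and rewrites the two
# numeric columns in IEC units with numfmt.
format_du() {
  sort -nr | while read -r size used path; do
    printf '%s %s %s\n' "$(numfmt --to=iec "$size")" "$(numfmt --to=iec "$used")" "$path"
  done
}

# Made-up sample standing in for real `hdfs dfs -du` output.
printf '%s\n' \
  '1024 3072 /data/small' \
  '1073741824 3221225472 /data/big' \
  '1048576 3145728 /data/medium' | format_du
```

Because `read` puts the remainder of each line into its last variable, this variant would also keep paths that contain spaces intact, which the `xargs -n3` approach does not.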