@sambos
Last active April 10, 2018 21:44
hdfs commands

Useful HDFS Commands

The output of -du is displayed in this format:
 +-------------------------------------------------------------------+ 
 | size  |  disk_space_consumed_with_all_replicas  |  full_path_name | 
 +-------------------------------------------------------------------+ 
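
As a rough illustration of how the three columns relate, the sketch below divides the second column by the first to estimate the effective replication of each entry. The function name `replication_ratio` is hypothetical, and the sketch assumes the three-column layout shown above, raw byte counts (no -h), and paths without spaces.

```shell
# Print "<path> <ratio>" for each line of `hdfs dfs -du` output read from stdin.
# Assumes: three whitespace-separated columns, sizes in raw bytes (no -h),
# and no spaces in the path. Entries with size 0 are skipped to avoid
# dividing by zero.
replication_ratio() {
  awk '$1 > 0 { printf "%s %.1f\n", $3, $2 / $1 }'
}

# Hypothetical usage against a live cluster:
#   hdfs dfs -du /$yourDirectoryName | replication_ratio
```

A ratio close to 3 would be consistent with a replication factor of 3, but treat the result as an estimate rather than the configured value.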
 
-du [-s] [-h] ... : Show the amount of space, in bytes, used by the files that match the specified file pattern.

-s : Rather than showing the size of each individual file that matches the pattern, shows the total (summary) size.

-h : Formats file sizes in a human-readable fashion (for example in MB, GB or TB) rather than as a number of bytes.

Check the size of each file in a directory

hdfs dfs -du /$yourDirectoryName

Check dir size

hdfs dfs -du -s -h /$yourDirectoryName

Check dir size for multiple paths

hdfs dfs -du -s -h /somepath/{tag1,tag2}/path1/path2
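
With several paths, -du -s prints one summary line per path rather than a grand total. A minimal sketch for adding them up is shown here; the helper name `total_bytes` is hypothetical, and it assumes the command is run without -h so that the first column is a plain byte count.

```shell
# Sum the first column (size in bytes) of `hdfs dfs -du -s` output read
# from stdin and print the grand total.
# Assumes raw byte counts, i.e. the command was run without -h.
total_bytes() {
  awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}

# Hypothetical usage against a live cluster:
#   hdfs dfs -du -s /somepath/{tag1,tag2}/path1/path2 | total_bytes
```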

Check HDFS block size on your cluster

hdfs getconf -confKey dfs.blocksize
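
The block size can be combined with a file size from -du to estimate how many blocks a file occupies. The sketch below is only an illustration: `blocks_needed` is a hypothetical helper, and it assumes both values are plain byte counts (a configured value with a unit suffix would need converting first).

```shell
# Print the number of blocks needed to hold a file.
#   $1 = file size in bytes
#   $2 = block size in bytes
# Uses integer ceiling division; an empty file yields 0 blocks.
blocks_needed() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# Hypothetical usage against a live cluster:
#   blocks_needed 300000000 "$(hdfs getconf -confKey dfs.blocksize)"
```

For instance, with a block size of 134217728 bytes (128 MB), a file of 300000000 bytes would span 3 blocks.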
