@jamesrajendran
Created May 7, 2017 04:57
add a user
useradd etl_user -g hadoop
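if the hadoop group does not already exist, create it first (a rough sketch; group and user names taken from the command above)
getent group hadoop || sudo groupadd hadoop
sudo useradd -g hadoop etl_user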
identify user
id etl_user
login as a different user
sudo su - etl_user
as hdfs is the HDFS superuser in distributions like Hortonworks/Cloudera, switch to it to set up the new user's home directory
sudo su - hdfs
hadoop fs -mkdir /user/etl_user
hadoop fs -chown etl_user:supergroup /user/etl_user
change owner
hadoop fs -chown etl_user:supergroup /user/etl_user
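the -R flag applies ownership recursively; listing the parent directory confirms the change (a sketch, paths as above)
hadoop fs -chown -R etl_user:supergroup /user/etl_user
hadoop fs -ls /user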
to check admin commands
hdfs dfsadmin
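running it without arguments prints the full usage; -help shows details for a single subcommand (sketch)
hdfs dfsadmin -help setQuota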
to see quota and space usage
hadoop fs -count -q /user/etl_user
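the output columns are, in order, QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME; on recent Hadoop versions -h prints sizes in human-readable form (sketch)
hadoop fs -count -q -h /user/etl_user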
to set and clear quotas
hadoop dfsadmin -setQuota 10000 /user/etl_user
hadoop dfsadmin -setSpaceQuota 10G /user/etl_user
hadoop dfsadmin -clrQuota /user/etl_user
hadoop dfsadmin -clrSpaceQuota /user/etl_user
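note: the space quota counts raw, post-replication bytes, so with replication factor 3 a 10G space quota leaves room for only about 3.3G of file data; to allow roughly 10G of data you would set around 30G (sketch)
hadoop dfsadmin -setSpaceQuota 30G /user/etl_user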
get cluster size report
hadoop dfsadmin -report
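to pull just the capacity summary out of the report (a sketch using grep)
hadoop dfsadmin -report | grep -i -E 'capacity|remaining|used'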
hadoop job -list
hadoop job -kill <jobId>
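on YARN clusters the same listing/killing is usually done through the yarn CLI (sketch; <applicationId> comes from the list output)
yarn application -list
yarn application -kill <applicationId>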
change user with sudo permission:
sudo su - hdfs
RAM size of Namenode:
The NameNode requires roughly 150 bytes for each block plus 16 bytes for each additional replica, and all of it must be held in live memory. With the default replication factor of 3 that works out to about 182 bytes per block, so 7534776 blocks come to roughly 1.3GB. Adding the NameNode's other, non-file-related memory usage, 1.95GB sounds about right; an HDFS cluster of this size needs a bigger NameNode with more RAM.
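a quick sanity check of that arithmetic (182 bytes per block, block count as above)
echo '182 * 7534776' | bc    # ~1371329232 bytes, i.e. about 1.3GB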
------blocks info -----------
hdfs fsck <filepath> -files -blocks -locations
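fsck can also flag damaged data; -list-corruptfileblocks prints files with corrupt blocks (sketch, using the etl_user directory from above)
hdfs fsck /user/etl_user -files -blocks -locations
hdfs fsck / -list-corruptfileblocks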