@dgadiraju
Created October 22, 2018 23:58
sudo yum -y install git
# The command below clones the repository into a directory named data
git clone https://github.com/dgadiraju/data.git
# One of the directories is cards; unzip largedeck.txt.gz from it
gunzip data/cards/largedeck.txt.gz
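# Optional sanity check, added as a minimal sketch: confirm that the two files
# copied into HDFS later exist locally and are not empty.
# require_file is a hypothetical helper name, not part of the original script.
require_file() {
  # -s is true only when the path exists and has a size greater than zero
  if [ -s "$1" ]; then
    echo "ok: $1"
  else
    echo "missing or empty: $1" >&2
    return 1
  fi
}
for f in data/cards/deckofcards.txt data/cards/largedeck.txt; do
  require_file "$f" || echo "fix this before running the hadoop fs -put commands" >&2
done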
# Create the user space (HDFS home directory)
# hdfs is the superuser for HDFS
# The hdfs user exists on every server where HDFS is configured (including gateways)
# Creating the user space requires hdfs or another superuser
# The two commands below create the user space for the user itversity and make that user its owner
sudo -u hdfs hadoop fs -mkdir -p /user/itversity
sudo -u hdfs hadoop fs -chown itversity:itversity /user/itversity
# Now, as the user itversity, we can run the remaining commands to copy data into HDFS
hadoop fs -mkdir /user/itversity/cards
hadoop fs -put data/cards/deckofcards.txt /user/itversity/cards
hadoop fs -put data/cards/largedeck.txt /user/itversity/cards
# We can get the metadata of the files (files, blocks, and block locations) using the hdfs fsck command
hdfs fsck /user/itversity/cards -files -blocks -locations
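# Rough cross-check of the fsck output, added as a sketch: the number of blocks
# a file should occupy is ceil(file size / block size).
# expected_blocks is a hypothetical helper, not part of the original script.
# Assumption: hadoop fs -stat reports the file size in bytes (%b) and the
# block size in bytes (%o) on your Hadoop version.
expected_blocks() {
  # $1 = file size in bytes, $2 = block size in bytes
  # Integer ceiling division: add (block size - 1) before dividing
  echo $(( ($1 + $2 - 1) / $2 ))
}
# Only query HDFS when the hadoop client is available on this machine
if command -v hadoop >/dev/null 2>&1; then
  stat_out=$(hadoop fs -stat "%b %o" /user/itversity/cards/largedeck.txt)
  size=${stat_out% *}        # text before the space: file size
  blocksize=${stat_out#* }   # text after the space: block size
  echo "expected blocks for largedeck.txt: $(expected_blocks "$size" "$blocksize")"
fi
# Compare the printed number with the block count fsck reports for the same file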