Thursday 4 October 2012

Hadoop Distributed File System (HDFS) Admin Command Reference - Hadoop Online Training


The dfs module for bin/hadoop provides common file and directory manipulation commands, but these all operate on individual objects within the file system. The dfsadmin module instead manipulates or queries the file system as a whole. This section describes how the commands in that module work.
Getting overall status: A brief status report for HDFS can be retrieved with bin/hadoop dfsadmin -report. This returns basic information about the overall health of the HDFS cluster, as well as some per-server metrics.
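For routine monitoring it can help to keep each report in a file so that successive runs can be compared. The following is a minimal sketch, not something shipped with Hadoop: the function name save_hdfs_report, the HADOOP variable, and the output file naming are illustrative assumptions. Only the dfsadmin -report command itself comes from the text above.

```shell
# Path to the hadoop launcher; override HADOOP if it lives elsewhere (assumption).
HADOOP="${HADOOP:-bin/hadoop}"

# save_hdfs_report [DIR]
# Hypothetical helper: runs `dfsadmin -report`, stores the output in
# DIR/hdfs-report-<timestamp>.txt, and prints the path of the file written.
save_hdfs_report() {
    report_dir="${1:-.}"
    report_file="$report_dir/hdfs-report-$(date +%Y%m%d-%H%M%S).txt"
    # Keep the file only if the command succeeded; drop a partial file otherwise.
    if "$HADOOP" dfsadmin -report > "$report_file"; then
        echo "$report_file"
    else
        rm -f "$report_file"
        return 1
    fi
}
```

Calling save_hdfs_report /var/tmp from the Hadoop installation directory would leave one timestamped report per run in /var/tmp (the directory is only an example).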
More involved status: If you need more detail about the state of the NameNode's metadata, the command bin/hadoop dfsadmin -metasave filename will record this information in filename. The metasave command enumerates the blocks that are under-replicated, in the process of being replicated, and scheduled for deletion. NB: The help for this command states that it "saves NameNode's primary data structures," but this is misleading; the NameNode's state cannot be restored from this information. It does, however, give a good picture of how the NameNode is managing HDFS's blocks.
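A dated file name makes it easier to tell successive metasave dumps apart. This is a small sketch under stated assumptions: the helper name metasave_snapshot, the HADOOP variable, and the naming scheme are hypothetical. In the Hadoop releases of this era the file is, to my knowledge, written on the NameNode host under the directory given by the hadoop.log.dir property rather than in the caller's working directory, so verify the location on your own installation.

```shell
# Path to the hadoop launcher; override HADOOP if it lives elsewhere (assumption).
HADOOP="${HADOOP:-bin/hadoop}"

# metasave_snapshot [PREFIX]
# Hypothetical helper: asks the NameNode to dump its block-management state
# into a file named PREFIX-<date>.txt and prints the file name that was used.
metasave_snapshot() {
    snapshot_name="${1:-metasave}-$(date +%Y%m%d).txt"
    # Discard the command's own output so only the file name is printed.
    "$HADOOP" dfsadmin -metasave "$snapshot_name" >/dev/null || return 1
    echo "$snapshot_name"
}
```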
Safemode: Safemode is an HDFS state in which the file system is mounted read-only; no replication is performed, nor can files be created or deleted. This is automatically entered as the NameNode starts, to allow all DataNodes time to check in with the NameNode and announce which blocks they hold, before the NameNode determines which blocks are under-replicated, etc. The NameNode waits until a specific percentage of the blocks are present and accounted-for; this is controlled in the configuration by the dfs.safemode.threshold.pct parameter. After this threshold is met, safemode is automatically exited, and HDFS allows normal operations. The bin/hadoop dfsadmin -safemode what command allows the user to manipulate safemode based on the value of what, described below:
  • enter - Enters safemode
  • leave - Forces the NameNode to exit safemode
  • get - Returns a string indicating whether safemode is ON or OFF
  • wait - Waits until safemode has exited and returns
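The get and wait values are the ones most useful in scripts, for example to hold back a job until the NameNode accepts writes. The sketch below is illustrative only: the function names and the HADOOP variable are assumptions, and safemode_is_on assumes that the string returned by get contains the word ON or OFF as described above.

```shell
# Path to the hadoop launcher; override HADOOP if it lives elsewhere (assumption).
HADOOP="${HADOOP:-bin/hadoop}"

# safemode_is_on
# Exit 0 if `-safemode get` reports ON, 1 if it reports OFF, and 2 if the
# output contains neither word (assumed output format, see lead-in).
safemode_is_on() {
    case "$("$HADOOP" dfsadmin -safemode get)" in
        *OFF*) return 1 ;;
        *ON*)  return 0 ;;
        *)     return 2 ;;
    esac
}

# run_after_safemode COMMAND [ARGS...]
# Blocks until the NameNode has left safemode, then runs the given command.
run_after_safemode() {
    "$HADOOP" dfsadmin -safemode wait || return 1
    "$@"
}
```

A startup script might then call run_after_safemode with whatever job submission command it needs, instead of polling by hand.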
Changing HDFS membership: When decommissioning nodes, it is important to disconnect nodes from HDFS gradually to ensure that data is not lost. See the section on decommissioning later in this document for an explanation of the use of the -refreshNodes dfsadmin command.
Upgrading HDFS versions: When upgrading from one version of Hadoop to the next, the file formats used by the NameNode and DataNodes may change. When you first start the new version of Hadoop on the cluster, you need to tell Hadoop to change the HDFS version (otherwise it will not mount), using the command: bin/start-dfs.sh -upgrade. Hadoop will then begin upgrading the HDFS version. The status of an ongoing upgrade operation can be queried with the bin/hadoop dfsadmin -upgradeProgress status command. More verbose information can be retrieved with bin/hadoop dfsadmin -upgradeProgress details. If the upgrade is blocked and you would like to force it to continue, use the command: bin/hadoop dfsadmin -upgradeProgress force. (Note: be sure you know what you are doing before you use this last command.)
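The upgrade commands can be wrapped so that starting the upgrade and checking on it are single steps. This is a rough sketch, not an official procedure: the function names and the HADOOP and START_DFS variables are assumptions, and the status query issued right after startup may fail if the NameNode is not yet answering requests. The force value is deliberately rejected so that it can only be run by hand.

```shell
# Paths to the Hadoop scripts; override if they live elsewhere (assumption).
HADOOP="${HADOOP:-bin/hadoop}"
START_DFS="${START_DFS:-bin/start-dfs.sh}"

# start_hdfs_upgrade
# Starts HDFS in upgrade mode, then prints the brief upgrade status.
start_hdfs_upgrade() {
    "$START_DFS" -upgrade || return 1
    "$HADOOP" dfsadmin -upgradeProgress status
}

# upgrade_progress [status|details]
# Prints upgrade progress at the requested verbosity (default: status).
# Any other value, including `force`, is refused with exit code 2.
upgrade_progress() {
    case "${1:-status}" in
        status|details)
            "$HADOOP" dfsadmin -upgradeProgress "${1:-status}" ;;
        *)
            echo "usage: upgrade_progress [status|details]" >&2
            return 2 ;;
    esac
}
```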
When HDFS is upgraded, Hadoop retains backup information allowing you to downgrade to the original HDFS version in case you need to revert Hadoop versions. To back out the changes, stop the cluster, re-install the older version of Hadoop, and then use the command: bin/start-dfs.sh -rollback. It will restore the previous HDFS state.
Only one such archival copy can be kept at a time. Thus, after a few days of operation with the new version (once it is deemed stable), the archival copy can be removed with the command bin/hadoop dfsadmin -finalizeUpgrade. The rollback command cannot be issued after this point. The upgrade must be finalized before a second Hadoop upgrade is allowed.
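Because finalizing removes the only way back to the previous version, it can be worth guarding the command against accidental use. The wrapper below is one possible sketch: the function name finalize_hdfs_upgrade, the HADOOP variable, and the "yes" confirmation argument are all hypothetical conventions, not part of Hadoop.

```shell
# Path to the hadoop launcher; override HADOOP if it lives elsewhere (assumption).
HADOOP="${HADOOP:-bin/hadoop}"

# finalize_hdfs_upgrade CONFIRMATION
# Runs `dfsadmin -finalizeUpgrade` only when called with the literal argument
# "yes", since rollback is no longer possible once the upgrade is finalized.
finalize_hdfs_upgrade() {
    if [ "${1:-}" != "yes" ]; then
        echo "refusing to finalize: pass 'yes' to confirm (rollback will no longer be possible)" >&2
        return 2
    fi
    "$HADOOP" dfsadmin -finalizeUpgrade
}
```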
Getting help: As with the dfs module, typing bin/hadoop dfsadmin -help cmd prints more detailed usage information for the command cmd.
