Friday 9 August 2013

Recovery of deleted files in Hadoop

Sometimes we accidentally delete files that we still need from Hadoop. In the worst case the entire file system may get deleted. The steps below may help you recover.
This recovery method requires trash to be enabled in HDFS. Trash is enabled by setting the property fs.trash.interval to a value greater than 0. Its value is the number of minutes after which a trash checkpoint gets deleted. The default value is zero, which disables the trash feature. We have to set this property in core-site.xml.
<property>
  <name>fs.trash.interval</name>
  <value>30</value>
  <description>Number of minutes after which the checkpoint
  gets deleted.
  If zero, the trash feature is disabled.
  </description>
</property>
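
To check whether trash is already enabled, you can read the value back from the configuration file. The following is a minimal sketch in Python, assuming you pass it the text of a core-site.xml and that the value is a whole number of minutes. The function name and the sample text are only for illustration, and the sketch looks at a single file, so it will not see values set anywhere else.

```python
import xml.etree.ElementTree as ET


def trash_interval_minutes(core_site_xml: str) -> int:
    """Return fs.trash.interval from the text of a core-site.xml, or 0 if it is not set."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        if name == "fs.trash.interval":
            # Assumption: the value is written as a whole number of minutes.
            return int(prop.findtext("value", default="0").strip())
    return 0  # property absent: the default of zero, so trash is disabled


# Sample configuration text matching the property shown above.
SAMPLE = """<configuration>
  <property>
    <name>fs.trash.interval</name>
    <value>30</value>
  </property>
</configuration>"""

minutes = trash_interval_minutes(SAMPLE)
print("trash enabled" if minutes > 0 else "trash disabled", minutes)
```

On a real node you would read the file from your Hadoop configuration directory and pass its contents to the function.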

There is one more property related to the one above, called fs.trash.checkpoint.interval. It is the number of minutes between trash checkpoints, and it should be smaller than or equal to fs.trash.interval. Every time the checkpointer runs, it creates a new checkpoint out of the current trash contents and removes checkpoints created more than fs.trash.interval minutes ago. The default value of this property is zero, in which case the value of fs.trash.interval is used instead.


<property>
  <name>fs.trash.checkpoint.interval</name>
  <value>15</value>
  <description>Number of minutes between trash checkpoints.
  Should be smaller or equal to fs.trash.interval.
  Every time the checkpointer runs it creates a new checkpoint
  out of current and removes checkpoints created more than
  fs.trash.interval minutes ago.
  </description>
</property>
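
To see how the two intervals work together, here is a rough timing sketch in Python. It is a simplified model, not Hadoop's actual code. It assumes the checkpointer runs at exact multiples of the checkpoint interval and removes a checkpoint on the first run at which it is at least fs.trash.interval minutes old. The function name and the minute-based clock are made up for illustration.

```python
import math


def purge_minute(deleted_at, trash_interval=30, checkpoint_interval=15):
    """Minute at which a file deleted at minute `deleted_at` leaves the trash (simplified model)."""
    if trash_interval <= 0:
        return deleted_at  # trash disabled: the file is gone immediately
    if checkpoint_interval <= 0:
        checkpoint_interval = trash_interval  # zero falls back to fs.trash.interval
    # First checkpointer run at or after the delete: current trash becomes a checkpoint.
    checkpointed_at = math.ceil(deleted_at / checkpoint_interval) * checkpoint_interval
    # Number of further runs until that checkpoint is at least trash_interval old.
    runs_to_wait = math.ceil(trash_interval / checkpoint_interval)
    return checkpointed_at + runs_to_wait * checkpoint_interval


# With the values used above (30 and 15), a file deleted at minute 7 is
# checkpointed at minute 15 and removed at minute 45 in this model.
print(purge_minute(7))
```

In this model a deleted file stays recoverable for at least fs.trash.interval minutes, plus the time it waited for its first checkpoint.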

If the above properties are enabled in your cluster, the deleted files will be present in the .Trash directory of HDFS, under the home directory of the user who deleted them. At the next checkpoint the files move from the current trash into a checkpoint directory, and they are removed permanently once that checkpoint is more than fs.trash.interval minutes old. So recover the files before their checkpoint expires. If trash is not enabled in your cluster, you can enable it for future recovery.
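
Recovering a file means moving it from the trash back to its old location with hadoop fs -mv. The following Python sketch builds that command. It assumes the default layout, where the trash lives under /user/<username>/.Trash and mirrors the original absolute path, with the most recent deletes under Current. The user name, the file path and the helper names are hypothetical.

```python
import posixpath
import subprocess


def trash_path(user, original_path, checkpoint="Current"):
    """Expected location inside the trash of a file that `user` deleted."""
    # Assumption: trash mirrors the original path under /user/<user>/.Trash/<checkpoint>/
    return posixpath.join("/user", user, ".Trash", checkpoint, original_path.lstrip("/"))


def restore_command(user, original_path, checkpoint="Current"):
    """Build the `hadoop fs -mv` call that moves the file back to where it was."""
    return ["hadoop", "fs", "-mv", trash_path(user, original_path, checkpoint), original_path]


def restore(user, original_path, checkpoint="Current"):
    """Run the move; needs a configured Hadoop client on this machine."""
    subprocess.run(restore_command(user, original_path, checkpoint), check=True)


if __name__ == "__main__":
    # Hypothetical user and path; only prints the command, does not run it.
    print(" ".join(restore_command("alice", "/data/reports/2013.csv")))
```

If the file is no longer under Current, list the .Trash directory with hadoop fs -ls and pass the name of the checkpoint directory that holds the file as the checkpoint argument.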
