Backup and Disaster Recovery System for HDFS

Title: Backup and Disaster Recovery System for HDFS
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Luo, S., Wang, Y., Huang, W., Yu, H.
Conference Name: 2016 International Conference on Information Science and Security (ICISS)
Keywords: backup and disaster recovery system, Benchmark testing, cloud computing, Collaboration, composability, data communication, data retention, file system backup, gigabit Ethernet, Hadoop distributed file system, HDFS, HDFS cluster, Human Behavior, human factors, massive scale data storage, metadata, Metrics, parallel processing, performance evaluation, Policy-Governed Secure Collaboration, pubcrawl, reliability, Resiliency, Scalability, science of security, Servers, storage management

HDFS has been widely used for storing massive-scale data, which is vulnerable to site disasters. File system backup is an important strategy for data retention. In this paper, we present an efficient, easy-to-use Backup and Disaster Recovery System for HDFS. The system includes a client based on HDFS with the additional feature of remote backup, and a remote server with an HDFS cluster to keep the backup data. It supports full backup and regular incremental backup to the server with very low cost and high throughput. In our experiment, the average speed of backup and recovery is up to 95 MB/s, approaching the theoretical maximum speed of gigabit Ethernet.
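
To put the reported 95 MB/s in context, the following is a minimal arithmetic sketch rather than anything taken from the paper. It assumes the nominal gigabit Ethernet line rate of 1,000,000,000 bits per second and decimal megabytes (1 MB = 1,000,000 bytes); the abstract does not say which megabyte convention was used, and protocol overhead is ignored. The function names and the 1 TB dataset size are hypothetical, chosen only for illustration.

```python
# Assumptions (not from the paper): nominal line rate, decimal megabytes,
# no allowance for Ethernet/IP/TCP framing overhead.
GIGABIT_LINE_RATE_BITS_PER_S = 1_000_000_000
BITS_PER_BYTE = 8
BYTES_PER_MB = 1_000_000


def line_rate_mb_per_s(bits_per_s=GIGABIT_LINE_RATE_BITS_PER_S):
    """Convert a link rate in bits per second to megabytes per second."""
    return bits_per_s / BITS_PER_BYTE / BYTES_PER_MB


def link_utilization(measured_mb_per_s, bits_per_s=GIGABIT_LINE_RATE_BITS_PER_S):
    """Fraction of the nominal line rate achieved by a measured throughput."""
    return measured_mb_per_s / line_rate_mb_per_s(bits_per_s)


def transfer_hours(dataset_bytes, measured_mb_per_s):
    """Hours needed to move dataset_bytes at a sustained throughput."""
    seconds = dataset_bytes / (measured_mb_per_s * BYTES_PER_MB)
    return seconds / 3600


if __name__ == "__main__":
    reported = 95  # MB/s, the figure quoted in the abstract
    print(f"Nominal line rate: {line_rate_mb_per_s():.0f} MB/s")
    print(f"Utilization at {reported} MB/s: {link_utilization(reported):.0%}")
    # Hypothetical 1 TB (10**12 bytes) full backup.
    print(f"1 TB at {reported} MB/s: {transfer_hours(10**12, reported):.2f} hours")
```

Under these assumptions the nominal ceiling is 125 MB/s, so 95 MB/s corresponds to roughly three quarters of the raw line rate; the usable payload ceiling is somewhat lower once framing overhead is accounted for.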

Citation Key: luo_backup_2016