Distributed Metadata Management Scheme in HDFS

Bibliographic Details
Title: Distributed Metadata Management Scheme in HDFS
Authors: Mrudula Varade, Vimla Jethani
Contributors: The Pennsylvania State University CiteSeerX Archives
Source: http://www.ijsrp.org/research-paper-0513/ijsrp-p1770.pdf.
Collection: CiteSeerX
Subject Terms: Hadoop, HDFS, Distributed File System, Metadata Management, File Management System
Description: The Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably and to stream those data sets at high bandwidth to user applications. Metadata management is critical to a distributed file system. In the HDFS architecture, a single master server manages all metadata, while a number of data servers store file data. This architecture cannot meet the exponentially increasing storage demand of cloud computing, because the single master server may become a performance bottleneck. The paper presents a comparative study of three metadata management techniques: sub-tree partitioning, hashing, and consistent hashing. Of the three, consistent hashing is judged the best technique: it employs multiple NameNodes and divides the metadata into “buckets” that can be dynamically migrated among NameNodes according to system workload. To maintain reliability, metadata is replicated on different NameNodes using log replication, and the Paxos algorithm is adopted to keep the replicas consistent.
Document Type: text
File Description: application/pdf
Language: English
Relation: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.414.7349
Availability: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.414.7349
http://www.ijsrp.org/research-paper-0513/ijsrp-p1770.pdf
Rights: Metadata may be used without restrictions as long as the oai identifier remains attached to it.
Accession Number: edsbas.94C954FE
Database: BASE
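
The bucket-based consistent hashing summarized in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `BucketedMetadataMap`, the bucket count, the virtual-node count, and the SHA-256-based hash are all assumptions made for the example, and log replication and Paxos are left out entirely. The sketch only shows how file paths might map to buckets, how buckets might map to NameNodes on a hash ring, and how a bucket could be reassigned when workload shifts.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class BucketedMetadataMap:
    """Hypothetical helper: maps paths to buckets and buckets to NameNodes."""

    def __init__(self, namenodes, num_buckets=64, vnodes=16):
        self.num_buckets = num_buckets  # assumed fixed number of metadata buckets
        self.vnodes = vnodes            # assumed virtual nodes per NameNode
        self.ring = []                  # sorted list of (ring position, namenode)
        self.overrides = {}             # bucket id -> namenode, set by migration
        for namenode in namenodes:
            self.add_namenode(namenode)

    def add_namenode(self, namenode):
        # Each NameNode occupies several ring positions to even out the load.
        for i in range(self.vnodes):
            bisect.insort(self.ring, (stable_hash(f"{namenode}#{i}"), namenode))

    def remove_namenode(self, namenode):
        self.ring = [(pos, nn) for pos, nn in self.ring if nn != namenode]
        self.overrides = {b: nn for b, nn in self.overrides.items() if nn != namenode}

    def bucket_of(self, path):
        # A path always lands in the same bucket, whatever the cluster looks like.
        return stable_hash(path) % self.num_buckets

    def owner_of_bucket(self, bucket):
        if bucket in self.overrides:
            return self.overrides[bucket]
        if not self.ring:
            raise RuntimeError("no NameNodes registered")
        pos = stable_hash(f"bucket-{bucket}")
        # First ring entry at or after the bucket's position, wrapping around.
        idx = bisect.bisect_left(self.ring, (pos, "")) % len(self.ring)
        return self.ring[idx][1]

    def migrate_bucket(self, bucket, target):
        # Explicit reassignment, e.g. to relieve an overloaded NameNode.
        if target not in {nn for _, nn in self.ring}:
            raise ValueError(f"unknown NameNode: {target}")
        self.overrides[bucket] = target

    def locate(self, path):
        return self.owner_of_bucket(self.bucket_of(path))


if __name__ == "__main__":
    cluster = BucketedMetadataMap(["nn1", "nn2", "nn3"])
    path = "/user/alice/data.csv"
    print(path, "->", cluster.locate(path))
    cluster.migrate_bucket(cluster.bucket_of(path), "nn3")
    print(path, "->", cluster.locate(path))
```

In this sketch, adding a NameNode changes the owner only of buckets that the new node takes over, which is the property that makes consistent hashing attractive compared with plain modulo hashing over the NameNode count.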