Distributed Metadata Management Scheme in HDFS

Bibliographic details
Title: Distributed Metadata Management Scheme in HDFS
Authors: Mrudula Varade, Vimla Jethani
Contributors: The Pennsylvania State University CiteSeerX Archives
Source: http://www.ijsrp.org/research-paper-0513/ijsrp-p1770.pdf
Collection: CiteSeerX
Keywords: Hadoop, HDFS, Distributed File System, Metadata Management, File Management System
Description: HDFS is designed to store very large data sets reliably and to stream those data sets at high bandwidth to user applications. Metadata management is critical to a distributed file system. In the HDFS architecture, a single master server manages all metadata, while a number of data servers store file data. This architecture cannot meet the exponentially increasing storage demand of cloud computing, because the single master server may become a performance bottleneck. A comparative study of three metadata management techniques is carried out: sub-tree partitioning, hashing, and consistent hashing. Of the three, consistent hashing is judged the best; it employs multiple NameNodes and divides the metadata into "buckets" that can be dynamically migrated among NameNodes according to system workload. To maintain reliability, metadata is replicated across different NameNodes using log replication, and the Paxos algorithm is adopted to keep the replicas consistent.
Publication type: text
File description: application/pdf
Language: English
Relation: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.414.7349
Availability: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.414.7349
http://www.ijsrp.org/research-paper-0513/ijsrp-p1770.pdf
Rights: Metadata may be used without restrictions as long as the oai identifier remains attached to it.
Document code: edsbas.94C954FE
Database: BASE
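
Illustration (not part of the catalogued paper): the description mentions metadata "buckets" that are spread over several NameNodes by consistent hashing and can be migrated between them. The following Python sketch shows one way such a mapping could look. The class name BucketRing, the bucket count, the virtual-node count, and the use of MD5 for placement are all assumptions made for this example; the record does not say how the authors implement the scheme, and log replication and Paxos are not modelled here.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash (the built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class BucketRing:
    """Hypothetical sketch: file paths map to buckets, buckets map to NameNodes."""

    def __init__(self, namenodes, num_buckets=64, vnodes=16):
        self.num_buckets = num_buckets  # assumed fixed number of metadata buckets
        self.vnodes = vnodes            # assumed virtual nodes per NameNode
        self._ring = []                 # sorted list of (ring position, namenode)
        self._overrides = {}            # bucket id -> namenode after a manual migration
        for node in namenodes:
            self.add_namenode(node)

    def add_namenode(self, node):
        # Place several virtual points per NameNode to even out the load.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (stable_hash(f"{node}#{i}"), node))

    def remove_namenode(self, node):
        # Drop the node's ring points and any migrations that targeted it.
        self._ring = [(pos, n) for pos, n in self._ring if n != node]
        self._overrides = {b: n for b, n in self._overrides.items() if n != node}

    def bucket_of(self, path):
        # A file's metadata always lands in the same bucket.
        return stable_hash(path) % self.num_buckets

    def owner_of_bucket(self, bucket):
        if bucket in self._overrides:
            return self._overrides[bucket]
        if not self._ring:
            raise RuntimeError("no NameNodes registered")
        pos = stable_hash(f"bucket-{bucket}")
        # First ring point at or after the bucket's position, wrapping around.
        idx = bisect.bisect_left(self._ring, (pos,)) % len(self._ring)
        return self._ring[idx][1]

    def migrate_bucket(self, bucket, node):
        # Explicitly move one bucket, e.g. away from an overloaded NameNode.
        if all(n != node for _, n in self._ring):
            raise ValueError(f"unknown NameNode: {node}")
        self._overrides[bucket] = node

    def locate(self, path):
        # NameNode currently responsible for the metadata of `path`.
        return self.owner_of_bucket(self.bucket_of(path))
```

With a ring built as BucketRing(["nn1", "nn2", "nn3"]), calling locate("/user/data/part-0000") returns the NameNode that would hold that file's metadata. Removing a NameNode reassigns only the buckets it owned, which is the property that makes consistent hashing attractive compared with plain modulo hashing over the number of servers.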