Dealing with Small Files Problem in Hadoop Distributed File System

Detailed Bibliography
Published in: Procedia Computer Science, Volume 79, pp. 1001-1012
Main authors: Bende, Sachin; Shedge, Rajashree
Format: Journal Article
Language: English
Published: Elsevier B.V., 2016
ISSN: 1877-0509
Description
Summary: The usage of Hadoop has been increasing greatly in recent years, and its adoption is widespread. Notable large-scale users such as Yahoo, Facebook, Netflix, and Amazon use Hadoop mainly for unstructured data analysis, as the Hadoop framework works well with both structured and unstructured data. The Hadoop Distributed File System (HDFS) is designed for storing large files, but when a large number of small files must be stored, HDFS runs into problems because all the files in HDFS are managed by a single server (the NameNode). Various methods have been proposed to deal with the small files problem in HDFS. This paper gives a comparative analysis of methods that address the small files problem in HDFS.
DOI: 10.1016/j.procs.2016.03.127
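
Note: the abstract above points to the NameNode metadata bottleneck as the root of the small files problem. One mitigation commonly discussed in this literature is to pack many small files into a single container file, such as a Hadoop SequenceFile, so that the NameNode tracks one large file instead of thousands of small ones. The following is a minimal sketch, not taken from the paper, assuming Hadoop's standard Java API; the input and output paths are hypothetical.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SmallFilePacker {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path inputDir = new Path("/data/small-files"); // hypothetical source directory
        Path packed = new Path("/data/packed.seq");    // hypothetical output file

        // One SequenceFile entry per small file: key = file name, value = raw bytes.
        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(packed),
                SequenceFile.Writer.keyClass(Text.class),
                SequenceFile.Writer.valueClass(BytesWritable.class))) {
            for (FileStatus status : fs.listStatus(inputDir)) {
                if (!status.isFile()) {
                    continue; // skip subdirectories
                }
                byte[] contents = new byte[(int) status.getLen()];
                try (FSDataInputStream in = fs.open(status.getPath())) {
                    in.readFully(contents); // read the whole small file into memory
                }
                writer.append(new Text(status.getPath().getName()),
                              new BytesWritable(contents));
            }
        }
    }
}

Hadoop's built-in archive tool (the hadoop archive command, producing HAR files) offers a similar consolidation at the command line; both approaches trade per-file random access for a much smaller NameNode memory footprint.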