Hadoop Distributed File System Write Operations

Bibliographic Details
Title: Hadoop Distributed File System Write Operations
Authors: Renukadevi Chuppala
Source: International Journal of Intelligent Systems and Applications in Engineering; Vol. 12 No. 23s (2024); 720 – 731
Publisher information: International Journal of Intelligent Systems and Applications in Engineering, 2024.
Publication year: 2024
Keywords: Hadoop Distributed File System (HDFS), NameNode, DataNode, Replica, Rack awareness, Data Packet, Data Packet Transfer Time, Pipeline, Fully Connected Digraph Network Topology, A* algorithm
Description: Hadoop is an open-source implementation of the MapReduce framework for distributed processing. A Hadoop cluster can manage substantial volumes of data, and it uses the Hadoop Distributed File System (HDFS) to store them. To write a file, the client retrieves block information from the NameNode and then transfers the data to the DataNodes. The DataNodes that store the replicas of a block are connected in a pipeline. If a DataNode or a network link fails during the write, the failed DataNode is removed from the pipeline and a replacement is added from the DataNodes still available in the cluster. If the cluster has few spare nodes, clients may encounter an abnormally high frequency of pipeline failures because no additional or replacement DataNode can be found. In the event of a network failure, a data packet cannot reach the target DataNode, because in a pipeline each DataNode is reachable only through its predecessor. Connecting each DataNode to every other DataNode provides multiple paths through the other DataNodes, so a single link failure no longer blocks delivery. The copy operation also takes longer over a pipeline, because each data packet must traverse all preceding DataNodes to reach the final DataNode; a direct connection between a DataNode and all other DataNodes significantly reduces the time required. This paper presents the use of the A* algorithm to improve the performance of write operations in the Hadoop Distributed File System.
Publication type: Article
File description: application/pdf
Language: English
ISSN: 2147-6799
Access URL: https://www.ijisae.org/index.php/IJISAE/article/view/6987
Rights: CC BY SA
Document code: edsair.issn21476799..846b6d970bcb19a1a3c97fa010dae56b
Database: OpenAIRE
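
Illustration: the description above contrasts a linear write pipeline with a fully connected digraph of DataNodes and names the A* algorithm as the routing method. This record does not reproduce the paper's actual algorithm, so the following is only a minimal sketch of the general idea, assuming that edge weights are data packet transfer times. The node names, the transfer times, the helper functions, and the heuristic (the cheapest link into the target, which never overestimates the remaining cost) are all hypothetical and are not taken from the paper or from the HDFS code base.

```python
import heapq

# Hypothetical fully connected digraph: edge weight = assumed packet transfer time (ms).
GRAPH = {
    "client": {"dn1": 4.0, "dn2": 6.0, "dn3": 9.0},
    "dn1": {"dn2": 3.0, "dn3": 7.0},
    "dn2": {"dn1": 3.0, "dn3": 2.0},
    "dn3": {"dn1": 7.0, "dn2": 2.0},
}

def make_heuristic(graph, goal):
    """Assumed admissible heuristic: any path into `goal` costs at least
    as much as the cheapest single link that ends at `goal`."""
    incoming = [w for nbrs in graph.values() for n, w in nbrs.items() if n == goal]
    floor = min(incoming, default=0.0)
    return lambda node: 0.0 if node == goal else floor

def a_star(graph, start, goal):
    """Return (total_time, path) for the fastest route, or (inf, []) if unreachable."""
    h = make_heuristic(graph, goal)
    open_heap = [(h(start), 0.0, start, [start])]  # (f = g + h, g, node, path)
    best = {start: 0.0}
    while open_heap:
        _, cost, node, path = heapq.heappop(open_heap)
        if node == goal:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry; a cheaper route to this node was found
        for nxt, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(open_heap, (new_cost + h(nxt), new_cost, nxt, path + [nxt]))
    return float("inf"), []

def without_link(graph, src, dst):
    """Copy of `graph` with the directed link src -> dst removed (simulated failure)."""
    return {u: {v: w for v, w in nbrs.items() if (u, v) != (src, dst)}
            for u, nbrs in graph.items()}

if __name__ == "__main__":
    print(a_star(GRAPH, "client", "dn3"))
    print(a_star(without_link(GRAPH, "dn2", "dn3"), "client", "dn3"))
```

With these assumed weights, the search routes the packet through dn2 while every link is up and falls back to the direct client-to-dn3 link once the dn2-to-dn3 link is removed, which is the kind of alternative path a strictly linear pipeline cannot offer.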