Hadoop Vs Spark: An Integrated Big Data Analytics Architecture For Distributed Computing Frameworks.

Bibliographic details
Title: Hadoop Vs Spark: An Integrated Big Data Analytics Architecture For Distributed Computing Frameworks.
Authors: Reddy, S. R. V. Prasad, Parthiban, K., Ramadass, Rajendrakumar, Kishore, Kakarla Hari
Source: International Journal of Environmental Sciences (2229-7359); 2025 Special Issue, Vol. 11, p477-488, 12p
Subjects: BIG data, DISTRIBUTED computing, DATA analytics, ELECTRONIC data processing, SOFTWARE frameworks, CLOUD computing
Abstract: In the digital age, decision-makers may now access enormous amounts of data. The phrase "big data" refers to databases that are difficult to handle using traditional tools and techniques due to their size, diversity, and dynamic nature. The last few years have seen an increase in the generation of data from multiple sources due to the introduction of cloud computing technology. Today's data processing equipment must be able to handle the enormous volumes of newly created information. For the DS & BDA sector in SC & L, we first propose a novel method of systematic review in this research. In order to classify the existing models and approaches, arrange their areas of practical usage, identify research needs, and propose potential future research methods, we then use the recommended methodology for an organized literature review on DS & BDA methods in the SC & L fields. The Hadoop framework rose to prominence with its dispersed file structure and MapReduce programming methodology. On the other hand, Spark is a newly created framework for big data management and analysis that allows you to investigate an infinite number of the underlying properties of large data. The research compares Hadoop MapReduce and Spark in terms of operating principles, effectiveness, expense, simplicity of use, connectivity, data mining, disaster tolerance, and hygiene. Experimental observations of Hadoop MapReduce and Spark's performance have been made to ascertain their suitability for use in a variety of distributed computing scenarios. [ABSTRACT FROM AUTHOR]
Copyright of International Journal of Environmental Sciences (2229-7359) is the property of Academic Science Publications & Distributions (ASPD) and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Complementary Index
Description
ISSN: 2229-7359