Dynamic Resource Allocation for Efficient Parallel Data Processing Using RMI Protocol

Gespeichert in:
Bibliographische Detailangaben
Titel: Dynamic Resource Allocation for Efficient Parallel Data Processing Using RMI Protocol
Autoren: Sushma K S, Vinay Kumar V
Weitere Verfasser: The Pennsylvania State University CiteSeerX Archives
Quelle: http://www.ijeat.org/attachments/File/v2i5/E1886062513.pdf.
Bestand: CiteSeerX
Schlagwörter: Many-Task Computing, High-Throughput Computing, Loosely Coupled Applications, Cloud Computing
Beschreibung: In recent years ad-hoc parallel data processing has emerged to be one of the killer applications for Infrastructure-as-a-Service (IaaS) clouds. Major Cloud computing companies have started to integrate frameworks for parallel data processing in their product portfolio, making it easy for customers to access these services and to deploy their programs. the processing frameworks which are currently used have been designed for static, homogeneous cluster setups and disregard the particular nature of a cloud. Consequently, the allocated compute resources may be inadequate for big parts of the submitted job and unnecessarily increase processing time and cost. In this paper we discuss the opportunities and challenges for efficient parallel data processing in clouds and present our research project Nephele. Nephele is the first data processing framework to explicitly exploit the dynamic resource allocation offered by today’s IaaS clouds for both, task scheduling and execution. In this paper we discuss the opportunities and challenges for efficient parallel data processing Particular tasks of a processing job can be assigned to different types of virtual machines which are automatically instantiated and terminated during the job execution. Based on this new framework, we perform extended evaluations of MapReduce-inspired processing jobs on an IaaS cloud system and compare the results to the popular data processing framework Hadoop.
Publikationsart: text
Dateibeschreibung: application/pdf
Sprache: English
Relation: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.686.7230; http://www.ijeat.org/attachments/File/v2i5/E1886062513.pdf
Verfügbarkeit: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.686.7230
http://www.ijeat.org/attachments/File/v2i5/E1886062513.pdf
Rights: Metadata may be used without restrictions as long as the oai identifier remains attached to it.
Dokumentencode: edsbas.39E3F30F
Datenbank: BASE
Beschreibung
Abstract:In recent years ad-hoc parallel data processing has emerged to be one of the killer applications for Infrastructure-as-a-Service (IaaS) clouds. Major Cloud computing companies have started to integrate frameworks for parallel data processing in their product portfolio, making it easy for customers to access these services and to deploy their programs. the processing frameworks which are currently used have been designed for static, homogeneous cluster setups and disregard the particular nature of a cloud. Consequently, the allocated compute resources may be inadequate for big parts of the submitted job and unnecessarily increase processing time and cost. In this paper we discuss the opportunities and challenges for efficient parallel data processing in clouds and present our research project Nephele. Nephele is the first data processing framework to explicitly exploit the dynamic resource allocation offered by today’s IaaS clouds for both, task scheduling and execution. In this paper we discuss the opportunities and challenges for efficient parallel data processing Particular tasks of a processing job can be assigned to different types of virtual machines which are automatically instantiated and terminated during the job execution. Based on this new framework, we perform extended evaluations of MapReduce-inspired processing jobs on an IaaS cloud system and compare the results to the popular data processing framework Hadoop.