Adaptive memory reservation strategy for heavy workloads in the Spark environment.
| Title: | Adaptive memory reservation strategy for heavy workloads in the Spark environment. |
|---|---|
| Authors: | Li, Bohan; He, Xin; Yu, Junyang; Wang, Guanghui; Song, Yixin; Pan, Shunjie; Gu, Hangyu |
| Source: | PeerJ Computer Science; Nov2024, p1-28, 28p |
| Subjects: | DISTRIBUTED computing, PARALLEL programming, INTERNET of things, PARALLEL processing, EVICTION |
| Abstract: | The rise of the Internet of Things (IoT) and Industry 2.0 has spurred a growing need for extensive data computing, and Spark emerged as a promising Big Data platform, attributed to its distributed in-memory computing capabilities. However, practical heavy workloads often lead to memory bottleneck issues in the Spark platform. This results in resilient distributed datasets (RDD) eviction and, in extreme cases, violent memory contentions, causing a significant degradation in Spark computational efficiency. To tackle this issue, we propose an adaptive memory reservation (AMR) strategy in this article, specifically designed for heavy workloads in the Spark environment. Specifically, we model optimal task parallelism by minimizing the disparity between the number of tasks completed without blocking and the number completed in regular rounds. Optimal memory for task parallelism is determined to establish an efficient execution memory space for computational parallelism. Subsequently, through adaptive execution memory reservation and dynamic adjustments, such as compression or expansion based on task progress, the strategy ensures dynamic task parallelism in the Spark parallel computing process. Considering the cost of RDD cache location and real-time memory space usage, we select suitable storage locations for different RDD types to alleviate execution memory pressure. Finally, we conduct extensive laboratory experiments to validate the effectiveness of AMR. Results indicate that, compared to existing memory management solutions, AMR reduces the execution time by approximately 46.8%. [ABSTRACT FROM AUTHOR] |
| Copyright of PeerJ Computer Science is the property of PeerJ Inc. and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.) | |
| Database: | Complementary Index |
| DOI: | 10.7717/peerj-cs.2460 |
| ISSN (print): | 2376-5992 |
| Language: | English |
| Publication type: | Academic Journal |
| Accession number: | 181524311 |
| Permalink: | https://erproxy.cvtisr.sk/sfx/access?url=https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edb&AN=181524311 |