A Review of Data Placement and Replication Strategies Based on Machine Learning
| Published in: | Proceedings - International Conference on Parallel and Distributed Systems, pp. 278-285 |
|---|---|
| Main authors: | Najjar, Amir; Mokadem, Riad; Pierson, Jean-Marc |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 10 October 2024 |
| ISSN: | 2690-5965 |
| Online access: | Get full text |
| Abstract | The global increase in data volumes has brought forth the need for scalable distributed systems that can provide satisfactory quality of service. Data placement and replication are well-known techniques that provide increased performance, improved fault tolerance and higher availability. These techniques often require threshold-based activation mechanisms that can vary due to the nature of the workload and the underlying system architecture. Hence, setting and adjusting those thresholds usually require human intervention. In this context, machine learning offers a promising way to define such thresholds automatically and adapt them to different workloads and architectures. In this paper, we study the data placement and replication strategies proposed in the literature that employ machine learning. We classify such strategies based on the machine learning method, the platform on which they are deployed, their dynamicity and the achieved objectives. We describe the approach applied by each strategy as well as possible limitations. In addition, we provide insights into the metrics used to evaluate the strategies. We highlight the need to design data placement and replication strategies that respond better to the needs of modern distributed systems. We also motivate the use of machine learning to achieve autonomy in distributed systems. |
|---|---|
| Author | Mokadem, Riad; Pierson, Jean-Marc; Najjar, Amir |
| Author_xml | 1. Najjar, Amir (amir.najjar@irit.fr), Université de Toulouse, Institut de Recherche en Informatique de Toulouse (IRIT), Toulouse, France; 2. Mokadem, Riad (riad.mokadem@irit.fr), Université de Toulouse, Institut de Recherche en Informatique de Toulouse (IRIT), Toulouse, France; 3. Pierson, Jean-Marc (jean-marc.pierson@irit.fr), Université de Toulouse, Institut de Recherche en Informatique de Toulouse (IRIT), Toulouse, France |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1109/ICPADS63350.2024.00044 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Xplore POP ALL; IEEE Xplore All Conference Proceedings; IEEE Xplore (IEEE/IET Electronic Library - IEL); IEEE Proceedings Order Plans (POP All) 1998-Present |
| Discipline | Computer Science |
| EISBN | 9798331515966 |
| EISSN | 2690-5965 |
| EndPage | 285 |
| ExternalDocumentID | 10763884 |
| Genre | orig-research |
| ISICitedReferencesCount | 1 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 8 |
| PublicationCentury | 2000 |
| PublicationDate | 2024-Oct.-10 |
| PublicationDateYYYYMMDD | 2024-10-10 |
| PublicationDecade | 2020 |
| PublicationTitle | Proceedings - International Conference on Parallel and Distributed Systems |
| PublicationTitleAbbrev | ICPADS |
| PublicationYear | 2024 |
| Publisher | IEEE |
| StartPage | 278 |
| SubjectTerms | Costs; Data Placement; Data Replication; Distributed databases; Distributed Systems; Fault tolerance; Fault tolerant systems; Machine Learning; Quality of service; Reinforcement learning; Taxonomy; Time factors; Tuning; Unsupervised learning |
| Title | A Review of Data Placement and Replication Strategies Based on Machine Learning |
| URI | https://ieeexplore.ieee.org/document/10763884 |