PreSto: An In-Storage Data Preprocessing System for Training Recommendation Models
Training recommendation systems (RecSys) faces several challenges as it requires the "data preprocessing" stage to preprocess an ample amount of raw data and feed them to the GPU for training in a seamless manner. To sustain high training throughput, state-of-the-art solutions reserve a la...
| Published in: | 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), pp. 340-353 |
|---|---|
| Main authors: | Lee, Yunjae; Kim, Hyeseong; Rhu, Minsoo |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 29 June 2024 |
| Subjects: | |
| Abstract | Training recommendation systems (RecSys) faces several challenges as it requires the "data preprocessing" stage to preprocess an ample amount of raw data and feed them to the GPU for training in a seamless manner. To sustain high training throughput, state-of-the-art solutions reserve a large fleet of CPU servers for preprocessing which incurs substantial deployment cost and power consumption. Our characterization reveals that prior CPU-centric preprocessing is bottlenecked on feature generation and feature normalization operations as it fails to reap out the abundant inter-/intra-feature parallelism in RecSys preprocessing. PreSto is a storage-centric preprocessing system leveraging In-Storage Processing (ISP), which offloads the bottlenecked preprocessing operations to our ISP units. We show that PreSto outperforms the baseline CPU-centric system with a 9.6× speedup in end-to-end preprocessing time, 4.3× enhancement in cost-efficiency, and 11.3× improvement in energy-efficiency on average for production-scale RecSys preprocessing. |
|---|---|
| Author | Kim, Hyeseong; Lee, Yunjae; Rhu, Minsoo |
| Author_xml | – sequence: 1 givenname: Yunjae surname: Lee fullname: Lee, Yunjae email: yunjae408@kaist.ac.kr organization: KAIST,School of Electrical Engineering – sequence: 2 givenname: Hyeseong surname: Kim fullname: Kim, Hyeseong email: hyeseong.kim@kaist.ac.kr organization: KAIST,School of Electrical Engineering – sequence: 3 givenname: Minsoo surname: Rhu fullname: Rhu, Minsoo email: mrhu@kaist.ac.kr organization: KAIST,School of Electrical Engineering |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IH CBEJK RIE RIO |
| DOI | 10.1109/ISCA59077.2024.00033 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan (POP) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Xplore IEEE Proceedings Order Plans (POP) 1998-present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| EISBN | 9798350326581 |
| EndPage | 353 |
| ExternalDocumentID | 10609715 |
| Genre | orig-research |
| GrantInformation_xml | – fundername: National Research Foundation funderid: 10.13039/501100001321 – fundername: SK Hynix funderid: 10.13039/100018058 – fundername: IC Design Education Center funderid: 10.13039/501100003836 |
| GroupedDBID | 6IE 6IH ACM ALMA_UNASSIGNED_HOLDINGS CBEJK RIE RIO |
| ID | FETCH-LOGICAL-a258t-1b66fc1d0467aae42dcd4400b365aca0c00db85a203f47fb6cafae912c2c32813 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 1 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001290320700023&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| IngestDate | Wed Aug 27 02:35:15 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-a258t-1b66fc1d0467aae42dcd4400b365aca0c00db85a203f47fb6cafae912c2c32813 |
| PageCount | 14 |
| ParticipantIDs | ieee_primary_10609715 |
| PublicationCentury | 2000 |
| PublicationDate | 2024-June-29 |
| PublicationDateYYYYMMDD | 2024-06-29 |
| PublicationDate_xml | – month: 06 year: 2024 text: 2024-June-29 day: 29 |
| PublicationDecade | 2020 |
| PublicationTitle | 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) |
| PublicationTitleAbbrev | ISCA |
| PublicationYear | 2024 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SSID | ssib060973785 |
| Score | 2.3007753 |
| Snippet | Training recommendation systems (RecSys) faces several challenges as it requires the "data preprocessing" stage to preprocess an ample amount of raw data and... |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 340 |
| SubjectTerms | computational storage device; Costs; Data preprocessing; Graphics processing units; near data processing; neural network; Parallel processing; Power demand; Recommendation system; Throughput; Training |
| Title | PreSto: An In-Storage Data Preprocessing System for Training Recommendation Models |
| URI | https://ieeexplore.ieee.org/document/10609715 |
| WOSCitedRecordID | wos001290320700023&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |