A Transducers-based Programming Framework for Efficient Data Transformation
| Published in: | 2024 33rd International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 66-77 |
|---|---|
| Main authors: | Nguyen, Tri; Becchi, Michela |
| Format: | Conference proceeding |
| Language: | English |
| Published: | ACM, 13 October 2024 |
| Online access: | Full text |
| Abstract | Many data analytics and scientific applications rely on data transformation tasks, such as encoding, decoding, parsing of structured and unstructured data, and conversions between data formats and layouts. Previous work has shown that data transformation can represent a performance bottleneck for data analytics workloads. The transducers computational abstraction can be used to express a wide range of data transformations, and recent efforts have proposed configurable engines implementing various transducer models (from finite state transducers, to pushdown transducers, to extended models). This line of research, however, is still at an early stage. Notably, expressing data transformation using transducers requires a paradigm shift, impacting programmability. To address this problem, we propose a programming framework to map data transformation tasks onto a variety of transducer models. Our framework includes: (1) a platform agnostic programming language (xPTLang) to code transducer programs using intuitive programming constructs, and (2) a compiler that, given an xPTLang program, generates efficient transducer processing engines for CPU and GPU. Our compiler includes a set of optimizations to improve code efficiency. We demonstrate our framework on a diverse set of data transformation tasks on an Intel CPU and an Nvidia GPU. |
|---|---|
| Author | Becchi, Michela; Nguyen, Tri |
| Author_xml | 1. Nguyen, Tri (tmnguye7@ncsu.edu), North Carolina State University, United States of America; 2. Becchi, Michela (mbecchi@ncsu.edu), North Carolina State University, United States of America |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.1145/3656019.3676891 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Xplore POP ALL IEEE Xplore All Conference Proceedings IEEE/IET Electronic Library IEEE Proceedings Order Plans (POP All) 1998-Present |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Xplore url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| EISBN | 9798400706318 |
| EndPage | 77 |
| ExternalDocumentID | 10807327 |
| Genre | orig-research |
| GrantInformation_xml | – fundername: National Science Foundation grantid: CCF-1907863 funderid: 10.13039/100000001 |
| ISICitedReferencesCount | 0 |
| IngestDate | Wed Jan 08 06:10:43 EST 2025 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 12 |
| ParticipantIDs | ieee_primary_10807327 |
| PublicationCentury | 2000 |
| PublicationDate | 2024-Oct.-13 |
| PublicationDateYYYYMMDD | 2024-10-13 |
| PublicationDate_xml | – month: 10 year: 2024 text: 2024-Oct.-13 day: 13 |
| PublicationDecade | 2020 |
| PublicationTitle | 2024 33rd International Conference on Parallel Architectures and Compilation Techniques (PACT) |
| PublicationTitleAbbrev | PACT |
| PublicationYear | 2024 |
| Publisher | ACM |
| Publisher_xml | – name: ACM |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 66 |
| SubjectTerms | Codes; Computational modeling; Computer languages; Data analysis; Data models; Engines; Graphics processing units; Optimization; Programming; Transducers |
| Title | A Transducers-based Programming Framework for Efficient Data Transformation |
| URI | https://ieeexplore.ieee.org/document/10807327 |