A Transducers-based Programming Framework for Efficient Data Transformation

Many data analytics and scientific applications rely on data transformation tasks, such as encoding, decoding, parsing of structured and unstructured data, and conversions between data formats and layouts. Previous work has shown that data transformation can represent a performance bottleneck for da...

Bibliographic details
Published in: 2024 33rd International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 66-77
Main authors: Nguyen, Tri; Becchi, Michela
Format: Conference paper
Language: English
Published: ACM, 13 October 2024
Subjects:
Online access: Get full text
Abstract Many data analytics and scientific applications rely on data transformation tasks, such as encoding, decoding, parsing of structured and unstructured data, and conversions between data formats and layouts. Previous work has shown that data transformation can represent a performance bottleneck for data analytics workloads. The transducers computational abstraction can be used to express a wide range of data transformations, and recent efforts have proposed configurable engines implementing various transducer models (from finite state transducers, to pushdown transducers, to extended models). This line of research, however, is still at an early stage. Notably, expressing data transformation using transducers requires a paradigm shift, impacting programmability. To address this problem, we propose a programming framework to map data transformation tasks onto a variety of transducer models. Our framework includes: (1) a platform-agnostic programming language (xPTLang) to code transducer programs using intuitive programming constructs, and (2) a compiler that, given an xPTLang program, generates efficient transducer processing engines for CPU and GPU. Our compiler includes a set of optimizations to improve code efficiency. We demonstrate our framework on a diverse set of data transformation tasks on an Intel CPU and an Nvidia GPU.
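As a rough illustration of the finite state transducer model the abstract mentions, the sketch below shows a table-driven transducer in Python. It is not xPTLang and does not reflect the authors' compiler or engines; the state names, the `TRANSITIONS` table, and the `run_fst` function are hypothetical, chosen only to show how a data transformation (here, rewriting unquoted commas as tabs) can be expressed as state transitions that each emit output.

```python
# Hypothetical two-state finite state transducer (illustration only,
# unrelated to xPTLang): rewrites commas outside quotes as tabs and
# drops the quote characters themselves.
FIELD, QUOTED = "FIELD", "QUOTED"

# (current state, input symbol) -> (next state, emitted output)
TRANSITIONS = {
    (FIELD, ","): (FIELD, "\t"),   # unquoted comma becomes a tab
    (FIELD, '"'): (QUOTED, ""),    # opening quote: switch state, emit nothing
    (QUOTED, '"'): (FIELD, ""),    # closing quote: switch back, emit nothing
}


def run_fst(text: str, start: str = FIELD) -> str:
    """Feed text through the transducer one symbol at a time."""
    state = start
    out = []
    for ch in text:
        # Default rule: stay in the current state and copy the symbol through.
        state, emitted = TRANSITIONS.get((state, ch), (state, ch))
        out.append(emitted)
    return "".join(out)


if __name__ == "__main__":
    print(repr(run_fst('a,"b,c",d')))
```

The single pass with one table lookup per input symbol is what makes this style of transformation a natural target for generated engines; a pushdown transducer would add a stack to the same loop, for example to track nesting depth.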
Author Becchi, Michela
Nguyen, Tri
Author_xml – sequence: 1
  givenname: Tri
  surname: Nguyen
  fullname: Nguyen, Tri
  email: tmnguye7@ncsu.edu
  organization: North Carolina State University,United States of America
– sequence: 2
  givenname: Michela
  surname: Becchi
  fullname: Becchi, Michela
  email: mbecchi@ncsu.edu
  organization: North Carolina State University,United States of America
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1145/3656019.3676891
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE/IET Electronic Library (IEL) (UW System Shared)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
EISBN 9798400706318
EndPage 77
ExternalDocumentID 10807327
Genre orig-research
GrantInformation_xml – fundername: National Science Foundation
  grantid: CCF-1907863
  funderid: 10.13039/100000001
GroupedDBID 6IE
6IL
ACM
ALMA_UNASSIGNED_HOLDINGS
APO
CBEJK
LHSKQ
RIE
RIL
ID FETCH-LOGICAL-a248t-5ecc37e5b2a29317e9a0f2d726ba6d26ca471a49af89eba2bb0b6c36e839ae143
IEDL.DBID RIE
ISICitedReferencesCount 0
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001344829000006&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate Wed Jan 08 06:10:43 EST 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-a248t-5ecc37e5b2a29317e9a0f2d726ba6d26ca471a49af89eba2bb0b6c36e839ae143
PageCount 12
ParticipantIDs ieee_primary_10807327
PublicationCentury 2000
PublicationDate 2024-Oct.-13
PublicationDateYYYYMMDD 2024-10-13
PublicationDate_xml – month: 10
  year: 2024
  text: 2024-Oct.-13
  day: 13
PublicationDecade 2020
PublicationTitle 2024 33rd International Conference on Parallel Architectures and Compilation Techniques (PACT)
PublicationTitleAbbrev PACT
PublicationYear 2024
Publisher ACM
Publisher_xml – name: ACM
SSID ssib057256082
Score 2.2706947
Snippet Many data analytics and scientific applications rely on data transformation tasks, such as encoding, decoding, parsing of structured and unstructured data, and...
SourceID ieee
SourceType Publisher
StartPage 66
SubjectTerms Codes
Computational modeling
Computer languages
Data analysis
Data models
Engines
Graphics processing units
Optimization
Programming
Transducers
Title A Transducers-based Programming Framework for Efficient Data Transformation
URI https://ieeexplore.ieee.org/document/10807327
WOSCitedRecordID wos001344829000006&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=2024+33rd+International+Conference+on+Parallel+Architectures+and+Compilation+Techniques+%28PACT%29&rft.atitle=A+Transducers-based+Programming+Framework+for+Efficient+Data+Transformation&rft.au=Nguyen%2C+Tri&rft.au=Becchi%2C+Michela&rft.date=2024-10-13&rft.pub=ACM&rft.spage=66&rft.epage=77&rft_id=info:doi/10.1145%2F3656019.3676891&rft.externalDocID=10807327