Neural Network Approach to Program Synthesis for Tabular Transformation by Example

Data transformation is a laborious and time-consuming task for analysts. Programming by example (PBE) is a technique that can simplify this difficult task for data analysts by automatically generating programs for data transformation. Most of the previously proposed PBE methods are based on search a...

Bibliographic details
Published in: IEEE Access, Vol. 10, pp. 24864-24876
Main authors: Ujibashi, Yoshifumi; Takasu, Atsuhiro
Format: Journal Article
Language: English
Published: Piscataway: IEEE, 2022
Institute of Electrical and Electronics Engineers (IEEE)
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 2169-3536
Abstract Data transformation is a laborious and time-consuming task for analysts. Programming by example (PBE) is a technique that can simplify this difficult task for data analysts by automatically generating programs for data transformation. Most of the previously proposed PBE methods are based on search algorithms, but recent improvements in machine learning (ML) have led to its application in PBE research. For example, RobustFill was proposed as an ML-based PBE method for string transformation by using long short-term memory (LSTM) as the sequential encoder-decoder model. However, an ML-based PBE method has not been developed for tabular transformations, which are used frequently in data analysis. Thus, in the present study, we propose an ML-based PBE method for tabular transformations. First, we consider the features of tabular transformations, which are more complex and data intensive than string transformations, and propose a new ML-based PBE method using the state-of-the-art Transformer sequential encoder-decoder model. To our knowledge, this is the first ML-based PBE method for tabular transformations. We also propose two decoding methods comprising multistep beam search and program validation-beam search, which are optimized for program generation, and thus generate correct programs with higher accuracy. Our evaluation results demonstrated that the Transformer-based PBE model performed much better than LSTM-based PBE when applied to tabular transformations. Furthermore, the Transformer-based model with the proposed decoding method performed better than the conventional PBE model using the search-based method.
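The abstract names a program validation-beam search but does not describe it, so the following is only a minimal sketch of one plausible reading: a beam search over program tokens in which every finished candidate is executed on the input example and accepted only if it reproduces the output example. The toy operator set (OPS), the END token, the uniform_log_probs stand-in for a trained decoder, and the function names are all assumptions made for illustration; none of them come from the paper.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

Table = List[List[str]]

# Toy table DSL (hypothetical; not the operator set used in the paper).
OPS: Dict[str, Callable[[Table], Table]] = {
    "transpose": lambda t: [list(col) for col in zip(*t)],
    "drop_first_row": lambda t: t[1:],
    "reverse_rows": lambda t: t[::-1],
}
END = "<end>"
VOCAB = list(OPS) + [END]


def uniform_log_probs(prefix: List[str]) -> Dict[str, float]:
    """Stand-in for a trained decoder: every next token is equally likely."""
    return {tok: -math.log(len(VOCAB)) for tok in VOCAB}


def run_program(program: List[str], table: Table) -> Table:
    """Apply the operators of a candidate program in order."""
    for op in program:
        table = OPS[op](table)
    return table


def validated_beam_search(
    inp: Table,
    out: Table,
    log_probs: Callable[[List[str]], Dict[str, float]] = uniform_log_probs,
    beam_width: int = 4,
    max_len: int = 3,
) -> Optional[List[str]]:
    """Return the first finished program that maps inp to out, or None."""
    beams: List[Tuple[float, List[str]]] = [(0.0, [])]
    for _ in range(max_len + 1):
        # Expand every kept prefix by every token and rank by total log-probability.
        candidates = [
            (score + lp, prefix + [tok])
            for score, prefix in beams
            for tok, lp in log_probs(prefix).items()
        ]
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates:
            if seq[-1] == END:
                program = seq[:-1]
                # Validation step: execute the finished program on the example.
                if run_program(program, inp) == out:
                    return program
                continue  # finished but wrong, so it is discarded
            if len(beams) < beam_width:
                beams.append((score, seq))
        if not beams:
            break
    return None
```

With a real model, log_probs would be replaced by the decoder's next-token distribution conditioned on the encoded input and output tables; the validation step itself would be unchanged.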
Author Ujibashi, Yoshifumi
Takasu, Atsuhiro
Author_xml – sequence: 1
  givenname: Yoshifumi
  orcidid: 0000-0001-6527-2814
  surname: Ujibashi
  fullname: Ujibashi, Yoshifumi
  email: ujibashi@nii.ac.jp
  organization: Department of Informatics, The Graduate University for Advanced Studies, SOKENDAI, Shonan, Hayama, Japan
– sequence: 2
  givenname: Atsuhiro
  orcidid: 0000-0002-9061-7949
  surname: Takasu
  fullname: Takasu, Atsuhiro
  organization: Department of Informatics, The Graduate University for Advanced Studies, SOKENDAI, Shonan, Hayama, Japan
BackLink https://cir.nii.ac.jp/crid/1871991017441804800 (View record in CiNii)
CODEN IAECCG
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
DOI 10.1109/ACCESS.2022.3155468
Discipline Engineering
EISSN 2169-3536
EndPage 24876
ExternalDocumentID oai_doaj_org_article_eb31f17e54ad4d0aabd04e78fbf4e523
10_1109_ACCESS_2022_3155468
9723421
Genre orig-research
ISICitedReferencesCount 1
ISSN 2169-3536
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Language English
License https://creativecommons.org/licenses/by/4.0/legalcode
ORCID 0000-0002-9061-7949
0000-0001-6527-2814
OpenAccessLink https://doaj.org/article/eb31f17e54ad4d0aabd04e78fbf4e523
PageCount 13
PublicationCentury 2000
PublicationDate 2022-01-01
PublicationDateYYYYMMDD 2022-01-01
PublicationDate_xml – year: 2022
  text: 20220000
PublicationDecade 2020
PublicationPlace Piscataway
PublicationPlace_xml – name: Piscataway
PublicationTitle IEEE Access
PublicationTitleAbbrev Access
PublicationYear 2022
Publisher IEEE
Institute of Electrical and Electronics Engineers (IEEE)
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: Institute of Electrical and Electronics Engineers (IEEE)
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 24864
SubjectTerms Automatic programming
beam search
Coders
Data analysis
Data integration
Data models
Decoding
Domain specific languages
Electrical engineering. Electronics. Nuclear engineering
Encoding
LSTM
Machine learning
Machine learning algorithms
Memory
neural network
Neural networks
program synthesis
programming by example
Search algorithms
Strings
Tables (data)
tabular data
TK1-9971
Training data
Transformations
transformer
Transformers
Title Neural Network Approach to Program Synthesis for Tabular Transformation by Example
URI https://ieeexplore.ieee.org/document/9723421
https://cir.nii.ac.jp/crid/1871991017441804800
https://www.proquest.com/docview/2637438336
https://doaj.org/article/eb31f17e54ad4d0aabd04e78fbf4e523
Volume 10