Neural Network Approach to Program Synthesis for Tabular Transformation by Example
Data transformation is a laborious and time-consuming task for analysts. Programming by example (PBE) is a technique that can simplify this difficult task for data analysts by automatically generating programs for data transformation. Most of the previously proposed PBE methods are based on search a...
| Published in: | IEEE Access, Vol. 10, pp. 24864-24876 |
|---|---|
| Main authors: | Ujibashi, Yoshifumi; Takasu, Atsuhiro |
| Format: | Journal Article |
| Language: | English |
| Published: | Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2022 |
| ISSN: | 2169-3536 |
| Abstract | Data transformation is a laborious and time-consuming task for analysts. Programming by example (PBE) is a technique that can simplify this difficult task for data analysts by automatically generating programs for data transformation. Most of the previously proposed PBE methods are based on search algorithms, but recent improvements in machine learning (ML) have led to its application in PBE research. For example, RobustFill was proposed as an ML-based PBE method for string transformation by using long short-term memory (LSTM) as the sequential encoder-decoder model. However, an ML-based PBE method has not been developed for tabular transformations, which are used frequently in data analysis. Thus, in the present study, we propose an ML-based PBE method for tabular transformations. First, we consider the features of tabular transformations, which are more complex and data intensive than string transformations, and propose a new ML-based PBE method using the state-of-the-art Transformer sequential encoder-decoder model. To our knowledge, this is the first ML-based PBE method for tabular transformations. We also propose two decoding methods comprising multistep beam search and program validation-beam search, which are optimized for program generation, and thus generate correct programs with higher accuracy. Our evaluation results demonstrated that the Transformer-based PBE model performed much better than LSTM-based PBE when applied to tabular transformations. Furthermore, the Transformer-based model with the proposed decoding method performed better than the conventional PBE model using the search-based method. |
|---|---|
| Author | Ujibashi, Yoshifumi; Takasu, Atsuhiro |
| Author details | 1. Ujibashi, Yoshifumi (ORCID: 0000-0001-6527-2814; email: ujibashi@nii.ac.jp), Department of Informatics, The Graduate University for Advanced Studies, SOKENDAI, Shonan, Hayama, Japan; 2. Takasu, Atsuhiro (ORCID: 0000-0002-9061-7949), Department of Informatics, The Graduate University for Advanced Studies, SOKENDAI, Shonan, Hayama, Japan |
| CODEN | IAECCG |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022 |
| DOI | 10.1109/ACCESS.2022.3155468 |
| Discipline | Engineering |
| EISSN | 2169-3536 |
| EndPage | 24876 |
| Genre | orig-research |
| ISICitedReferencesCount | 1 |
| ISSN | 2169-3536 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Language | English |
| License | https://creativecommons.org/licenses/by/4.0/legalcode |
| ORCID | 0000-0002-9061-7949 0000-0001-6527-2814 |
| OpenAccessLink | https://doaj.org/article/eb31f17e54ad4d0aabd04e78fbf4e523 |
| PageCount | 13 |
| PublicationDateYYYYMMDD | 2022-01-01 |
| PublicationPlace | Piscataway |
| PublicationTitle | IEEE Access |
| PublicationTitleAbbrev | Access |
| PublicationYear | 2022 |
| Publisher | IEEE; Institute of Electrical and Electronics Engineers (IEEE); The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| StartPage | 24864 |
| SubjectTerms | Automatic programming; beam search; Coders; Data analysis; Data integration; Data models; Decoding; Domain specific languages; Electrical engineering. Electronics. Nuclear engineering; Encoding; LSTM; Machine learning; Machine learning algorithms; Memory; neural network; Neural networks; program synthesis; programming by example; Search algorithms; Strings; Tables (data); tabular data; TK1-9971; Training data; Transformations; transformer; Transformers |
| Title | Neural Network Approach to Program Synthesis for Tabular Transformation by Example |
| Volume | 10 |