Semantic programming by example with pre-trained models
The ability to learn programs from few examples is a powerful technology with disruptive applications in many domains, as it allows users to automate repetitive tasks in an intuitive way. Existing frameworks on inductive synthesis only perform syntactic manipulations, where they rely on the syntacti...
| Published in: | Proceedings of ACM on programming languages, Volume 5, Issue OOPSLA, pp. 1-25 |
|---|---|
| Main authors: | Gust Verbruggen, Vu Le, Sumit Gulwani |
| Format: | Journal Article |
| Language: | English |
| Publication date: | 2021-10-01 |
| ISSN: | 2475-1421 |
| Abstract | The ability to learn programs from a few examples is a powerful technology with disruptive applications in many domains, as it allows users to automate repetitive tasks in an intuitive way. Existing frameworks for inductive synthesis perform only syntactic manipulations: they rely on the syntactic structure of the given examples and not on their meaning. Any semantic manipulations, such as transforming dates, have to be manually encoded by the designer of the inductive programming framework. Recent advances in large language models have shown these models to be very adept at performing semantic transformations of their input when simply provided with a few examples of the task at hand. When it comes to syntactic transformations, however, these models are limited in their expressive power. In this paper, we propose a novel framework for integrating inductive synthesis with few-shot learning language models to combine the strengths of these two popular technologies. In particular, the inductive synthesizer is tasked with breaking down the problem into smaller subproblems, among which those that cannot be solved syntactically are passed to the language model. We formalize three semantic operators that can be integrated with inductive synthesizers. To minimize the invocation of expensive semantic operators during learning, we introduce a novel deferred query execution algorithm that treats the operators as oracles during learning. We evaluate our approach in the domain of string transformations: the combined methodology can automate tasks that cannot be handled by either technology on its own. Finally, we demonstrate the generality of our approach via a case study in the domain of string profiling. |
|---|---|
| Author | Le, Vu; Verbruggen, Gust; Gulwani, Sumit |
| Author_xml | 1. Gust Verbruggen (ORCID: 0000-0001-9182-597X), KU Leuven, Belgium; 2. Vu Le, Microsoft, USA; 3. Sumit Gulwani, Microsoft, USA |
| CitedBy_id | crossref_primary_10_1145_3563330 crossref_primary_10_1145_3709677 crossref_primary_10_1145_3563350 crossref_primary_10_1145_3649850 crossref_primary_10_1145_3622815 crossref_primary_10_1145_3563327 crossref_primary_10_1145_3571226 crossref_primary_10_1145_3622863 crossref_primary_10_1145_3632860 crossref_primary_10_1145_3729300 |
| Cites_doi | 10.1145/3453483.3454080 10.1561/2500000010 10.1609/aaai.v31i1.10668 10.1145/2807442.2807459 10.1145/2814270.2814310 10.1109/ICDE.2016.7498319 10.1109/ICSE.2017.44 10.1145/3276520 10.18653/v1/P17-1147 10.24963/ijcai.2017/227 10.1145/2666356.2594333 10.1145/3428287 10.1145/3360569 10.18653/v1/D19-1250 10.1145/3448016.3457250 10.18653/v1/2020.emnlp-main.437 10.1145/1926385.1926423 10.18653/v1/P16-1162 10.14778/3231751.3231766 10.18653/v1/N19-1423 10.1145/2240236.2240260 10.1145/2213836.2213848 |
| ContentType | Journal Article |
| DBID | AAYXX CITATION |
| DOI | 10.1145/3485477 |
| DatabaseName | CrossRef |
| DatabaseTitle | CrossRef |
| DatabaseTitleList | CrossRef |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISSN | 2475-1421 |
| EndPage | 25 |
| ExternalDocumentID | 10_1145_3485477 |
| GroupedDBID | AAKMM AAYFX AAYXX ACM AEFXT AEJOY AIKLT AKRVB ALMA_UNASSIGNED_HOLDINGS CITATION GUFHI LHSKQ M~E OK1 ROL |
| ID | FETCH-LOGICAL-c258t-1ffab298c3034c4429331ffb4a1262e70e7445c1705f2146b2e0a8de3a119ce33 |
| ISICitedReferencesCount | 16 |
| ISSN | 2475-1421 |
| IngestDate | Tue Nov 18 21:33:42 EST 2025 Sat Nov 29 07:51:23 EST 2025 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | OOPSLA |
| Language | English |
| LinkModel | OpenURL |
| MergedId | FETCHMERGED-LOGICAL-c258t-1ffab298c3034c4429331ffb4a1262e70e7445c1705f2146b2e0a8de3a119ce33 |
| ORCID | 0000-0001-9182-597X |
| OpenAccessLink | https://dl.acm.org/doi/pdf/10.1145/3485477 |
| PageCount | 25 |
| ParticipantIDs | crossref_primary_10_1145_3485477 crossref_citationtrail_10_1145_3485477 |
| PublicationCentury | 2000 |
| PublicationDate | 2021-10-01 |
| PublicationDateYYYYMMDD | 2021-10-01 |
| PublicationDate_xml | – month: 10 year: 2021 text: 2021-10-01 day: 01 |
| PublicationDecade | 2020 |
| PublicationTitle | Proceedings of ACM on programming languages |
| PublicationYear | 2021 |
| SSID | ssj0001934839 |
| Score | 2.334033 |
| SourceID | crossref |
| SourceType | Enrichment Source Index Database |
| StartPage | 1 |
| Title | Semantic programming by example with pre-trained models |
| Volume | 5 |
| WOSCitedRecordID | wos000731569200004&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| journalDatabaseRights | – providerCode: PRVHPJ databaseName: ROAD: Directory of Open Access Scholarly Resources customDbUrl: eissn: 2475-1421 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0001934839 issn: 2475-1421 databaseCode: M~E dateStart: 20170101 isFulltext: true titleUrlDefault: https://road.issn.org providerName: ISSN International Centre |
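
The division of labour described in the abstract (syntactic subproblems solved by the synthesizer, semantic ones handed to a language model and deferred until run time) can be illustrated with a toy sketch. This is not the paper's algorithm or its three semantic operators: the names `learn`, `run`, `Const`, `Token`, `SemMap`, and `MockOracle` are hypothetical, the "language model" is a hard-coded lookup table, and inputs are assumed to be space-separated tokens. The only point it tries to show is that learning records few-shot pairs without ever calling the oracle.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, Union

Pairs = List[Tuple[str, str]]
Oracle = Callable[[Pairs, str], str]  # (few-shot pairs, query) -> answer


@dataclass
class Const:          # emit a fixed string
    text: str


@dataclass
class Token:          # copy the input token at `index` (purely syntactic)
    index: int


@dataclass
class SemMap:         # semantic step: ask the oracle, using `shots` as examples
    index: int
    shots: Pairs


Part = Union[Const, Token, SemMap]


def learn(examples: Pairs) -> List[Part]:
    """Learn one rule per output token; never calls the oracle."""
    ins = [i.split() for i, _ in examples]
    outs = [o.split() for _, o in examples]
    n_in, n_out = len(ins[0]), len(outs[0])
    if any(len(i) != n_in for i in ins) or any(len(o) != n_out for o in outs):
        raise ValueError("toy learner needs equally shaped examples")
    program: List[Part] = []
    for pos in range(n_out):
        targets = [o[pos] for o in outs]
        if len(set(targets)) == 1:                      # same text everywhere
            program.append(Const(targets[0]))
            continue
        copy = next((k for k in range(n_in)
                     if all(i[k] == t for i, t in zip(ins, targets))), None)
        if copy is not None:                            # solved syntactically
            program.append(Token(copy))
            continue
        for k in range(n_in):                           # defer to the oracle
            shots = [(i[k], t) for i, t in zip(ins, targets)]
            if len(dict(shots)) == len(set(shots)):     # mapping is a function
                program.append(SemMap(k, shots))
                break
        else:
            raise ValueError("no consistent rule found")
    return program


def run(program: List[Part], text: str, oracle: Oracle) -> str:
    """Execute the program; only SemMap parts query the oracle."""
    tokens = text.split()
    out = []
    for part in program:
        if isinstance(part, Const):
            out.append(part.text)
        elif isinstance(part, Token):
            out.append(tokens[part.index])
        else:
            out.append(oracle(part.shots, tokens[part.index]))
    return " ".join(out)


class MockOracle:
    """Stand-in for a few-shot language model (assumption: fixed table)."""

    def __init__(self) -> None:
        self.calls = 0
        self._table = {"Jan": "January", "Feb": "February", "Mar": "March"}

    def __call__(self, shots: Pairs, query: str) -> str:
        self.calls += 1
        return self._table.get(query, query)
```

Under these assumptions, learning from `("Jan 5", "5 January")` and `("Feb 12", "12 February")` yields a token copy followed by one deferred semantic step, and the oracle is queried only when the learned program is run on a new input such as `"Mar 7"`.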