Semantic programming by example with pre-trained models

Bibliographic details
Published in: Proceedings of ACM on programming languages, Volume 5, Issue OOPSLA, pp. 1-25
Main authors: Verbruggen, Gust; Le, Vu; Gulwani, Sumit
Format: Journal Article
Language: English
Publication date: 2021-10-01
ISSN: 2475-1421
Abstract: The ability to learn programs from a few examples is a powerful technology with disruptive applications in many domains, as it allows users to automate repetitive tasks in an intuitive way. Existing frameworks for inductive synthesis perform only syntactic manipulations: they rely on the syntactic structure of the given examples and not on their meaning. Any semantic manipulations, such as transforming dates, have to be manually encoded by the designer of the inductive programming framework. Recent advances in large language models have shown these models to be very adept at performing semantic transformations of their input when simply provided with a few examples of the task at hand. When it comes to syntactic transformations, however, these models are limited in their expressive power. In this paper, we propose a novel framework for integrating inductive synthesis with few-shot learning language models to combine the strengths of these two popular technologies. In particular, the inductive synthesis is tasked with breaking down the problem into smaller subproblems, among which those that cannot be solved syntactically are passed to the language model. We formalize three semantic operators that can be integrated with inductive synthesizers. To minimize invoking expensive semantic operators during learning, we introduce a novel deferred query execution algorithm that considers the operators to be oracles during learning. We evaluate our approach in the domain of string transformations: the combination methodology can automate tasks that cannot be handled using either technology by itself. Finally, we demonstrate the generality of our approach via a case study in the domain of string profiling.
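The deferred-query idea in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm or code: the names `toy_few_shot_model`, `DeferredSemanticMap`, `learn`, and `CALLS` are hypothetical, the "language model" is a lookup table so the example runs offline, and the "synthesizer" handles only one fixed input and output shape. It shows the general pattern only: subproblems that are plain copies are solved syntactically, the rest are recorded as demonstrations for a semantic operator, and the model is not queried until the learned program runs.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Example = Tuple[str, str]
CALLS: List[str] = []  # records every query sent to the (toy) model


def toy_few_shot_model(demos: List[Example], query: str) -> str:
    """Hypothetical stand-in for a few-shot language model (a lookup table)."""
    CALLS.append(query)
    knowledge = {"Jan": "01", "Feb": "02", "Mar": "03"}
    return knowledge.get(query, "??")


@dataclass
class DeferredSemanticMap:
    """Semantic operator that is assumed correct (an oracle) while learning."""
    model: Callable[[List[Example], str], str]
    demos: List[Example] = field(default_factory=list)

    def learn(self, sub_in: str, sub_out: str) -> None:
        # Deferred: store the demonstration, do not call the model.
        self.demos.append((sub_in, sub_out))

    def run(self, sub_in: str) -> str:
        # The expensive query happens only when the program is executed.
        return self.model(self.demos, sub_in)


def learn(examples: List[Example], model: Callable[[List[Example], str], str]):
    """Toy learner: inputs are space-separated, outputs are '/'-separated."""
    sem = DeferredSemanticMap(model)
    plan = None
    for inp, out in examples:
        toks, parts = inp.split(" "), out.split("/")
        step = []
        for i, part in enumerate(parts):
            if part in toks:
                # Solvable syntactically: copy an input token.
                step.append(("copy", toks.index(part)))
            else:
                # Not solvable syntactically: hand off to the semantic operator.
                sem.learn(toks[i], part)
                step.append(("semantic", i))
        if plan is None:
            plan = step
        elif plan != step:
            raise ValueError("examples do not share one program shape")

    def program(inp: str) -> str:
        toks = inp.split(" ")
        return "/".join(
            toks[i] if kind == "copy" else sem.run(toks[i]) for kind, i in plan
        )

    return program, sem
```

Calling `learn` on the examples `("Jan 2021", "01/2021")` and `("Feb 1999", "02/1999")` leaves `CALLS` empty. Applying the returned program to `"Mar 2005"` then issues a single model query, for `"Mar"`.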
Authors:
1. Verbruggen, Gust (KU Leuven, Belgium), ORCID: 0000-0001-9182-597X
2. Le, Vu (Microsoft, USA)
3. Gulwani, Sumit (Microsoft, USA)
DOI: 10.1145/3485477
Discipline: Computer Science
DOI open access: no
Open access: yes
Peer reviewed: yes
Scholarly: yes
Open access link: https://dl.acm.org/doi/pdf/10.1145/3485477