Active Learning for Neurosymbolic Program Synthesis

The goal of active learning for program synthesis is to synthesize the desired program by asking targeted questions that minimize user interaction. While prior work has explored active learning in the purely symbolic setting, such techniques are inadequate for the increasingly popular paradigm of ne...

Full description

Saved in:
Bibliographic Details
Published in:Proceedings of ACM on programming languages Vol. 9; no. OOPSLA2; pp. 1455 - 1483
Main Authors: Barnaby, Celeste, Chen, Qiaochu, Ramalingam, Ramya, Bastani, Osbert, Dillig, Işıl
Format: Journal Article
Language:English
Published: New York, NY, USA ACM 09.10.2025
Subjects:
ISSN:2475-1421, 2475-1421
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:The goal of active learning for program synthesis is to synthesize the desired program by asking targeted questions that minimize user interaction. While prior work has explored active learning in the purely symbolic setting, such techniques are inadequate for the increasingly popular paradigm of neurosymbolic program synthesis, where the synthesized program incorporates neural components. When applied to the neurosymbolic setting, such techniques can -- and, in practice, do -- return an unintended program due to mispredictions of neural components. This paper proposes a new active learning technique that can handle the unique challenges posed by neural network mispredictions. Our approach is based upon a new evaluation strategy called constrained conformal evaluation (CCE), which accounts for neural mispredictions while taking into account user-provided feedback. Our proposed method iteratively makes CCE more precise until all remaining programs are guaranteed to be observationally equivalent. We have implemented this method in a tool called SmartLabel and experimentally evaluated it on three neurosymbolic domains. Our results demonstrate that SmartLabel identifies the ground truth program for 98% of the benchmarks, requiring under 5 rounds of user interaction on average. In contrast, prior techniques for active learning are only able to converge to the ground truth program for at most 65% of the benchmarks.
ISSN:2475-1421
2475-1421
DOI:10.1145/3763102