Active Learning for Neurosymbolic Program Synthesis
The goal of active learning for program synthesis is to synthesize the desired program by asking targeted questions that minimize user interaction. While prior work has explored active learning in the purely symbolic setting, such techniques are inadequate for the increasingly popular paradigm of ne...
Saved in:
| Published in: | Proceedings of ACM on programming languages Vol. 9; no. OOPSLA2; pp. 1455 - 1483 |
|---|---|
| Main Authors: | , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: |
New York, NY, USA
ACM
09.10.2025
|
| Subjects: | |
| ISSN: | 2475-1421, 2475-1421 |
| Online Access: | Get full text |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | The goal of active learning for program synthesis is to synthesize the desired program by asking targeted questions that minimize user interaction. While prior work has explored active learning in the purely symbolic setting, such techniques are inadequate for the increasingly popular paradigm of neurosymbolic program synthesis, where the synthesized program incorporates neural components. When applied to the neurosymbolic setting, such techniques can -- and, in practice, do -- return an unintended program due to mispredictions of neural components. This paper proposes a new active learning technique that can handle the unique challenges posed by neural network mispredictions. Our approach is based upon a new evaluation strategy called constrained conformal evaluation (CCE), which accounts for neural mispredictions while taking into account user-provided feedback. Our proposed method iteratively makes CCE more precise until all remaining programs are guaranteed to be observationally equivalent. We have implemented this method in a tool called SmartLabel and experimentally evaluated it on three neurosymbolic domains. Our results demonstrate that SmartLabel identifies the ground truth program for 98% of the benchmarks, requiring under 5 rounds of user interaction on average. In contrast, prior techniques for active learning are only able to converge to the ground truth program for at most 65% of the benchmarks. |
|---|---|
| ISSN: | 2475-1421 2475-1421 |
| DOI: | 10.1145/3763102 |