UniGenCoder: Merging SEQ2SEQ and SEQ2TREE Paradigms for Unified Code Generation

Bibliographic Details
Published in: IEEE/ACM International Conference on Software Engineering: New Ideas and Emerging Technologies Results (Online), pp. 71-75
Main Authors: Shao, Liangying; Yan, Yanfu; Poshyvanyk, Denys; Su, Jinsong
Format: Conference Proceeding
Language: English
Published: IEEE 27.04.2025
ISSN:2832-7632
Description
Summary: Deep learning-based code generation has transformed the way developers write programs. Existing approaches to code generation have focused either on the Sequence-to-Sequence paradigm, which generates target code as a sequence of tokens, or on the Sequence-to-Tree paradigm, which outputs code as a sequence of actions. While these two paradigms are intuitively complementary, their combination has not been previously explored. By comparing the code generated under these two paradigms, we find that integrating them holds significant potential. In this paper, we propose UniGenCoder for code-related generation tasks. It consists of a shared encoder, a shared decoder with a minimal set of additional parameters to unify the two paradigms, and a selector that dynamically chooses the optimal paradigm for each instance. During model training, we first apply multi-task learning and distillation strategies to facilitate knowledge transfer between the two paradigms, and then leverage contrastive learning to train the selector. Experimental results on text-to-code and code-to-code generation tasks demonstrate the effectiveness of our proposed model. We release our code at https://github.com/DeepLearnXMU/UniGenCoder.
DOI: 10.1109/ICSE-NIER66352.2025.00020