Semantic-aware Source Code Modeling

Source code modeling represents a promising avenue for automating software development, such as code generation, bug repair, and program analysis. This research direction aims to train deep neural nets to learn the statistical predictability inherent in human-written programs to enhance developer pr...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE/ACM International Conference on Automated Software Engineering : [proceedings] S. 2494 - 2497
1. Verfasser: Ding, Yangruibo
Format: Tagungsbericht
Sprache:Englisch
Veröffentlicht: ACM 27.10.2024
Schlagworte:
ISSN:2643-1572
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Source code modeling represents a promising avenue for automating software development, such as code generation, bug repair, and program analysis. This research direction aims to train deep neural nets to learn the statistical predictability inherent in human-written programs to enhance developer productivity, code quality, and the overall software development life cycle.Although existing code modeling approaches, particularly those underpinned by Transformer-based language models, have demonstrated effectiveness across various software engineering tasks, most of them have directly adopted learning schemes from natural language processing (e.g., data collection and processing, training objectives) to source code, primarily focusing on learning code text and syntax. However, such a direct transplant limits the models' capability to capture deep program semantics, such as code functionality, dependencies, and program states during execution.In this research proposal, we highlight the critical role of program semantics in source code modeling. We propose a range of innovative methodologies to bridge the gap between the text-based language models for large-scale code training and the requirement of deep semantic understanding to assist with software engineering tasks effectively. Furthermore, we showcase the efficacy of the proposed semantic-aware code modeling through a handful of published papers and preliminary results, with motivations to delve deeper into this avenue during doctoral research.
ISSN:2643-1572
DOI:10.1145/3691620.3695605