Defect Detection in Source Code via Multimodal Feature Fusion.

Bibliographic Details
Title: Defect Detection in Source Code via Multimodal Feature Fusion.
Authors: Xiong, Shuchu, Yin, Lu, Gu, Haozhan, Zhang, Chengquan
Source: Applied Sciences (2076-3417); Sep2025, Vol. 15 Issue 17, p9358, 26p
Subject Terms: SOURCE code, FEATURE extraction, MACHINE learning, DEFECT tracking (Computer software development), DATA structures
Abstract: To address the limitation of existing static defect detection methods in capturing code semantics and structural relationships, which leads to incomplete feature representation, we propose a multimodal feature fusion approach for source code defect detection. First, semantic features are extracted from code character sequences, while structural features are derived from Abstract Syntax Trees (ASTs). Second, a structural attention mechanism dynamically models the interdependencies between these two modalities and fuses them into comprehensive representation vectors. Finally, defect detection is performed on the integrated representations. Experimental results on the SARD dataset show that, compared with baseline methods using a single representation (semantic or structural), our approach improves the F1-score by 1.96% to 11.76%, and it achieves a 1.36% to 1.66% higher F1-score than other feature fusion methods. The method also remains stable on imbalanced defect category data. By effectively fusing multimodal code information, this approach significantly enhances the accuracy and adaptability of code defect detection in open-source environments. [ABSTRACT FROM AUTHOR]
Copyright of Applied Sciences (2076-3417) is the property of MDPI and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
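Illustrative sketch: the pipeline outlined in the abstract (a character-sequence semantic encoder, an AST structural encoder, attention-based fusion, and a classifier) can be roughed out as below. This is a minimal sketch in PyTorch under assumed module names, dimensions, and a generic cross-attention stand-in for the paper's structural attention mechanism; it is not the authors' published implementation.

# Minimal sketch of the two-branch fusion pipeline described in the abstract.
# All class names, dimensions, and the cross-attention choice are assumptions,
# not the authors' published implementation.
import torch
import torch.nn as nn

class MultimodalDefectDetector(nn.Module):
    def __init__(self, char_vocab, ast_vocab, dim=128, num_classes=2):
        super().__init__()
        # Semantic branch: encodes the raw character sequence of the code.
        self.char_embed = nn.Embedding(char_vocab, dim, padding_idx=0)
        self.char_encoder = nn.GRU(dim, dim, batch_first=True, bidirectional=True)
        # Structural branch: encodes a linearized AST node-type sequence.
        self.ast_embed = nn.Embedding(ast_vocab, dim, padding_idx=0)
        self.ast_encoder = nn.GRU(dim, dim, batch_first=True, bidirectional=True)
        # Cross-modal attention: AST tokens attend over code tokens, standing in
        # for the structural attention mechanism named in the abstract.
        self.cross_attn = nn.MultiheadAttention(2 * dim, num_heads=4, batch_first=True)
        self.classifier = nn.Sequential(
            nn.Linear(4 * dim, dim), nn.ReLU(), nn.Linear(dim, num_classes)
        )

    def forward(self, char_ids, ast_ids):
        sem, _ = self.char_encoder(self.char_embed(char_ids))    # (B, Lc, 2*dim)
        struct, _ = self.ast_encoder(self.ast_embed(ast_ids))    # (B, La, 2*dim)
        fused, _ = self.cross_attn(struct, sem, sem)              # (B, La, 2*dim)
        # Pool both views and concatenate into one comprehensive representation.
        rep = torch.cat([fused.mean(dim=1), sem.mean(dim=1)], dim=-1)
        return self.classifier(rep)

# Toy usage: a batch of 2 snippets with 50 characters and 30 AST nodes each.
model = MultimodalDefectDetector(char_vocab=100, ast_vocab=60)
logits = model(torch.randint(1, 100, (2, 50)), torch.randint(1, 60, (2, 30)))
print(logits.shape)  # torch.Size([2, 2])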
Database: Complementary Index
ISSN: 2076-3417
DOI: 10.3390/app15179358