An integrated graph neural network model for joint software defect prediction and code quality assessment.
| Title: | An integrated graph neural network model for joint software defect prediction and code quality assessment. |
|---|---|
| Authors: | Dai P (pingdai2025@163.com); Zhu H; Wu J; He H. All authors: Anhui Institute of Information Technology, Wuhu, 241000, Anhui, China. |
| Source: | Scientific reports [Sci Rep] 2025 Dec 11. Date of Electronic Publication: 2025 Dec 11. |
| Publication Model: | Ahead of Print |
| Publication Type: | Journal Article |
| Language: | English |
| Journal Info: | Publisher: Nature Publishing Group; Country of Publication: England; NLM ID: 101563288; Publication Model: Print-Electronic; Cited Medium: Internet; ISSN: 2045-2322 (Electronic); Linking ISSN: 20452322; NLM ISO Abbreviation: Sci Rep; Subsets: MEDLINE |
| Imprint Name(s): | Original Publication: London : Nature Publishing Group, copyright 2011- |
| Abstract: | Current software defect prediction and code quality assessment methods treat these inherently related tasks independently, failing to leverage their complementary information. Existing graph-based approaches lack the ability to jointly model structural dependencies and quality characteristics, limiting their effectiveness in capturing the complex relationships between defect patterns and code quality indicators. This paper proposes a novel integrated model that simultaneously tackles both objectives using graph neural networks to leverage the inherent graph structure of software systems. Our novelty lies in the first-of-its-kind integration of multi-level graph representations (AST, CFG, DFG) with a dual-branch attention-based GNN architecture for simultaneous defect prediction and quality assessment. Our approach constructs multi-level graph representations by integrating abstract syntax trees, control flow graphs, and data flow graphs, capturing both syntactic and semantic relationships in source code. The proposed dual-branch GNN architecture employs shared representation learning with attention mechanisms and multi-task optimization to exploit complementary information between defect prediction and quality assessment tasks. Comprehensive experiments on six real-world software projects demonstrate significant improvements over traditional methods, achieving F1-scores of 0.811 and AUC values of 0.896 for defect prediction, while showing 9.3% average improvement in code quality assessment accuracy across multiple quality dimensions. The integration strategy proves effective in capturing complex structural dependencies and provides actionable insights for software development teams, establishing a foundation for intelligent software engineering tools that deliver comprehensive code analysis capabilities. (© 2025. The Author(s).) |
| Competing Interests: | Declarations. Competing interests: The authors declare no competing interests. Ethics approval: This study was conducted in accordance with ethical guidelines for software engineering research and was approved by the Research Ethics Committee of Anhui Institute of Information Technology (Ethics Approval Reference: AIIT-CSE-2024-015, approved on March 15, 2024). The research exclusively utilized publicly available open-source software repositories and datasets that are freely accessible under respective open-source licenses. All data collection and analysis procedures complied with the terms of use of the respective software repositories, including Apache Software Foundation, Eclipse Foundation, and Mozilla Foundation guidelines. No human subjects were involved in this research, and all software projects analyzed were publicly available with appropriate licensing for academic research purposes. |
| Contributed Indexing: | Keywords: Code quality assessment; Graph neural networks; Multi-task learning; Program analysis; Software defect prediction; Software engineering |
| Entry Date(s): | Date Created: 20251211 Latest Revision: 20251211 |
| Update Code: | 20251212 |
| DOI: | 10.1038/s41598-025-31209-5 |
| PMID: | 41381815 |
| Database: | MEDLINE |
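
The abstract describes merging AST, CFG, and DFG views into one graph, learning a shared representation, and training two task branches jointly. The record gives no implementation details, so the following is only a minimal pure-Python sketch of that general idea under stated assumptions: the toy edge lists, feature vectors, head weights, and the `alpha` loss weight are all hypothetical, and fixed per-edge-type weights stand in for the learned attention the abstract mentions.

```python
import math
from collections import defaultdict

# Hypothetical toy program graph; node ids stand for statements.
# These edge lists are illustrative only, not data from the paper.
AST_EDGES = [(0, 1), (0, 2), (2, 3)]
CFG_EDGES = [(1, 2), (2, 3), (3, 1)]
DFG_EDGES = [(1, 3)]


def merge_views(views):
    """Union typed edge lists into one map: node -> [(neighbor, edge_type)]."""
    adj = defaultdict(list)
    for edge_type, edges in views.items():
        for src, dst in edges:
            adj[src].append((dst, edge_type))
            adj[dst].append((src, edge_type))  # undirected, for simplicity
    return adj


def message_pass(features, adj, type_weight):
    """One round of weighted mean aggregation (the shared encoder step).

    Assumption: fixed weights per edge type replace learned attention.
    """
    updated = {}
    for node, feat in features.items():
        total, norm = list(feat), 1.0
        for nbr, edge_type in adj.get(node, []):
            w = type_weight[edge_type]
            total = [t + w * x for t, x in zip(total, features[nbr])]
            norm += w
        updated[node] = [t / norm for t in total]
    return updated


def readout(features):
    """Mean-pool node embeddings into a single graph-level vector."""
    dim = len(next(iter(features.values())))
    return [sum(f[i] for f in features.values()) / len(features) for i in range(dim)]


def defect_head(z, w, b):
    """Branch 1: sigmoid probability that the code unit is defective."""
    return 1.0 / (1.0 + math.exp(-(sum(wi * zi for wi, zi in zip(w, z)) + b)))


def quality_head(z, w, b):
    """Branch 2: linear regression to a scalar quality score."""
    return sum(wi * zi for wi, zi in zip(w, z)) + b


def joint_loss(p_defect, y_defect, q_pred, q_true, alpha=0.5):
    """Weighted sum of binary cross-entropy and squared error."""
    eps = 1e-12
    bce = -(y_defect * math.log(p_defect + eps)
            + (1 - y_defect) * math.log(1 - p_defect + eps))
    return alpha * bce + (1 - alpha) * (q_pred - q_true) ** 2


adj = merge_views({"ast": AST_EDGES, "cfg": CFG_EDGES, "dfg": DFG_EDGES})
node_features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0], 3: [0.5, 0.5]}
hidden = message_pass(node_features, adj, {"ast": 1.0, "cfg": 1.0, "dfg": 1.0})
z = readout(hidden)
p = defect_head(z, [0.3, -0.2], 0.0)   # hypothetical untrained weights
q = quality_head(z, [0.5, 0.5], 0.1)   # hypothetical untrained weights
loss = joint_loss(p, 1, q, 0.8)
```

Both heads read the same pooled vector `z`, which is what lets a gradient from either task shape the shared representation. A real system would learn the weights and the attention coefficients rather than fix them as done here.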