A privacy-preserving federated meta-learning framework for cross-project defect prediction in software systems.

Detailed bibliography
Title: A privacy-preserving federated meta-learning framework for cross-project defect prediction in software systems.
Authors: Potharlanka, Jhansi Lakshmi; Shaik, Kareena Yashmin; N, Bharath Kumar
Source: Scientific Reports; 11/18/2025, Vol. 15 Issue 1, p1-28, 28p
Subjects: FEDERATED learning, SOFTWARE engineering, DATA protection software, DEFECT tracking (Computer software development), DATA security, MACHINE learning, KNOWLEDGE transfer
Abstract: Software defect prediction (SDP) is a critical task in software engineering, aiming to identify fault-prone modules before deployment. This paper introduces the Efficient Communication Federated Meta-Learning (ECFML) framework for cross-project defect prediction (CPDP). ECFML integrates Model-Agnostic Meta-Learning (MAML) with a lightweight Mobile Vision Transformer (MobileViT)-inspired backbone adapted for tabular software metrics. Feature vectors are projected into token sequences and processed via 1D convolutions and transformer mixing, enabling effective representation learning with a compact footprint (142 k parameters, 0.54 MB). This design reduces both computation and communication overhead in federated environments. Experiments on the AEEEM benchmark (EQ, JDT, PDE) show that ECFML achieves competitive or superior performance compared to ResNet-18 and U-Net. On EQ, it yields the highest gains in F1-score and AUC; on PDE, it consistently improves F1-score and G-Mean; and on JDT, it achieves performance comparable to strong baselines, reflecting stable generalization across heterogeneous projects. Privacy is enforced via Laplace Differential Privacy with a fixed clipping bound specified a priori, ensuring pure ε-DP guarantees per round under conservative composition. Robustness analysis further shows that the framework maintains stronger performance than baselines under additive Gaussian noise and FGSM perturbations, though degradation remains under stronger adversarial settings. Overall, ECFML strikes a balance between predictive accuracy, privacy preservation, and communication efficiency, making it a viable solution for federated, privacy-sensitive software repositories. [ABSTRACT FROM AUTHOR]
Copyright of Scientific Reports is the property of Springer Nature and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Complementary Index
ISSN: 2045-2322
DOI: 10.1038/s41598-025-24440-7
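
Illustrative note: the abstract describes per-round privacy via a Laplace mechanism with a fixed clipping bound. The sketch below shows one common way such a step can look; it is not taken from the paper. The function names, the choice of L1-norm clipping, the sensitivity of 2 × clip_bound (two arbitrary clipped updates can differ by at most that much in L1 distance), the example values for the clipping bound and epsilon, and the basic sequential composition rule (total budget = rounds × per-round epsilon) are all assumptions made for illustration. This record does not show the paper's actual parameters or composition expression.

```python
import random

def clip_l1(update, clip_bound):
    """Rescale the update so its L1 norm is at most clip_bound (assumed clipping rule)."""
    norm = sum(abs(x) for x in update)
    if norm <= clip_bound:
        return list(update)
    factor = clip_bound / norm
    return [x * factor for x in update]

def laplace_sample(scale, rng):
    """Draw from Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)

def privatize_update(update, clip_bound, epsilon, rng):
    """Clip a client update, then add Laplace noise calibrated to epsilon.

    Assumption: any two clipped updates differ by at most 2 * clip_bound in
    L1 distance, so the noise scale is sensitivity / epsilon.
    """
    clipped = clip_l1(update, clip_bound)
    scale = 2.0 * clip_bound / epsilon
    return [x + laplace_sample(scale, rng) for x in clipped]

def total_epsilon(epsilon_per_round, num_rounds):
    """Basic sequential composition: per-round budgets add up (assumed rule)."""
    return epsilon_per_round * num_rounds

if __name__ == "__main__":
    rng = random.Random(0)           # fixed seed only to make the demo repeatable
    raw_update = [3.0, -4.0, 1.0]    # hypothetical flattened model update
    noisy = privatize_update(raw_update, clip_bound=4.0, epsilon=1.0, rng=rng)
    print("noisy update:", noisy)
    print("budget after 10 rounds:", total_epsilon(1.0, 10))
```

Because the clipping bound is fixed before training, the noise scale does not depend on the data, which is what allows a pure ε-DP statement per round in a scheme of this kind.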