Improving classifier-based effort-aware software defect prediction by reducing ranking errors

Bibliographic Details
Title: Improving classifier-based effort-aware software defect prediction by reducing ranking errors
Authors: Yuchen Guo, Martin Shepperd, Ning Li
Source: International Conference on Evaluation and Assessment in Software Engineering (EASE)
Publication Status: Preprint
Publisher Information: ACM, 2024.
Publication Year: 2024
Subject Terms: software defect prediction, Software Engineering (cs.SE), FOS: Computer and information sciences, ranking error, ranking strategy, effort-aware, D.2
Description: Context: Software defect prediction uses historical data to direct software quality assurance resources to potentially problematic components. Effort-aware (EA) defect prediction prioritizes components that are more likely to be defective by taking cost-effectiveness into account. It is therefore a ranking problem; however, existing classification-based ranking strategies give limited consideration to ranking errors. Objective: Improve the performance of classifier-based EA ranking methods by focusing on ranking errors. Method: We propose a ranking score calculation strategy called EA-Z, which sets a lower bound to avoid near-zero ranking errors. We investigate four primary EA ranking strategies with 16 classification learners, and conduct experiments with EA-Z and the four existing strategies. Results: Experimental results from 72 data sets show that EA-Z is the best ranking score calculation strategy in terms of Recall@20% and Popt when considering all 16 learners. For particular learners, the imbalanced ensemble learners UBag-svm and UBst-rf achieve top performance with EA-Z. Conclusion: Our study demonstrates the effectiveness of reducing ranking errors in classifier-based effort-aware defect prediction. We recommend using EA-Z with imbalanced ensemble learning.
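The core idea in the abstract — classifier-based effort-aware ranking where a lower bound prevents near-zero scores — can be illustrated with a minimal sketch. This is not the paper's exact EA-Z formulation; the score form (clamped defect probability divided by inspection effort) and the `floor` parameter are assumptions for illustration only.

```python
def ea_ranking_scores(probs, efforts, floor=0.05):
    """Illustrative effort-aware ranking score (assumed form, not the paper's).

    probs:   predicted defect probabilities per component
    efforts: inspection effort per component (e.g., lines of code)
    floor:   hypothetical lower bound that keeps a near-zero probability
             from collapsing the ranking score to zero
    """
    return [max(p, floor) / e for p, e in zip(probs, efforts)]

probs = [0.9, 0.0, 0.4]
efforts = [100, 10, 200]
scores = ea_ranking_scores(probs, efforts)
# Without the floor, the second component's score would be exactly 0 and it
# would always rank last; with the floor, its small effort still matters.
ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

In an evaluation such as Recall@20%, components are then inspected in `ranked` order until 20% of the total effort budget is spent.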
10 pages with 12 figures. Accepted by International Conference on Evaluation and Assessment in Software Engineering (EASE) 2024
Document Type: Article; Conference object
File Description: Electronic
DOI: 10.1145/3661167.3661195
DOI: 10.48550/arxiv.2405.07604
Access URL: http://arxiv.org/abs/2405.07604
https://bura.brunel.ac.uk/handle/2438/29008
Rights: arXiv Non-Exclusive Distribution
URL: https://www.acm.org/publications/policies/copyright_policy#Background
Accession Number: edsair.doi.dedup.....0feff0b48afb1a655def1deb3fb4d838
Database: OpenAIRE