Some Investigations of Machine Learning Models for Software Defects

Bibliographic Details
Published in: Proceedings (IEEE/ACM International Conference on Software Engineering Companion. Online), pp. 259-263
Main author: Bhutamapuram, Umamaheswara Sharma
Format: Conference paper
Language: English
Published: IEEE, 01.05.2023
ISSN: 2574-1934
Online access: Full text
Description
Summary: Software defect prediction (SDP) and software defect severity prediction (SDSP) models alleviate the burden on testers by automatically assessing a newly developed program in a short amount of time. Research on defect prediction and defect severity prediction has primarily focused on proposing classification frameworks or addressing challenges in developing prediction models; a significant gap in the literature, however, is interpreting the predictions in terms of project objectives. Furthermore, the literature indicates that these models have poor predictive performance. In this thesis, we investigate a diversity-based ensemble learning mechanism for the cross-project defect prediction (CPDP) task and self-training semi-supervised learning for software defect severity prediction, in order to obtain better prediction performance. We also propose several project-specific performance measures that interpret the predictions in terms of project objectives (such as reductions in expenditure, time, and failure chances). Through empirical analysis, we observe that (1) the diversity-based ensemble learning mechanism improves prediction performance on both the traditional and the proposed measures, and (2) the self-training semi-supervised learning model has a positive impact on predicting the severity of a defective module. Once a potential prediction model is developed, any software organisation may utilise its services; but how can an organisation establish its trust in the developed prediction model? To this end, we investigate the feasibility of SDP models in real-world testing environments by providing proofs based on probabilistic bounds. The proofs show that even if the prediction model has a low failure probability, the probability of obtaining fewer failures in SDP-tested software than in similar but manually tested software is still exponentially small. This result enables researchers in SDP to avoid proposing prediction models.
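
The summary names two learning mechanisms without implementation detail. As a purely illustrative sketch (not the author's implementation), the following Python/scikit-learn code shows what a diversity-based ensemble for CPDP and a self-training classifier for defect severity prediction might look like; the synthetic data, feature count, base learners, and confidence threshold are placeholder assumptions, not details taken from the paper.

    # Illustrative sketch only; data, features, and hyperparameters are placeholders.
    import numpy as np
    from sklearn.ensemble import VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.naive_bayes import GaussianNB
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.semi_supervised import SelfTrainingClassifier

    rng = np.random.default_rng(0)

    # Cross-project defect prediction (CPDP) with a diversity-based ensemble:
    # train on modules of a source project, predict defect-proneness of modules
    # in a different target project; diversity comes from heterogeneous learners.
    X_source = rng.normal(size=(200, 5))      # static code metrics (placeholder)
    y_source = rng.integers(0, 2, size=200)   # 1 = defective, 0 = clean
    X_target = rng.normal(size=(50, 5))       # unlabeled target-project modules

    cpdp_ensemble = VotingClassifier(
        estimators=[
            ("lr", LogisticRegression(max_iter=1000)),
            ("nb", GaussianNB()),
            ("dt", DecisionTreeClassifier(max_depth=5)),
        ],
        voting="soft",
    )
    cpdp_ensemble.fit(X_source, y_source)
    defect_predictions = cpdp_ensemble.predict(X_target)

    # Defect severity prediction with self-training semi-supervised learning:
    # only a few modules carry severity labels; the rest are marked -1 and are
    # pseudo-labelled iteratively when the classifier is sufficiently confident.
    X_severity = rng.normal(size=(300, 5))
    y_severity = rng.integers(0, 3, size=300)  # severity classes 0..2 (placeholder)
    y_severity[60:] = -1                       # treat most modules as unlabeled

    severity_model = SelfTrainingClassifier(
        DecisionTreeClassifier(max_depth=5), threshold=0.8
    )
    severity_model.fit(X_severity, y_severity)
    severity_predictions = severity_model.predict(X_severity[60:])

    print(defect_predictions[:10], severity_predictions[:10])
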
DOI: 10.1109/ICSE-Companion58688.2023.00070