Comparing the effectiveness of several modeling methods for fault prediction

We compare the effectiveness of four modeling methods—negative binomial regression, recursive partitioning, random forests and Bayesian additive regression trees—for predicting the files likely to contain the most faults for 28 to 35 releases of three large industrial software systems. Predictor var...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Empirical software engineering : an international journal Jg. 15; H. 3; S. 277 - 295
Hauptverfasser: Weyuker, Elaine J., Ostrand, Thomas J., Bell, Robert M.
Format: Journal Article
Sprache:Englisch
Veröffentlicht: Boston Springer US 01.06.2010
Springer Nature B.V
Schlagworte:
ISSN:1382-3256, 1573-7616, 1573-7616
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:We compare the effectiveness of four modeling methods—negative binomial regression, recursive partitioning, random forests and Bayesian additive regression trees—for predicting the files likely to contain the most faults for 28 to 35 releases of three large industrial software systems. Predictor variables included lines of code, file age, faults in the previous release, changes in the previous two releases, and programming language. To compare the effectiveness of the different models, we use two metrics—the percent of faults contained in the top 20% of files identified by the model, and a new, more general metric, the fault-percentile-average . The negative binomial regression and random forests models performed significantly better than recursive partitioning and Bayesian additive regression trees, as assessed by either of the metrics. For each of the three systems, the negative binomial and random forests models identified 20% of the files in each release that contained an average of 76% to 94% of the faults.
Bibliographie:SourceType-Scholarly Journals-1
ObjectType-Feature-1
content type line 14
ObjectType-Article-1
ObjectType-Feature-2
content type line 23
ISSN:1382-3256
1573-7616
1573-7616
DOI:10.1007/s10664-009-9111-2