Predicting Faults from Cached History

Bibliographic Details
Published in: 29th International Conference on Software Engineering (ICSE'07), pp. 489-498
Main Authors: Kim, Sunghun; Zimmermann, T.; Whitehead, E.J.; Zeller, A.
Format: Conference Proceeding
Language: English
Published: IEEE 01.05.2007
ISBN: 9780769528281, 0769528287
ISSN: 0270-5257
Description
Summary: We analyze the version history of seven software systems to predict the most fault-prone entities and files. The basic assumption is that faults do not occur in isolation, but rather in bursts of several related faults. Therefore, we cache locations that are likely to have faults: starting from the location of a known (fixed) fault, we cache the location itself, any locations changed together with the fault, recently added locations, and recently changed locations. By consulting the cache at the moment a fault is fixed, a developer can detect likely fault-prone locations. This is useful for prioritizing verification and validation resources on the most fault-prone files or entities. In our evaluation of seven open source projects with more than 200,000 revisions, the cache selects 10% of the source code files; these files account for 73%-95% of faults, a significant advance beyond the state of the art.
DOI: 10.1109/ICSE.2007.66
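
The caching scheme described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `FaultCache`, its method names, the fixed `capacity`, and the least-recently-loaded eviction policy are all assumptions made for illustration, since the summary specifies none of them. The sketch only shows the four load rules the summary lists (the fixed location itself, locations changed together with it, recently added locations, and recently changed locations) and the cache lookup made at the moment a fault is fixed.

```python
from collections import OrderedDict


class FaultCache:
    """Hypothetical fixed-capacity cache of file paths suspected to be fault-prone."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries = OrderedDict()  # path -> None, least recently loaded first

    def _load(self, path):
        # Refresh an existing entry; otherwise insert it and, if the cache is
        # over capacity, evict the least recently loaded path.
        # (The eviction policy is an assumption, not taken from the summary.)
        if path in self._entries:
            self._entries.move_to_end(path)
        else:
            self._entries[path] = None
            if len(self._entries) > self.capacity:
                self._entries.popitem(last=False)

    def paths(self):
        """Return cached paths, least recently loaded first."""
        return list(self._entries)

    def on_file_added(self, path):
        # Rule: recently added locations are cached.
        self._load(path)

    def on_file_changed(self, path):
        # Rule: recently changed locations are cached.
        self._load(path)

    def on_fault_fixed(self, fixed_path, co_changed=()):
        """Record a fault fix; return True if the cache already held fixed_path."""
        hit = fixed_path in self._entries
        # Rule: locations changed together with the fault are cached.
        for path in co_changed:
            self._load(path)
        # Rule: the location of the fixed fault itself is cached.
        self._load(fixed_path)
        return hit


cache = FaultCache(capacity=3)
cache.on_file_added("a.c")
cache.on_file_changed("b.c")
first = cache.on_fault_fixed("c.c", co_changed=["d.c"])  # miss: c.c was not cached
second = cache.on_fault_fixed("d.c")                     # hit: d.c was co-changed earlier
print(first, second, cache.paths())
```

Counting hits against all fault fixes over a project's history would give a hit rate comparable in spirit to the 73%-95% figure reported in the summary, although the real evaluation setup is not reproduced here.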