Detection of Duplicate Defect Reports Using Natural Language Processing


Bibliographic details
Published in: 29th International Conference on Software Engineering (ICSE'07), pp. 499-510
Main authors: Runeson, P., Alexandersson, M., Nyholm, O.
Format: Conference paper
Language: English
Published: IEEE 01.01.2007
ISBN: 9780769528281, 0769528287
ISSN: 0270-5257
Description
Summary: Defect reports are generated from various testing and development activities in software engineering. Sometimes two reports are submitted that describe the same problem, leading to duplicate reports. These reports are mostly written in structured natural language, and as such, it is hard to compare two reports for similarity with formal methods. In order to identify duplicates, we investigate using natural language processing (NLP) techniques to support the identification. A prototype tool is developed and evaluated in a case study analyzing defect reports at Sony Ericsson Mobile Communications. The evaluation shows that about 2/3 of the duplicates can possibly be found using the NLP techniques. Different variants of the techniques provide only minor result differences, indicating a robust technology. User testing shows that the overall attitude towards the technique is positive and that it has a growth potential.
DOI: 10.1109/ICSE.2007.32