Discovering essential code elements in informal documentation

Bibliographic Details
Published in: Proceedings of the 2013 International Conference on Software Engineering, pp. 832-841
Main Authors: Rigby, Peter C., Robillard, Martin P.
Format: Conference Proceeding
Language: English
Published: Piscataway, NJ, USA IEEE Press 18.05.2013
Series: ACM Conferences
ISBN: 1467330760, 9781467330763
Description
Summary: To access the knowledge contained in developer communication, such as forum posts, it is useful to determine automatically the code elements referred to in the discussions. We propose a novel traceability recovery approach to extract the code elements contained in various documents. As opposed to previous work, our approach does not require an index of code elements to find links, which makes it particularly well-suited for the analysis of informal documentation. When evaluated on 188 StackOverflow answer posts containing 993 code elements, the technique performs with average 0.92 precision and 0.90 recall. As a major refinement on traditional traceability approaches, we also propose to detect which of the code elements in a document are salient, or germane, to the topic of the post. To this end we developed a three-feature decision tree classifier that performs with a precision of 0.65-0.74 and recall of 0.30-0.65, depending on the subject of the document.
DOI: 10.5555/2486788.2486897