Alleviating patch overfitting with automatic test generation: a study of feasibility and effectiveness for the Nopol repair system


Bibliographic details
Published in: Empirical software engineering : an international journal, Volume 24, Issue 1, pp. 33-67
Main authors: Yu, Zhongxing; Martinez, Matias; Danglot, Benjamin; Durieux, Thomas; Monperrus, Martin
Format: Journal Article
Language: English
Published: New York, Springer US, 1 February 2019
Springer Nature B.V.
Springer Verlag
ISSN: 1382-3256, 1573-7616
Description
Summary: Among the many kinds of program repair techniques, one widely studied family is test suite based repair. However, test suites are in essence input-output specifications and are thus typically inadequate for completely specifying the expected behavior of the program under repair. Consequently, the patches generated by test suite based repair techniques can simply overfit to the test suite used and fail to generalize to other tests. We analyze the overfitting problem in program repair in depth and give a classification of this problem. This classification will help the community to better understand the overfitting problem and to design techniques that defeat it. We further propose and evaluate an approach called UnsatGuided, which aims to alleviate the overfitting problem for synthesis-based repair techniques with automatic test case generation. The approach uses additional automatically generated tests to strengthen the repair constraint used by synthesis-based repair techniques. We analyze the effectiveness of UnsatGuided: 1) analytically, with respect to alleviating two different kinds of overfitting issues; 2) empirically, based on an experiment over the 224 bugs of the Defects4J repository. The main result is that automatic test generation is effective in alleviating one kind of overfitting issue, regression introduction, but, due to the oracle problem, has minimal positive impact on alleviating the other kind of overfitting issue, incomplete fixing.
DOI:10.1007/s10664-018-9619-4