Coverage-Based Greybox Fuzzing as Markov Chain

Bibliographic Details
Published in:	IEEE Transactions on Software Engineering, Vol. 45, No. 5, pp. 489-506
Main authors:	Bohme, Marcel; Pham, Van-Thuan; Roychoudhury, Abhik
Format:	Journal Article
Language:	English
Published:	New York: IEEE, 1 May 2019; IEEE Computer Society
Subjects:	automated testing; Computer crashes; Crashes; Exposure; fuzzing; Markov analysis; Markov chains; Markov processes; path exploration; Program verification (computers); Schedules; Search problems; Seeds; symbolic execution; Systematics; Vulnerability detection
ISSN:	0098-5589, 1939-3520

Description
Summary:	Coverage-based Greybox Fuzzing (CGF) is a random testing approach that requires no program analysis. A new test is generated by slightly mutating a seed input. If the test exercises a new and interesting path, it is added to the set of seeds; otherwise, it is discarded. We observe that most tests exercise the same few "high-frequency" paths and develop strategies to explore significantly more paths with the same number of tests by gravitating towards low-frequency paths. We explain the challenges and opportunities of CGF using a Markov chain model which specifies the probability that fuzzing the seed that exercises path i generates an input that exercises path j. Each state (i.e., seed) has an energy that specifies the number of inputs to be generated from that seed. We show that CGF is considerably more efficient if energy is inversely proportional to the density of the stationary distribution and increases monotonically every time that seed is chosen. Energy is controlled with a power schedule. We implemented several schedules by extending AFL. In 24 hours, AFLFast exposes 3 previously unreported CVEs that are not exposed by AFL and exposes 6 previously unreported CVEs 7x faster than AFL. AFLFast produces at least an order of magnitude more unique crashes than AFL. We compared AFLFast to the symbolic executor Klee. In terms of vulnerability detection, AFLFast is significantly more effective than Klee on the same subject programs that were discussed in the original Klee paper. In terms of code coverage, AFLFast only slightly outperforms Klee while a combination of both tools achieves best results by mitigating the individual weaknesses.
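The energy rule described in the summary can be illustrated with a minimal Python sketch. It is not taken from AFLFast's source; the function name `assign_energy`, its parameters, the doubling factor, and the cap of 1024 are all assumptions chosen for illustration. The observed path frequency stands in for the density of the stationary distribution, so rarely exercised paths receive more energy, and the energy grows each time the seed is chosen.

```python
def assign_energy(times_chosen: int, path_frequency: int,
                  base_energy: float = 1.0, max_energy: float = 1024.0) -> int:
    """Return how many inputs to generate from a seed (hypothetical schedule).

    times_chosen   -- how often this seed has been picked from the queue so far
    path_frequency -- how many generated inputs have exercised this seed's path
    base_energy    -- assumed starting energy (illustrative constant)
    max_energy     -- assumed upper bound so energy cannot grow without limit
    """
    if times_chosen < 0:
        raise ValueError("times_chosen must be non-negative")
    if path_frequency < 1:
        raise ValueError("path_frequency must be at least 1")

    # Grow monotonically with every selection (doubling is an assumption);
    # clamp the exponent so the intermediate value stays small.
    growth = 2 ** min(times_chosen, 32)

    # Inversely proportional to how often the path has already been exercised.
    energy = base_energy * growth / path_frequency

    # Always generate at least one input, never more than the cap.
    return max(1, int(min(energy, max_energy)))


if __name__ == "__main__":
    # A low-frequency path gets more energy than a high-frequency one.
    print(assign_energy(times_chosen=3, path_frequency=2))
    print(assign_energy(times_chosen=3, path_frequency=1000))
```

Under these assumptions, a seed on a high-frequency path is fuzzed only minimally, while a seed on a low-frequency path receives rapidly increasing effort until it reaches the cap.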
DOI:	10.1109/TSE.2017.2785841