Reliable benchmarking: requirements and solutions

Benchmarking is a widely used method in experimental computer science, in particular, for the comparative evaluation of tools and algorithms. As a consequence, a number of questions need to be answered in order to ensure proper benchmarking, resource measurement, and presentation of results, all of...

Celý popis

Uložené v:

Podrobná bibliografia
Vydané v:	International journal on software tools for technology transfer Ročník 21; číslo 1; s. 1 - 29
Hlavní autori:	Beyer, Dirk, Löwe, Stefan, Wendler, Philipp
Médium:	Journal Article
Jazyk:	English
Vydavateľské údaje:	Berlin/Heidelberg Springer Berlin Heidelberg 06.02.2019 Springer Nature B.V
Predmet:	Algorithms Benchmarks Computer Science Program verification (computers) Regular Paper Software Engineering Software Engineering/Programming and Operating Systems Software reliability Solvers Source code Theory of Computation Competition Benchmarking Container Process isolation Resource measurement Process control
ISSN:	1433-2779, 1433-2787
On-line prístup:	Získať plný text
Tagy:	Pridať tag Žiadne tagy, Buďte prvý, kto otaguje tento záznam!

Popis
Shrnutí:	Benchmarking is a widely used method in experimental computer science, in particular, for the comparative evaluation of tools and algorithms. As a consequence, a number of questions need to be answered in order to ensure proper benchmarking, resource measurement, and presentation of results, all of which is essential for researchers, tool developers, and users, as well as for tool competitions. We identify a set of requirements that are indispensable for reliable benchmarking and resource measurement of time and memory usage of automatic solvers, verifiers, and similar tools, and discuss limitations of existing methods and benchmarking tools. Fulfilling these requirements in a benchmarking framework can (on Linux systems) currently only be done by using the cgroup and namespace features of the kernel. We developed BenchExec , a ready-to-use, tool-independent, and open-source implementation of a benchmarking framework that fulfills all presented requirements, making reliable benchmarking and resource measurement easy. Our framework is able to work with a wide range of different tools, has proven its reliability and usefulness in the International Competition on Software Verification, and is used by several research groups worldwide to ensure reliable benchmarking. Finally, we present guidelines on how to present measurement results in a scientifically valid and comprehensible way.
Bibliografia:	ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14
ISSN:	1433-2779 1433-2787
DOI:	10.1007/s10009-017-0469-y