Reliable benchmarking: requirements and solutions

Bibliographic Details
Published in: International Journal on Software Tools for Technology Transfer (Int J Softw Tools Technol Transfer), Vol. 21, no. 1, pp. 1-29
Main Authors: Beyer, Dirk; Löwe, Stefan; Wendler, Philipp
Format: Journal Article
Language: English
Published: Berlin/Heidelberg: Springer Berlin Heidelberg (Springer Nature B.V.), 2019-02-06
ISSN: 1433-2779 (print), 1433-2787 (electronic)
Abstract: Benchmarking is a widely used method in experimental computer science, in particular for the comparative evaluation of tools and algorithms. As a consequence, a number of questions need to be answered in order to ensure proper benchmarking, resource measurement, and presentation of results, all of which are essential for researchers, tool developers, and users, as well as for tool competitions. We identify a set of requirements that are indispensable for reliable benchmarking and resource measurement of time and memory usage of automatic solvers, verifiers, and similar tools, and discuss limitations of existing methods and benchmarking tools. Fulfilling these requirements in a benchmarking framework can (on Linux systems) currently only be done by using the cgroup and namespace features of the kernel. We developed BenchExec, a ready-to-use, tool-independent, and open-source implementation of a benchmarking framework that fulfills all presented requirements, making reliable benchmarking and resource measurement easy. Our framework is able to work with a wide range of different tools, has proven its reliability and usefulness in the International Competition on Software Verification, and is used by several research groups worldwide to ensure reliable benchmarking. Finally, we present guidelines on how to present measurement results in a scientifically valid and comprehensible way.
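Illustrative note: the abstract refers to limitations of existing measurement methods. The sketch below shows one conventional approach, measuring a tool's CPU time and peak memory through the operating system's per-child accounting. It is not BenchExec's implementation, and the helper name `run_and_measure` is hypothetical. Its known limits are that `RUSAGE_CHILDREN` covers only children that have terminated and been waited for, and that `ru_maxrss` reports the peak of the single largest child rather than the sum over a process tree, which is the kind of gap that kernel-level accounting such as cgroups is meant to close.

```python
import resource
import subprocess
import sys
import time


def run_and_measure(cmd):
    """Run cmd and return (wall_seconds, cpu_seconds, peak_rss).

    Hypothetical helper for illustration only. Values come from the
    accounting of waited-for child processes, not from cgroups.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    wall = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)

    # CPU time (user + system) added by children reaped during this call.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)

    # ru_maxrss is the largest peak of any single reaped child so far
    # (kilobytes on Linux). It is a maximum, not a sum over processes,
    # and it is not reset between calls.
    return wall, cpu, after.ru_maxrss


if __name__ == "__main__":
    wall, cpu, rss = run_and_measure([sys.executable, "-c", "sum(range(10**6))"])
    print(f"wall={wall:.3f}s cpu={cpu:.3f}s peak_rss={rss}")
```

A tool that spawns helper processes and does not wait for them, or that uses memory spread across several processes, would be under-reported by this sketch.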
Author Affiliations: Beyer, Dirk (LMU Munich); Löwe, Stefan (One Logic); Wendler, Philipp (LMU Munich)
Copyright: The Author(s) 2017, corrected publication 2020. International Journal on Software Tools for Technology Transfer is a copyright of Springer (2017). All rights reserved.
License: This work is published under http://creativecommons.org/licenses/by/4.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI: 10.1007/s10009-017-0469-y
Discipline: Computer Science
Open Access: yes
Peer Reviewed: yes
Keywords: Competition; Benchmarking; Container; Process isolation; Resource measurement; Process control
Open Access Link: https://link.springer.com/10.1007/s10009-017-0469-y
Subjects: Algorithms; Benchmarks; Computer Science; Program verification (computers); Regular Paper; Software Engineering; Software Engineering/Programming and Operating Systems; Software reliability; Solvers; Source code; Theory of Computation