Reliable benchmarking: requirements and solutions
Benchmarking is a widely used method in experimental computer science, in particular, for the comparative evaluation of tools and algorithms. As a consequence, a number of questions need to be answered in order to ensure proper benchmarking, resource measurement, and presentation of results, all of...
| Published in: | International journal on software tools for technology transfer Vol. 21; no. 1; pp. 1 - 29 |
|---|---|
| Main Authors: | Beyer, Dirk; Löwe, Stefan; Wendler, Philipp |
| Format: | Journal Article |
| Language: | English |
| Published: | Berlin/Heidelberg: Springer Berlin Heidelberg (06.02.2019); Springer Nature B.V |
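| Subjects: | Algorithms; Benchmarks; Computer Science; Program verification (computers); Regular Paper; Software Engineering; Software Engineering/Programming and Operating Systems; Software reliability; Solvers; Source code; Theory of Computation |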
| ISSN: | 1433-2779, 1433-2787 |
| Abstract | Benchmarking is a widely used method in experimental computer science, in particular, for the comparative evaluation of tools and algorithms. As a consequence, a number of questions need to be answered in order to ensure proper benchmarking, resource measurement, and presentation of results, all of which is essential for researchers, tool developers, and users, as well as for tool competitions. We identify a set of requirements that are indispensable for reliable benchmarking and resource measurement of time and memory usage of automatic solvers, verifiers, and similar tools, and discuss limitations of existing methods and benchmarking tools. Fulfilling these requirements in a benchmarking framework can (on Linux systems) currently only be done by using the cgroup and namespace features of the kernel. We developed BenchExec, a ready-to-use, tool-independent, and open-source implementation of a benchmarking framework that fulfills all presented requirements, making reliable benchmarking and resource measurement easy. Our framework is able to work with a wide range of different tools, has proven its reliability and usefulness in the International Competition on Software Verification, and is used by several research groups worldwide to ensure reliable benchmarking. Finally, we present guidelines on how to present measurement results in a scientifically valid and comprehensible way. |
|---|---|
| Author | Beyer, Dirk; Löwe, Stefan; Wendler, Philipp |
| Author_xml | 1. Beyer, Dirk (LMU Munich); 2. Löwe, Stefan (One Logic); 3. Wendler, Philipp (LMU Munich) |
| ContentType | Journal Article |
| Copyright | The Author(s) 2017. corrected publication 2020 International Journal on Software Tools for Technology Transfer is a copyright of Springer, (2017). All Rights Reserved. The Author(s) 2017. corrected publication 2020. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
| DOI | 10.1007/s10009-017-0469-y |
| Discipline | Computer Science |
| EISSN | 1433-2787 |
| EndPage | 29 |
| ISICitedReferencesCount | 145 |
| ISSN | 1433-2779 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Keywords | Competition; Benchmarking; Container; Process isolation; Resource measurement; Process control |
| Language | English |
| OpenAccessLink | https://link.springer.com/10.1007/s10009-017-0469-y |
| PageCount | 29 |
| PublicationCentury | 2000 |
| PublicationDate | 20190206 |
| PublicationDateYYYYMMDD | 2019-02-06 |
| PublicationDecade | 2010 |
| PublicationPlace | Berlin/Heidelberg |
| PublicationTitle | International journal on software tools for technology transfer |
| PublicationTitleAbbrev | Int J Softw Tools Technol Transfer |
| PublicationYear | 2019 |
| Publisher | Springer Berlin Heidelberg; Springer Nature B.V |
| StartPage | 1 |
| SubjectTerms | Algorithms; Benchmarks; Computer Science; Program verification (computers); Regular Paper; Software Engineering; Software Engineering/Programming and Operating Systems; Software reliability; Solvers; Source code; Theory of Computation |
| Title | Reliable benchmarking: requirements and solutions |
| URI | https://link.springer.com/article/10.1007/s10009-017-0469-y https://www.proquest.com/docview/2183984164 https://www.proquest.com/docview/2492798653 |
| Volume | 21 |