Preempting flaky tests via non-idempotent-outcome tests.

Saved in:
Bibliographic Details
Title: Preempting flaky tests via non-idempotent-outcome tests.
Authors: Wei, Anjiang, Yi, Pu, Li, Zhengxi, Xie, Tao, Marinov, Darko, Lam, Wing
Source: ICSE: International Conference on Software Engineering; 2022, p1730-1742, 13p
Subject Terms: REGRESSION analysis, COMPUTER software, JAVA programming language, PYTHON programming language, RESEARCH
Abstract: Regression testing can greatly help in software development, but it can be seriously undermined by flaky tests, which can both pass and fail, seemingly nondeterministically, on the same code commit. Flaky tests are an emerging topic in both research and industry. Prior work has identified multiple categories of flaky tests, developed techniques for detecting these flaky tests, and analyzed some detected flaky tests. To proactively detect, i.e., preempt, flaky tests, we propose to detect non-idempotent-outcome (NIO) tests, a novel category related to flaky tests. In particular, we run each test twice in the same test execution environment, e.g., run each Java test twice in the same Java Virtual Machine. A test is NIO if it passes in the first run but fails in the second. Each NIO test has side effects and "self-pollutes" the state shared among test runs. We perform experiments on both Java and Python open-source projects, detecting 223 NIO Java tests and 138 NIO Python tests. We have inspected all 361 detected tests and opened pull requests that fix 268 tests, with 192 already accepted, only 6 rejected, and the remaining 70 pending. [ABSTRACT FROM AUTHOR]
Copyright of ICSE: International Conference on Software Engineering is the property of Association for Computing Machinery and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Complementary Index
Description
Abstract:Regression testing can greatly help in software development, but it can be seriously undermined by flaky tests, which can both pass and fail, seemingly nondeterministically, on the same code commit. Flaky tests are an emerging topic in both research and industry. Prior work has identified multiple categories of flaky tests, developed techniques for detecting these flaky tests, and analyzed some detected flaky tests. To proactively detect, i.e., preempt, flaky tests, we propose to detect non-idempotent-outcome (NIO) tests, a novel category related to flaky tests. In particular, we run each test twice in the same test execution environment, e.g., run each Java test twice in the same Java Virtual Machine. A test is NIO if it passes in the first run but fails in the second. Each NIO test has side effects and "self-pollutes" the state shared among test runs. We perform experiments on both Java and Python open-source projects, detecting 223 NIO Java tests and 138 NIO Python tests. We have inspected all 361 detected tests and opened pull requests that fix 268 tests, with 192 already accepted, only 6 rejected, and the remaining 70 pending. [ABSTRACT FROM AUTHOR]
DOI:10.1145/3510003.3510170