On the effects of program slicing for vulnerability detection during code inspection

Bibliographic Details
Title: On the effects of program slicing for vulnerability detection during code inspection
Authors: Papotti, Aurora; Tuma, Katja; Massacci, Fabio
Source: Papotti, A, Tuma, K & Massacci, F 2025, 'On the effects of program slicing for vulnerability detection during code inspection', Empirical Software Engineering, vol. 30, no. 3, 93, pp. 1-37. https://doi.org/10.1007/s10664-025-10636-y
Publication Year: 2025
Subject Terms: code review, controlled experiment, program comprehension, program slicing, vulnerability
Description: Slicing is a fault localization technique that has been proposed to support debugging and program comprehension. Yet, its empirical effectiveness during code inspection by humans has received limited attention. The goal of our study is two-fold. First, we aim to define what it means for a code reviewer to identify the vulnerable lines correctly. Second, we investigate whether reducing the number of to-be-inspected lines by method-level slicing supports code reviewers in detecting security vulnerabilities. We propose a novel approach based on the notion of a δ-neighborhood (intuitively based on the idea of the context size of the command git diff) to define correctly identified lines. Then, we conducted a multi-year controlled experiment (2017-2023) in which MSc students attending security courses (n=236) were tasked with identifying vulnerable lines in original or sliced Java files from Apache Tomcat. We provide perfect seed lines for a slicing algorithm to control for confounding factors. Each treatment differs in the pair (Vulnerability, Original/Sliced), in a balanced design with vulnerabilities from the OWASP Top 10 2017: A1 (Injection), A5 (Broken Access Control), A6 (Security Misconfiguration), and A7 (Cross-Site Scripting). To generate smaller slices for human consumption, we used a variant of intra-procedural thin slicing. We report the results for δ=0, which corresponds to exactly matching the vulnerable ground truth lines, and δ=3, which represents the scenario of identifying the vulnerable area. For both cases, we found that slicing helps in ‘finding something’ (the participant has found at least some vulnerable lines) as opposed to ‘finding nothing’. For δ=0, analyzing a slice and analyzing the original file are statistically equivalent from the perspective of lines found by those who found something. With δ=3, slicing helps to find more vulnerabilities compared to analyzing an original file, as we would normally expect.
Given the type of population, additional experiments are necessary to ...
Document Type: article in journal/newspaper
Language: English
Relation: info:eu-repo/semantics/altIdentifier/hdl/https://hdl.handle.net/1871.1/a687b0f1-e1ec-43c3-8c80-f9835e066182; info:eu-repo/semantics/altIdentifier/pissn/1382-3256; info:eu-repo/semantics/altIdentifier/eissn/1573-7616; info:eu-repo/semantics/reference/hdl/https://hdl.handle.net/1871.1/377cc03c-f42c-4c06-8e95-2ca12fb0ba52
DOI: 10.1007/s10664-025-10636-y
Availability: https://research.vu.nl/en/publications/a687b0f1-e1ec-43c3-8c80-f9835e066182
https://doi.org/10.1007/s10664-025-10636-y
https://hdl.handle.net/1871.1/a687b0f1-e1ec-43c3-8c80-f9835e066182
https://www.scopus.com/pages/publications/105003290900
https://www.scopus.com/inward/citedby.url?scp=105003290900&partnerID=8YFLogxK
Rights: info:eu-repo/semantics/openAccess ; http://creativecommons.org/licenses/by/4.0/
Accession Number: edsbas.9698448E
Database: BASE
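
Illustrative note on the δ-neighborhood: the abstract only describes the idea informally (a tolerance around the ground-truth lines, analogous to the context size of git diff), so the following is a minimal sketch of one plausible reading, not the authors' definition. It assumes a ground-truth vulnerable line counts as identified when the reviewer marked any line at most δ lines away from it. The function names `matched_lines` and `found_something` and the sample line numbers are hypothetical.

```python
def matched_lines(ground_truth, selected, delta=0):
    """Return the ground-truth lines that have a selected line within delta.

    Assumption (not taken from the paper): a ground-truth line g is
    "correctly identified" if some selected line s satisfies |g - s| <= delta.
    """
    if delta < 0:
        raise ValueError("delta must be non-negative")
    selected = set(selected)
    return {
        g
        for g in set(ground_truth)
        if any(abs(g - s) <= delta for s in selected)
    }


def found_something(ground_truth, selected, delta=0):
    """True if the reviewer identified at least one vulnerable line."""
    return bool(matched_lines(ground_truth, selected, delta))


if __name__ == "__main__":
    # Hypothetical data: line numbers in one inspected file.
    ground_truth = [120, 121, 140]
    selected = [118, 143, 200]

    # delta=0 requires exact matches; delta=3 accepts the surrounding area.
    print(sorted(matched_lines(ground_truth, selected, delta=0)))  # []
    print(sorted(matched_lines(ground_truth, selected, delta=3)))  # [120, 121, 140]
```

Under this reading, δ=0 reproduces exact line matching, while δ=3 mirrors the three context lines that git diff shows by default, so a reviewer who marks the vulnerable area but not the exact line still counts as having found it.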